Original link: tecdat.cn/?p=15062
Original source: Tuoduan Data Tribe WeChat official account
Consider a simple Poisson regression. Given a sample $(x_i, y_i)$, where $Y \mid X = x$ follows a Poisson distribution with mean $\mu(x) = \exp(\beta_0 + \beta_1 x)$, the goal is to derive a 95% confidence interval for $\mu(x)$ at a given value of $x$, where $\mu(x)$ is the prediction.
Hence, we want a confidence interval for the prediction itself, not for a future observation, i.e. for the point on the graph below
> reg=glm(dist~speed,data=cars,family=poisson)
> P=predict(reg,type="response",
+ newdata=data.frame(speed=seq(1,35,by=2)))
> plot(cars,xlim=c(0,31),ylim=c(0,170))
> abline(v=30,lty=2)
> lines(seq(1,35,by=2),P,lwd=2,col="red")
> P0=predict(reg,type="response",se.fit=TRUE,
+ newdata=data.frame(speed=30))
> points(30,P0$fit,pch=4,lwd=3)
namely $\widehat{\mu}(x) = \exp(\widehat{\beta}_0 + \widehat{\beta}_1 x)$, where $\widehat{\beta}$ is the maximum likelihood estimator of $\beta = (\beta_0, \beta_1)'$.
From standard maximum likelihood theory, $\widehat{\beta}$ is asymptotically normal, with a variance given by the inverse of the Fisher information,
$\widehat{\beta} \approx \mathcal{N}\big(\beta,\, I(\beta)^{-1}\big), \qquad I(\beta) = -\mathbb{E}\left[\frac{\partial^2 \log \mathcal{L}(\beta)}{\partial \beta\, \partial \beta'}\right].$
In the case of a log-Poisson regression, with $x_i = (1, \text{speed}_i)'$, this information matrix is
$I(\beta) = \sum_i \exp(x_i'\beta)\, x_i x_i' = X'WX, \qquad W = \text{diag}\big(\exp(x_i'\beta)\big).$
Let’s go back to the original question.
- Confidence intervals for linear combinations
The first idea to get a confidence interval for the prediction $\exp(x'\beta)$ is to get a confidence interval for the linear predictor $x'\beta$, and then take the exponential of its bounds. Asymptotically, we know that $\widehat{\beta} \approx \mathcal{N}\big(\beta,\, I(\beta)^{-1}\big)$.
Therefore, an approximation of the variance matrix of $\widehat{\beta}$ is obtained by plugging the estimated parameters into $I(\beta)^{-1}$. Then, since $\widehat{\beta}$ is asymptotically multivariate normal, any linear combination of the parameters is also asymptotically normal; in particular, $x'\widehat{\beta}$ has approximately a normal distribution with mean $x'\beta$ and variance $x'\,\text{Var}[\widehat{\beta}]\,x$. All of these quantities are easy to compute. First, we can get the variance matrix of the estimator,
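The code that produced the matrix V compared below is not in the excerpt; a minimal sketch, assuming V is the plug-in inverse Fisher information $(X'WX)^{-1}$ computed by hand:

# sketch: plug-in variance matrix of the coefficients, V = (X'WX)^(-1)
reg <- glm(dist ~ speed, data = cars, family = poisson)
X <- cbind(1, cars$speed)                   # design matrix (intercept, speed)
W <- diag(predict(reg, type = "response"))  # diagonal matrix of fitted Poisson means
V <- solve(t(X) %*% W %*% X)                # inverse of the estimated Fisher information
V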
So if we compare it to the output of the regression,
> summary(reg)$cov.unscaled
              (Intercept)         speed
(Intercept)  0.0066870446 -3.474479e-04
speed       -0.0003474479  1.940302e-05
> V
              [,1]          [,2]
[1,]  0.0066871228 -3.474515e-04
[2,] -0.0003474515  1.940318e-05
From these values, it is easy to compute the standard deviation of the linear combination $x'\widehat{\beta}$,
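For instance, at speed = 30, a sketch (assuming the vector of interest is $x = (1, 30)'$ and V is the matrix above):

x0 <- c(1, 30)                       # (intercept, speed = 30)
sd_link <- sqrt(t(x0) %*% V %*% x0)  # standard deviation of x'beta_hat, on the link scale
sd_link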
Once we have the standard deviation, and normality (at least asymptotically), we can build a confidence interval for $x'\beta$; taking the exponential of its bounds then gives a confidence interval for the prediction,
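The segments() call below uses P2, a prediction on the link scale together with its standard error; its definition is not shown in the excerpt, but it is presumably something like:

# assumed definition of P2: prediction on the link (log) scale, with standard error
P2 <- predict(reg, type = "link", se.fit = TRUE,
              newdata = data.frame(speed = 30))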
> segments(30,exp(P2$fit-1.96*P2$se.fit),
+ 30,exp(P2$fit+1.96*P2$se.fit),col="blue",lwd=3)
Note that, with this technique, the confidence interval is no longer centered on the prediction.
- The delta method
In fact, some may not like that the interval obtained from $\exp(x'\widehat{\beta})$ is not centered on the prediction. An alternative is to use the delta method. Rather than writing down the theory again, we can use an existing function to do the computation,
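The object printed below has the structure returned by predict() with se.fit = TRUE on the response scale, whose standard error is computed with the delta method; the call is presumably something like:

# assumed call behind the output below: delta-method standard error on the response scale
P1 <- predict(reg, type = "response", se.fit = TRUE,
              newdata = data.frame(speed = 30))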
> P1
$fit
       1 
155.4048 

$se.fit
       1 
8.931232 

$residual.scale
[1] 1
The delta method also gives (asymptotic) normality, so once we have the standard deviation on the response scale, we can construct the confidence interval directly.
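A sketch of how this second interval can be added to the plot (this call is not in the excerpt; the colour is arbitrary):

# delta-method interval, drawn directly on the response scale
segments(30, P1$fit - 1.96 * P1$se.fit,
         30, P1$fit + 1.96 * P1$se.fit, col = "purple", lwd = 3)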
The quantities obtained by the two different methods are very close here
> exp(P2$fit-1.96*P2$se.fit)
       1 
138.8495 
> P1$fit-1.96*P1$se.fit
       1 
137.8996 
> exp(P2$fit+1.96*P2$se.fit)
       1 
173.9341 
> P1$fit+1.96*P1$se.fit
       1 
172.9101 
- The bootstrap technique
The results above rely on asymptotic normality, and we only have 50 observations here, so a third option is the bootstrap. The idea is to resample from the data set, run the log-Poisson regression on each new sample, and repeat this many times,
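A minimal sketch of such a bootstrap (the original code is not in the excerpt; the number of replications and the use of empirical quantiles of the bootstrapped predictions are assumptions):

# resampling bootstrap for the prediction at speed = 30
set.seed(1)
B <- 1000                                    # number of bootstrap replications (assumed)
pred_boot <- rep(NA, B)
for (b in 1:B) {
  idx <- sample(1:nrow(cars), nrow(cars), replace = TRUE)  # resample observations
  reg_b <- glm(dist ~ speed, data = cars[idx, ], family = poisson)
  pred_boot[b] <- predict(reg_b, type = "response",
                          newdata = data.frame(speed = 30))
}
quantile(pred_boot, c(0.025, 0.975))         # bootstrap 95% interval for the prediction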