Original link: tecdat.cn/?p=20531
Original source: Tecdat (拓端数据部落) official account
In the standard linear model, we assume that $E(Y\mid X=x)=\beta_0+\beta_1 x$. When this linearity assumption cannot be satisfied, other methods can be considered.
Polynomial regression
A first extension is to assume a polynomial relationship, $E(Y\mid X=x)=\beta_0+\beta_1 x+\beta_2 x^2+\cdots+\beta_d x^d$.
As in the standard linear model (seen as a GLM with a conditional normal distribution), the parameters $\boldsymbol\beta$ can be obtained by least squares, $\widehat{\boldsymbol\beta}=(\mathbf X^\top\mathbf X)^{-1}\mathbf X^\top\mathbf y$, where the design matrix $\mathbf X$ contains the columns $1,x,x^2,\dots,x^d$.
Even if the true model is not polynomial, this may still provide a good approximation: by the Stone–Weierstrass theorem, if $x\mapsto E(Y\mid X=x)$ is continuous on some interval, it can be uniformly approximated by a polynomial function.
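As a quick illustration of the least-squares formula above, here is a minimal, self-contained sketch (with simulated data of my own choosing, not the sample used in the figures below): the polynomial fit can be obtained with lm() and poly(), or by solving the normal equations directly.

set.seed(1)
xx <- seq(0, 10, by = .25)
yy <- sin(xx / 2) + rnorm(length(xx)) / 4       # some smooth signal plus noise
X  <- cbind(1, xx, xx^2, xx^3)                  # design matrix with columns 1, x, x^2, x^3
beta_hat <- solve(t(X) %*% X, t(X) %*% yy)      # closed-form least squares (X'X)^{-1} X'y
reg <- lm(yy ~ poly(xx, 3, raw = TRUE))         # the same cubic fit via lm()
max(abs(coef(reg) - beta_hat))                  # essentially zero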
For illustration only, consider the following data set
db <- data.frame(x = xr, y = yr)   # xr, yr: the simulated sample used throughout (its generation is not shown in this excerpt)
plot(db)
And the standard regression line
reg = lm(y ~ x,data=db)
abline(reg,col="red")
Now consider a polynomial regression. If the degree of the polynomial is large enough, almost any shape can be obtained,
reg <- lm(y ~ poly(x, 5), data = db)
lines(xr, predict(reg), col = "red")
But with too high a degree, you get too much “fluctuation”,
reg <- lm(y ~ poly(x, 25), data = db)
lines(xr, predict(reg), col = "red")
And estimates can be unreliable: if we change a point, a (local) change may occur
yrm <- yr
yrm[31] <- yr[31] - 2                        # move one observation
dbm  <- data.frame(x = xr, y = yrm)
regm <- lm(y ~ poly(x, 25), data = dbm)      # refit the degree-25 polynomial (reconstructed step)
lines(xr, predict(regm), col = "red")
Local regression
In fact, if we are mainly interested in a good local approximation of $E(Y\mid X=x)$ around some point $x_0$, why not use a local regression?
This can easily be done with weighted least squares: we consider
$$\widehat{\boldsymbol\beta}=\underset{\boldsymbol\beta}{\operatorname{argmin}}\left\{\sum_{i=1}^n \omega_i\,\big[y_i-\mathbf x_i^\top\boldsymbol\beta\big]^2\right\}$$
(I am thinking of linear models here, but any polynomial model could be used). This optimization problem can be solved explicitly, since
$$\widehat{\boldsymbol\beta}=(\mathbf X^\top\boldsymbol\Omega\mathbf X)^{-1}\mathbf X^\top\boldsymbol\Omega\mathbf y,\qquad \boldsymbol\Omega=\operatorname{diag}(\omega_1,\dots,\omega_n).$$
For example, if we want to predict at some point $x_0$, we can take $\omega_i=\mathbf 1(|x_i-x_0|\le h)$: with these weights, observations that are too far from $x_0$ are simply discarded.
A more general idea is to use a kernel function $k(\cdot)$ to define the weights, $\omega_i=k\!\left(\frac{x_i-x_0}{h}\right)$, for some bandwidth $h$ controlling the length of the neighbourhood.
For the local constant fit, this is what is called the Nadaraya–Watson estimator,
$$\widehat m_h(x_0)=\frac{\sum_{i=1}^n k\!\left(\frac{x_i-x_0}{h}\right)y_i}{\sum_{i=1}^n k\!\left(\frac{x_i-x_0}{h}\right)}.$$
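For concreteness, here is a minimal sketch of the Nadaraya–Watson estimator written by hand (the function name nw, the Gaussian kernel and the example call are my own choices):

nw <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)     # kernel weights around x0
  sum(w * y) / sum(w)          # locally weighted average of the responses
}
# e.g. nw(2, db$x, db$y, h = 1) estimates E(Y | X = 2)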
In the previous case, we used the uniform (rectangular) kernel, $k(u)=\tfrac12\mathbf 1(|u|\le 1)$.
But this weight function is discontinuous, which is not the best choice; let us try the Gaussian kernel instead, $k(u)\propto e^{-u^2/2}$.
This can be done with

x0  <- 2                                    # the point where we perform the local regression (cf. below)
w   <- dnorm(xr - x0)                       # Gaussian weights, implicit bandwidth 1
reg <- lm(y ~ 1, data = db, weights = w)
On our data set, we can plot
w <- dnorm(xr - x0)
plot(db, cex = abs(w) * 4)              # point size proportional to its weight
lines(ul, vl0, col = "red")             # ul, vl0: local fit evaluated on a grid of x0 values (see the sketch below)
axis(2); axis(3)
reg <- lm(y ~ 1, data = db, weights = w)
u <- seq(0, 10, by = .02)
v <- predict(reg, newdata = data.frame(x = u))
lines(u, v, col = "red", lwd = 2)       # horizontal line: the local constant fit at x0
Here we perform a local regression at the point $x_0=2$. The horizontal red segment is the local (constant) regression at that point, and the size of each dot is proportional to its weight. The thin red curve shows how the local regression evolves as $x_0$ varies.
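In the snippet above, ul and vl0 presumably contain that red curve, i.e. the local fit evaluated on a grid of values of $x_0$. A minimal sketch of how such a curve can be computed (the grid and the object names are mine):

ul  <- seq(0, 10, by = .1)                        # grid of evaluation points x0
vl0 <- sapply(ul, function(x0) {
  w <- dnorm(xr - x0)                             # Gaussian weights centred at x0
  coef(lm(y ~ 1, data = db, weights = w))         # local constant fit = weighted mean of the y's
})
lines(ul, vl0, col = "red")                       # evolution of the local regression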
Let’s use animation to visualize the curve.
But for some reason I could not easily install that package on Linux, so we can instead use a loop to generate the frames
for (i in 1:length(vx0)) {                     # vx0: grid of values of x0, one frame each
  png(paste0("frame-", i, ".png"), 600, 400)   # file name chosen here for illustration
  graph(i)                                     # graph(): function drawing the i-th frame (defined outside this excerpt)
  dev.off()
}
The individual frames can then be assembled into an animated image.
Of course, you can think about local linear models,
loc_lin <- function(x0) {                               # wrapper reconstructed; only the return line is original
  reg <- lm(y ~ x, data = db, weights = dnorm(xr - x0))
  return(predict(reg, newdata = data.frame(x = x0)))
}
Even quadratic (local) regression,
reg <- lm(y ~ poly(x, degree = 2), data = db, weights = w)
Of course, we can change the bandwidth
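For instance (a small sketch, with the arbitrary value h = 2), the bandwidth simply rescales the kernel weights:

h <- 2
w <- dnorm((xr - x0) / h)                  # larger h: more observations receive non-negligible weight
reg <- lm(y ~ 1, data = db, weights = w)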
Note that we actually have to choose both the weight function (the so-called kernel) and the bandwidth. For the bandwidth, there are (easy) ways to pick a “best” value $h$: the idea of leave-one-out cross-validation is to choose $h$ minimizing
$$\mathrm{CV}(h)=\sum_{i=1}^n\big[y_i-\widehat m_h^{(-i)}(x_i)\big]^2,$$
where $\widehat m_h^{(-i)}(x_i)$ is the prediction at $x_i$ obtained by local regression when the $i$-th observation is left out.
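A minimal sketch of this leave-one-out cross-validation for the Nadaraya–Watson (local constant) estimator; the helper name cv_h and the grid of bandwidths are my own choices:

cv_h <- function(h) {
  err <- sapply(1:nrow(db), function(i) {
    w  <- dnorm((db$x[-i] - db$x[i]) / h)      # weights computed without the i-th observation
    mi <- sum(w * db$y[-i]) / sum(w)           # leave-one-out prediction at x_i
    (db$y[i] - mi)^2
  })
  sum(err)
}
hs <- seq(.2, 3, by = .1)
cv <- sapply(hs, cv_h)
plot(hs, cv, type = "l", xlab = "h", ylab = "CV(h)")
hs[which.min(cv)]                              # the "best" bandwidth on this grid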
We can try some real data.
library(XML)
data <- readHTMLTable(html)   # html: the page containing the table (its address is not given in this excerpt)
After tidying up the data set,
plot(data$no, data$mu, ylim = c(6, 10))
segments(data$no, data$mu - 1.96 * data$se,
         data$no, data$mu + 1.96 * data$se)    # upper bound reconstructed by symmetry
We calculate the standard error to reflect the uncertainty.
for (s in 1:8) {                                        # one separate regression per season (assumed 12 observations each)
  idx <- which(db$no > 12 * (s - 1) & db$no <= 12 * s)  # db: the tidied table from above
  reg <- lm(mu ~ no, data = db, subset = idx)
  lines(db$no[idx], predict(reg)[1:12])
}
It is not a good assumption that all seasons should be considered completely independent.
NW <- ksmooth(db$no, db$mu, kernel = "normal", bandwidth = 5)   # Nadaraya-Watson fit (ksmooth, base R kernel smoother)
lines(NW, col = "red")
We can also look at the curve with a larger bandwidth, or see how the estimate reacts when one observation is moved.
db$mu[95] <- 7                                                  # move one observation
NW <- ksmooth(db$no, db$mu, kernel = "normal", bandwidth = 5)   # recompute the fit (the bandwidth can be increased here)
plot(db$no, db$mu, ylim = c(6, 10))
lines(NW, col = "red")
Spline smoothing
Next, we discuss smoothing methods for regression. Assume that $y_i=m(x_i)+\varepsilon_i$, where $m(\cdot)$ is some unknown function, supposed to be smooth enough: for instance, $m(\cdot)$ is continuous, $m'(\cdot)$ exists and is continuous, $m''(\cdot)$ exists and is continuous, and so on. If $m(\cdot)$ is smooth enough, we can use a Taylor expansion with integral remainder: for $x$ in the interval of interest,
$$m(x)=m(0)+m'(0)\,x+\int_0^x (x-t)\,m''(t)\,dt,$$
or, written differently,
$$m(x)=\underbrace{m(0)+m'(0)\,x}_{\text{polynomial part}}+\int_0^x (x-t)_+\,m''(t)\,dt.$$
The first part is just a (here linear) polynomial. For the second part, approximating the integral by a Riemann sum over a grid of points $\{t_j\}$, observe that
$$\int_0^x (x-t)_+\,m''(t)\,dt\;\approx\;\sum_j \big[t_{j+1}-t_j\big]\,m''(t_j)\,(x-t_j)_+.$$
As a result,
$$m(x)\;\approx\;\beta_0+\beta_1 x+\sum_j \gamma_j\,(x-t_j)_+,$$
which is a linear regression model. A natural idea is therefore to regress $y$ on $x$ and on the functions $(x-t_j)_+$, for some given set of knots $\{t_j\}$.
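A direct sketch of that idea on the simulated data, with knots placed (arbitrarily, for illustration) at 2 and 5:

pos <- function(x, t) pmax(x - t, 0)             # truncated power function (x - t)_+
reg <- lm(yr ~ xr + pos(xr, 2) + pos(xr, 5))     # regression on x and the (x - t_j)_+ terms
coef(reg)                                        # the slope changes by the corresponding coefficient at each knot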
plot(db)
If we take a single knot and an expansion of order 1,
library(splines)
B   <- bs(xr, knots = c(3), Boundary.knots = c(0, 10), degree = 1)
reg <- lm(yr ~ B)                                       # fit reconstructed; needed for predict() below
lines(xr[xr <= 3], predict(reg)[xr <= 3], col = "red")
lines(xr[xr >= 3], predict(reg)[xr >= 3], col = "blue")
The predictions obtained with this spline can be compared with regressions on the two subsets (dashed lines).
reg1 <- lm(yr ~ xr, subset = xr <= 3)
lines(xr[xr <= 3], predict(reg1), col = "red", lty = 2)
reg2 <- lm(yr ~ xr, subset = xr >= 3)
lines(xr[xr >= 3], predict(reg2), col = "blue", lty = 2)
This is different, because here we have three parameters, while the two separate regressions have four: one degree of freedom is lost when continuity of the model is required. Observe that the model can equivalently be written
reg <- lm(yr ~ bs(xr, knots = c(3), Boundary.knots = c(0, 10), degree = 1))
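To check this equivalence numerically (a quick sketch on the same data): the bs() fit and the truncated-power parametrisation give exactly the same fitted values, with three coefficients each.

reg_bs <- lm(yr ~ bs(xr, knots = c(3), Boundary.knots = c(0, 10), degree = 1))
reg_tp <- lm(yr ~ xr + pmax(xr - 3, 0))      # continuous, piecewise linear, broken at x = 3
max(abs(fitted(reg_bs) - fitted(reg_tp)))    # essentially zero
length(coef(reg_bs))                         # 3 parameters, as claimed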
The functions that appear in the regression are as follows
Plotting these two components, we get
matplot(xr, B, type = "l")
abline(v = c(0, 2, 5, 10), lty = 2)
If we add a knot, we get
The prediction is then
lines(xr,predict(reg),col="red")
We can also choose more knots
lines(xr,predict(reg),col="red")
We can get a confidence interval
P <- predict(reg, interval = "confidence")              # fit, lower and upper bounds (reconstructed step)
polygon(c(xr, rev(xr)), c(P[, 2], rev(P[, 3])), col = "grey", border = NA)
points(db)
lines(xr, P[, 1], col = "red")
If we keep the two knots chosen earlier, but consider a Taylor expansion of order 2, we get
B   <- bs(xr, knots = c(2, 5), Boundary.knots = c(0, 10), degree = 2)   # quadratic basis (knots taken from the dashed lines)
reg <- lm(yr ~ B)                                                       # fit used in the decomposition below (reconstructed)
matplot(xr, B, type = "l")
abline(v = c(0, 2, 5, 10), lty = 2)
If we keep only the constant term and the first component of the spline basis, we get
B <- cbind(1, B)          # prepend the constant column, so the columns match coefficients(reg)
k <- 2                    # constant + first basis function (reconstructed; k follows the values used below)
lines(xr, B[, 1:k] %*% coefficients(reg)[1:k], col = k - 1, lty = k - 1)
If we add the constant term, the first term and the second term, we get the part on the left, before the first knot,
k <- 3
lines(xr, B[, 1:k] %*% coefficients(reg)[1:k], col = k - 1, lty = k - 1)
Adding one more column of the spline design matrix, we get the part between the two knots,
k <- 4                    # the next column of the basis (reconstructed)
lines(xr, B[, 1:k] %*% coefficients(reg)[1:k], col = k - 1, lty = k - 1)
And finally, summing all of them, we get the right-hand part, after the last knot,
k <- 5
lines(xr, B[, 1:k] %*% coefficients(reg)[1:k], col = k - 1, lty = k - 1)
This is what we get using a quadratic spline regression with two (fixed) knots. We can obtain a confidence interval just as before,
P <- predict(reg, interval = "confidence")
polygon(c(xr, rev(xr)), c(P[, 2], rev(P[, 3])), col = "grey", border = NA)
points(db)
lines(xr, P[, 1], col = "red")
Using these spline basis functions guarantees the continuity of the fit at the knots.
Going back to the real data, using linear spline functions we can again impose continuity constraints,
reg <- lm(mu ~ bs(no, knots = c(12 * (1:7) + .5), Boundary.knots = c(0, 97), degree = 1), data = db)
lines(db$no, predict(reg), col = "red")
But we can also think about quadratic splines,
abline(v = 12 * (0:8) + .5, lty = 2)
reg <- lm(mu ~ bs(no, knots = c(12 * (1:7) + .5), Boundary.knots = c(0, 97), degree = 2), data = db)
lines(db$no, predict(reg), col = "red")     # quadratic spline fit (plotting step reconstructed)