Learning notes
Reference Book: Econometrics
Trend stationarity and difference stationarity
Trend-stationary and difference-stationary stochastic processes
- Spurious regression
Non-stationary economic time series often share a common trend even when there is no direct relationship between them. Regressing one such series on another can yield a high $R^2$ and still have no practical meaning. This phenomenon is called spurious regression.
A common way to avoid spurious regression is to include time $T$ as a trend variable, so that the regression with the time trend removes the influence of the trend. This works, however, only if the trend is deterministic rather than stochastic. In other words, only a non-stationary series whose trend is deterministic can have that trend separated out by introducing a trend variable.
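A small simulation can make the spurious-regression problem concrete. The sketch below is only an illustration: the helper name `spurious_rate`, the sample size and the number of replications are assumptions chosen for these notes, not taken from the reference book. It repeatedly regresses one random walk on another, independent one and records how often the slope looks significant at the 5% level.

```r
# Share of regressions between two INDEPENDENT random walks whose slope
# appears "significant" at the 5% level (hypothetical helper, illustration only).
spurious_rate <- function(n_obs = 100, n_rep = 200, seed = 1) {
  set.seed(seed)
  rejected <- logical(n_rep)
  for (r in seq_len(n_rep)) {
    x <- cumsum(rnorm(n_obs))   # random walk 1
    y <- cumsum(rnorm(n_obs))   # random walk 2, generated independently of x
    fit <- summary(lm(y ~ x))
    rejected[r] <- fit$coefficients["x", "Pr(>|t|)"] < 0.05
  }
  mean(rejected)
}

rate <- spurious_rate()
print(rate)  # a correctly sized test would reject in roughly 5% of the runs
```

If the printed share is far above 0.05, the usual t test is clearly unreliable for such series, which is the problem described above.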
- Stochastic trend and deterministic trend
Consider the following first-order autoregressive stochastic process:
$$X_t = \alpha + \beta T + \rho X_{t-1} + \mu_t \tag{1}$$
where $\mu_t$ is white noise and $T$ is a time trend.
If $\rho = 1$ and $\beta = 0$, equation (1) is a random walk with drift. Depending on whether $\alpha$ is positive or negative, $X_t$ shows a clear upward or downward trend, which is called a stochastic trend.
If $\rho = 0$ and $\beta \neq 0$, equation (1) is a stochastic process with a time trend. Depending on whether $\beta$ is positive or negative, $X_t$ shows a clear upward or downward trend, which is called a deterministic trend.
If $\rho = 1$ and $\beta \neq 0$, $X_t$ contains both a deterministic and a stochastic trend.
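The three cases can be compared by simulation. The following is a minimal sketch, assuming equation (1) has the form $X_t = \alpha + \beta T + \rho X_{t-1} + \mu_t$ with $X_0 = 0$; the function name `simulate_ar1_trend`, its arguments and the parameter values are made up for illustration.

```r
# Simulate X_t = alpha + beta * t + rho * X_{t-1} + mu_t, starting from X_0 = x0.
# The functional form and all argument names are assumptions of this sketch.
simulate_ar1_trend <- function(n, alpha, beta, rho, sd = 1, x0 = 0) {
  mu <- rnorm(n, mean = 0, sd = sd)  # white noise
  x <- numeric(n)
  prev <- x0
  for (t in seq_len(n)) {
    x[t] <- alpha + beta * t + rho * prev + mu[t]
    prev <- x[t]
  }
  x
}

set.seed(42)
stochastic    <- simulate_ar1_trend(200, alpha = 0.2, beta = 0,    rho = 1)  # random walk with drift
deterministic <- simulate_ar1_trend(200, alpha = 0.2, beta = 0.05, rho = 0)  # time trend plus noise
both          <- simulate_ar1_trend(200, alpha = 0.2, beta = 0.05, rho = 1)  # both kinds of trend
```

Plotting each series, for example with `plot(stochastic, type = "l")`, shows that all three drift upward, which is why a plot alone usually cannot tell the two kinds of trend apart.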
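Model 3 of the ADF test can be used to judge whether a non-stationary time series has a stochastic or a deterministic trend:
$$\Delta X_t = \alpha + \beta T + \delta X_{t-1} + \sum_{i=1}^{m} \gamma_i \Delta X_{t-i} + \varepsilon_t$$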
If the test indicates that the series has a unit root and the coefficient on the time variable $T$ is not significantly different from zero, the series has a stochastic trend. If there is no unit root and the coefficient on $T$ is significantly different from zero, the series has a deterministic trend.
- Difference-stationary process and trend-stationary process
A stochastic trend can be removed by differencing. For example, a random walk with drift becomes stationary after first differencing:
$$\Delta X_t = X_t - X_{t-1} = \alpha + \mu_t$$
Such a series $X_t$ is called a difference-stationary process.
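The differencing step can be checked numerically. This is a sketch under the assumption that the random walk with drift is $X_t = \alpha + X_{t-1} + \mu_t$ with $X_0 = 0$; the drift value and sample size are arbitrary choices.

```r
set.seed(7)
n     <- 500
alpha <- 0.3                  # drift, chosen arbitrarily
mu    <- rnorm(n)             # white noise
x     <- cumsum(alpha + mu)   # X_t = alpha + X_{t-1} + mu_t with X_0 = 0

dx <- diff(x)                 # Delta X_t = X_t - X_{t-1}

# The differenced series equals alpha + mu_t for t = 2, ..., n ...
all.equal(dx, alpha + mu[-1])
# ... so its sample mean is an estimate of the drift alpha.
mean(dx)
```

The level series `x` wanders without returning to a fixed mean, while `dx` fluctuates around the constant `alpha`.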
A deterministic trend, by contrast, is not handled by differencing and should instead be eliminated by removing the trend term. For example, a process with a time trend becomes stationary once the term in the time variable $T$ is removed:
$$X_t - \beta T = \alpha + \mu_t$$
Such a series $X_t$ is called a trend-stationary process. A trend-stationary process fluctuates around a stable long-run path, so long-term forecasts based on it are more reliable.
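Detrending can be sketched in the same way, assuming the trend-stationary process is $X_t = \alpha + \beta T + \mu_t$ and that the trend is estimated by ordinary least squares on a time index (the parameter values are arbitrary). Differencing the same series is included for comparison: algebraically it gives $\beta + \mu_t - \mu_{t-1}$, so the noise becomes a moving average with lag-1 autocorrelation of $-0.5$ instead of white noise.

```r
set.seed(11)
n     <- 500
tt    <- seq_len(n)                # time index
alpha <- 3
beta  <- 2
mu    <- rnorm(n)                  # white noise
x     <- alpha + beta * tt + mu    # X_t = alpha + beta * t + mu_t

# Detrending: regress on time and keep the residuals.
fit       <- lm(x ~ tt)
detrended <- residuals(fit)

# Differencing instead yields beta + mu_t - mu_{t-1}.
dx   <- diff(x)
rho1 <- acf(dx, lag.max = 1, plot = FALSE)$acf[2]  # sample lag-1 autocorrelation

coef(fit)  # estimates of alpha and beta
rho1       # theoretical value for this process is -0.5
```

The residuals `detrended` track the original noise `mu` closely, whereas the differenced series carries autocorrelation that the original noise did not have.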
R implementation
Next, two time series are simulated, and the ADF test is used to determine which kind of trend (stochastic or deterministic) each one has.
Simulated series 1 (random walk):
# Simulate a random walk
library(tseries)  # provides adf.test()
set.seed(1238)
n <- 1000
y <- numeric(n)
y[1] <- 0
for (i in 2:n) {
  y[i] <- y[i - 1] + rnorm(1, 0, 1 / n)
}
# Regression of ADF model 3 (no lagged differences): significance of the time variable tt
x <- y
k <- 1
x <- as.vector(x, mode = "double")
x_1 <- diff(x)       # first differences
n <- length(x_1)
z <- embed(x_1, k)
yt <- z[, 1]         # dependent variable: differenced series
xt1 <- x[k:n]        # lagged level
tt <- k:n            # time trend
res <- lm(yt ~ xt1 + 1 + tt)
summary(res)$coefficients
# ADF test result
adf.test(y)$p.value
Console output:
> summary(res)$coefficients
                 Estimate   Std. Error    t value   Pr(>|t|)
(Intercept)  3.512132e-05 6.686299e-05  0.5252729 0.59951037
xt1         -4.570506e-03 3.345454e-03 -1.3661843 0.17218951
tt           2.859225e-07 1.665257e-07  1.7169867 0.08629248
> adf.test(y)$p.value
[1] 0.8823519
The coefficient on the time variable tt has a p-value of about 0.086, which is above 0.05, so the hypothesis that this coefficient is zero cannot be rejected and there is no evidence of a deterministic trend. The ADF test p-value of about 0.88 means that the unit-root hypothesis cannot be rejected either, so the series is treated as having a unit root, that is, a stochastic trend.
Simulated series 2 (deterministic time trend):
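# Simulate a series with a deterministic time trend
n <- 100
x <- c(1:n)
z <- rnorm(n, 0, 1 / n)
y <- 2 * x + 3 + z
# Regression of ADF model 3 (no lagged differences): significance of the time variable tt
x <- y
k <- 1
x <- as.vector(x, mode = "double")
x_1 <- diff(x)       # first differences
n <- length(x_1)
z <- embed(x_1, k)
yt <- z[, 1]         # dependent variable: differenced series
xt1 <- x[k:n]        # lagged level
tt <- k:n            # time trend
res <- lm(yt ~ xt1 + 1 + tt)
summary(res)$coefficients
# ADF test result
adf.test(y)$p.value  # adf.test() is from the tseries package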
Console output:
> summary(res)$coefficients
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept)  4.7066541  0.3032464 15.520891 6.454279e-28
xt1         -0.9024946  0.1010956 -8.927142 2.998167e-14
tt           1.8050169  0.2021935  8.927177 2.997647e-14
> adf.test(y)$p.value
[1] 0.01
Here the coefficient on the time variable tt has a p-value far below 0.05, so the hypothesis that this coefficient is zero is rejected, indicating a deterministic trend. The ADF test p-value of 0.01 rejects the unit-root hypothesis, so the series has no unit root and therefore no stochastic trend.
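Both runs repeat the same regression steps, so they could be wrapped in one function. The helper below is a hypothetical base-R sketch (the name `trend_regression` and the variable names are not from the original code); it fits the same regression without lagged differences and leaves the unit-root test itself to `adf.test`. One caveat worth keeping in mind: when a unit root is present, the t statistics of this kind of regression do not follow the usual t distribution, so the p-values printed by `lm` are only a rough guide.

```r
# Hypothetical helper: trend regression without lagged differences,
#   Delta y_t = a + delta * y_{t-1} + b * t + e_t
trend_regression <- function(y) {
  y   <- as.vector(y, mode = "double")
  dy  <- diff(y)               # Delta y_t
  lag <- y[-length(y)]         # y_{t-1}
  tt  <- seq_along(dy)         # time index
  summary(lm(dy ~ lag + tt))$coefficients
}

# Example on a freshly simulated trend series (values will differ from the output above).
set.seed(1238)
n <- 100
trend_series <- 2 * seq_len(n) + 3 + rnorm(n, 0, 1 / n)
tab <- trend_regression(trend_series)
tab
```

The row `tt` of the returned table holds the estimate and p-value for the time trend, and the row `lag` holds the coefficient on the lagged level.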