Link to original article: tecdat.cn/?p=22511
Original source: Tuoduan Data Tribe public account
The standard ARIMA (autoregressive integrated moving average) model makes predictions based only on past values of the variable being forecast. It assumes that the future value of a variable depends linearly on its own past values and on past random shocks. The ARIMAX model extends ARIMA by also including other independent (predictor) variables. It is sometimes also referred to as vector ARIMA or as a dynamic regression model.
ARIMAX models are similar to multivariable regression models, but they exploit any autocorrelation in the regression residuals to improve forecast accuracy. This exercise set provides practice in forecasting with ARIMAX models. It also covers checking the regression coefficients for statistical significance.
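The idea of a regression whose residuals follow an ARIMA process can be written compactly. The form below is a sketch of a regression with ARIMA errors, which is the kind of model `auto.arima` from the forecast package estimates when it is given an `xreg` argument. The symbols ($y_t$ for consumption, $x_{j,t}$ for the regressors, $\eta_t$ for the regression residual, $\varepsilon_t$ for white noise, $B$ for the backshift operator) are notation introduced here for illustration and do not appear in the original article.

```latex
y_t = \beta_0 + \beta_1 x_{1,t} + \dots + \beta_k x_{k,t} + \eta_t,
\qquad
\phi(B)\,(1 - B)^d\,\eta_t = \theta(B)\,\varepsilon_t
```

Here $\phi(B)$ and $\theta(B)$ are the autoregressive and moving average polynomials of orders $p$ and $q$, and $d$ is the order of differencing. With $p = d = q = 0$ the model reduces to an ordinary linear regression.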
These exercises use ice cream consumption data. The data set contains the following variables:
- Ice cream consumption in the United States (per capita)
- Average weekly household income
- The price of ice cream
- Average temperature
The data set contains 30 observations. Each corresponds to a four-week period between March 18, 1951, and July 11, 1953.
Exercise 1
Load the data set and plot the variables cons (ice cream consumption), temp (temperature), and income.
p3 <- ggplot(df, aes(x = x, y = income)) + geom_line() + ylab("income") + xlab("time")
grid.arrange(p1, p2, p3, ncol = 1, nrow = 3)
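A minimal sketch of loading and plotting the data follows. It assumes the data are the `Icecream` data set shipped with the Ecdat package (the article does not say where the data come from), and it swaps in base R graphics instead of ggplot2 so that only one extra package is needed. The column `x` is a hypothetical time index added for plotting.

```r
# Assumption: the data are the Icecream data set from the Ecdat package.
data(Icecream, package = "Ecdat")
df <- Icecream
df$x <- seq_len(nrow(df))  # hypothetical time index, one value per four-week period

# Three stacked line plots, one per variable
old_par <- par(mfrow = c(3, 1), mar = c(4, 4, 1, 1))
plot(df$x, df$cons,   type = "l", xlab = "time", ylab = "consumption")
plot(df$x, df$temp,   type = "l", xlab = "time", ylab = "temperature")
plot(df$x, df$income, type = "l", xlab = "time", ylab = "income")
par(old_par)  # restore the previous graphics settings
```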
Exercise 2
Estimate an ARIMA model for the ice cream consumption data. Then pass the fitted model to the forecast function to obtain forecasts for the next 6 periods.
fit_cons <- auto.arima(cons)
fcast_cons <- forecast(fit_cons, h = 6)
Exercise 3
Plot the forecast.
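One possible way to produce the plot is sketched below. It assumes the forecast package is installed and that the consumption series comes from the `Icecream` data set in the Ecdat package; both are assumptions, since the article shows no code for this step.

```r
library(forecast)

# Assumption: the consumption series is the cons column of Ecdat's Icecream data.
data(Icecream, package = "Ecdat")
cons <- ts(Icecream$cons)

fit_cons   <- auto.arima(cons)             # automatic order selection
fcast_cons <- forecast(fit_cons, h = 6)    # forecasts for the next 6 periods

# plot() on a forecast object draws the history, point forecasts and intervals
plot(fcast_cons, xlab = "time", ylab = "consumption")
```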
Exercise 4
Find the mean absolute scaled error (MASE) of the fitted ARIMA model.
accuracy(fcast_cons)
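For reference, the in-sample MASE for a non-seasonal series can be written as follows. This is a sketch of the standard definition, with notation introduced here: $e_t$ is the one-step in-sample forecast error, $y_t$ is the observed series, and $T$ is the number of observations.

```latex
\mathrm{MASE}
  = \frac{\dfrac{1}{T}\sum_{t=1}^{T} \lvert e_t \rvert}
         {\dfrac{1}{T-1}\sum_{t=2}^{T} \lvert y_t - y_{t-1} \rvert}
```

The denominator is the mean absolute error of the naive forecast that repeats the previous observation, so a MASE below 1 means the model beats that naive forecast on average over the training data.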
Exercise 5
Estimate an extended ARIMA model for the consumption data, using the temperature variable as an additional regressor (use the auto.arima function). Then forecast the next six periods. Note that this forecast requires an assumption about expected temperatures; assume that the temperature in the next six periods is given by the following vector:
fcast_temp <- c(70.5, 66, 60.5, 45.5, 36, 28)
Plot the resulting forecast.
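The steps of this exercise might be sketched as follows. The sketch assumes the forecast package and the `Icecream` data set from Ecdat; the names `temp_hist`, `temp_future`, `fit_cons_temp` and `fcast_cons_temp` are illustrative and not taken from the article. The regressor is passed as a one-column matrix with a column name so that the training and forecasting regressors match by name.

```r
library(forecast)

# Assumption: the data are Ecdat's Icecream data set.
data(Icecream, package = "Ecdat")
cons <- ts(Icecream$cons)

# Historical temperatures as a named one-column matrix
temp_hist <- cbind(temp = Icecream$temp)
fit_cons_temp <- auto.arima(cons, xreg = temp_hist)

# Assumed temperatures for the next six periods (vector given in the exercise)
fcast_temp  <- c(70.5, 66, 60.5, 45.5, 36, 28)
temp_future <- cbind(temp = fcast_temp)

# With xreg supplied, the horizon is the number of rows of the new regressor matrix
fcast_cons_temp <- forecast(fit_cons_temp, xreg = temp_future)
plot(fcast_cons_temp, xlab = "time", ylab = "consumption")
```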
Exercise 6
Print a summary of the forecast obtained. Find the coefficient of the temperature variable, its standard error, and the MASE of the forecast. Compare this MASE with that of the initial forecast.
summary(fca)
The coefficient of the temperature variable is 0.0028.
The standard error of this coefficient is 0.0007.
The mean absolute scaled error is 0.7354048, which is smaller than that of the initial model (0.8200619).
Exercise 7
Check the statistical significance of the temperature coefficient. Is the coefficient statistically significant at the 5% level?
# coeftest() is provided by the lmtest package
coeftest(fit)
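As a rough check by hand, the test can be sketched from the rounded values reported in Exercise 6, assuming the estimate is approximately normally distributed. $\Phi$ denotes the standard normal distribution function; the notation is introduced here for illustration.

```latex
z = \frac{\hat{\beta}_{\text{temp}}}{\operatorname{SE}(\hat{\beta}_{\text{temp}})}
  = \frac{0.0028}{0.0007} = 4.0,
\qquad
p = 2\,\bigl(1 - \Phi(\lvert z \rvert)\bigr) \approx 6.3 \times 10^{-5} < 0.05
```

Because the inputs are rounded, the statistic reported by the software may differ slightly, but under this approximation the coefficient is significant at the 5% level.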
Exercise 8
The function that estimates the ARIMA model accepts more than one additional regressor, but only in the form of a matrix. Create a matrix with the following columns:
- values of the temperature variable
- values of the income variable
- values of the income variable lagged by one period
- values of the income variable lagged by two periods
Print the matrix. Note: the last three columns can be created by prepending two NAs to the vector of income values and passing the resulting vector to the embed function (with the dimension argument equal to the number of columns to be created).
vars <- cbind(temp, income)
print(vars)
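The lag construction described in the note can be illustrated on a short toy example. The values below are hypothetical placeholders chosen only to make the output easy to read; they are not the exercise data, and the column names are illustrative.

```r
# Hypothetical toy values, NOT the real data set
temp   <- c(41, 56, 63, 68, 69)
income <- c(78, 79, 81, 80, 76)

# Prepend two NAs, then embed() with dimension = 3 returns the columns
# income[t], income[t-1], income[t-2], with one row per original observation
income_lags <- embed(c(NA, NA, income), 3)

vars <- cbind(temp, income_lags)
colnames(vars) <- c("temp", "income", "income_lag1", "income_lag2")
print(vars)
```

The first row has NA in both lag columns and the second row has NA in the two-period lag, because no earlier income values exist for those periods.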
Exercise 9
Use the matrix obtained to fit three extended ARIMA models with the following variables as additional regressors:
- temperature and income
- temperature, and income at lags 0 and 1
- temperature, and income at lags 0, 1, and 2
Examine the summary of each model and find the model with the lowest value of the Akaike information criterion (AIC). Note that AIC cannot be used to compare ARIMA models with different orders of differencing (the middle term in the model specification), because the number of observations used differs. For example, the AIC of a non-differenced model ARIMA(p, 0, q) cannot be compared with that of a differenced model ARIMA(p, 1, q).
fit0 <- auto.arima(cons, xreg = vars)
print(fit0$aic)
AIC can be used here because the order of differencing is the same (0) for every model.
The model with the lowest AIC is the first model.
Its AIC is -113.3.
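The comparison of the three models might be sketched as follows, assuming the forecast package and the `Icecream` data set from Ecdat. The object names (`vars`, `fits`, `aics`, `d_orders`) and the column names are illustrative. The sketch also prints the differencing order of each fit, since the AIC values are only comparable when those orders agree.

```r
library(forecast)

# Assumption: the data are Ecdat's Icecream data set.
data(Icecream, package = "Ecdat")
df <- Icecream

# Regressor matrix: temperature, income, income lagged by 1 and by 2 periods
vars <- cbind(df$temp, embed(c(NA, NA, df$income), 3))
colnames(vars) <- c("temp", "income", "income_lag1", "income_lag2")

# Fit one model per set of regressors (first 2, 3 and 4 columns)
fits <- lapply(2:4, function(k) auto.arima(df$cons, xreg = vars[, 1:k]))
names(fits) <- c("fit0", "fit1", "fit2")

aics     <- sapply(fits, function(f) f$aic)
d_orders <- sapply(fits, function(f) unname(arimaorder(f)["d"]))

print(aics)      # lower is better
print(d_orders)  # compare AIC only across models with equal d
```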
Exercise 10
Use the model found in the previous exercise to forecast the next six periods, and plot the forecast. The forecast requires a matrix of expected temperature and income for the next six periods; create it from the fcast_temp variable and the following expected income values: 91, 91, 93, 96, 96, 96. Find the mean absolute scaled error of this model and compare it with the errors of the first two models in this exercise set.
The model with two external regressors has the lowest mean absolute scaled error (0.528).
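A sketch of this last step is shown below, again assuming the forecast package and the `Icecream` data set from Ecdat. The names `xreg_hist`, `xreg_future`, `fit_two` and `fcast_two` are illustrative. The future regressor matrix must have the same columns, in the same order, as the matrix used for fitting.

```r
library(forecast)

# Assumption: the data are Ecdat's Icecream data set.
data(Icecream, package = "Ecdat")
df <- Icecream

# Model with two external regressors: temperature and income
xreg_hist <- cbind(temp = df$temp, income = df$income)
fit_two   <- auto.arima(df$cons, xreg = xreg_hist)

# Expected regressor values for the next six periods (given in the exercises)
xreg_future <- cbind(
  temp   = c(70.5, 66, 60.5, 45.5, 36, 28),
  income = c(91, 91, 93, 96, 96, 96)
)

fcast_two <- forecast(fit_two, xreg = xreg_future)
plot(fcast_two, xlab = "time", ylab = "consumption")

# In-sample MASE, read from the accuracy table
mase_two <- accuracy(fcast_two)["Training set", "MASE"]
print(mase_two)
```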