Original link: tecdat.cn/?p=3364
Original source: Tuoduan Data Tribe (拓端数据部落) official account
Load R packages and data sets
After loading the package, we subset the 12 mood variables contained in this dataset:
library(mgm)

# Subset the 12 mood variables and their labels
mood_data <- as.matrix(symptom_data$data[, 1:12])
mood_labels <- symptom_data$colnames[1:12]
colnames(mood_data) <- mood_labels
# Timestamp information for each measurement
time_data <- symptom_data$data_time
The mood_data object is a 1476×12 matrix that measures 12 mood variables:
> dim(mood_data)
[1] 1476   12
> head(mood_data[,1:7])
     Relaxed Down Irritated Satisfied Lonely Anxious Enthusiastic
[1,] 5 -1 1 5 -1 -1 4
[2,] 4 0 3 3 0 3
[3,] 4 0 2 3 0 4
[4,] 4 0 1 4 0 4
[5,] 4 0 2 4 0 4
[6,] 5 0 1 4 0 3
The time_data object contains timestamp information for each measurement. This information is required for data preprocessing.
> head(time_data)
date dayno beepno beeptime resptime_s resptime_e time_norm
1 13/08/12 226 1 08:58 08:58:56 09:00:15 0.000000000
2 14/08/12 227 5 14:32 14:32:09 14:33:25 0.005164874
3 14/08/12 227 6 16:17 16:17:13 16:23:16 0.005470574
4 14/08/12 227 8 18:04 18:04:10 18:06:29 0.005782097
5 14/08/12 227 9 20:57 20:58:23 21:00:18 0.006285774
6 14/08/12 227 10 21:54 21:54:15 21:56:05 0.006451726
Some of the variables in this data set are highly skewed, which can lead to unreliable parameter estimates. Here, we deal with this problem by computing bootstrapped confidence intervals (KS method) and credible intervals (GAM method) to judge the reliability of the estimates. Since this tutorial focuses on estimating time-varying VAR models, we will not examine the skewness of the variables in detail. In practice, however, the marginal distributions should always be checked before fitting a (time-varying) VAR model.
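One quick way to check marginal distributions is to compute the sample skewness of each column and to look at histograms. The sketch below uses simulated data as a stand-in for mood_data; the helper sample_skewness and the toy matrix are assumptions made for illustration and are not part of the original tutorial.

```r
# Hypothetical helper: moment-based sample skewness of a numeric vector
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

set.seed(1)
# Toy stand-in for mood_data: one symmetric and one right-skewed variable
toy_data <- cbind(Symmetric = rnorm(500), Skewed = rexp(500))

# Skewness per column; values far from 0 flag an asymmetric distribution
skew <- apply(toy_data, 2, sample_skewness)
print(round(skew, 2))

# A histogram per column gives a visual check, for example:
# hist(toy_data[, "Skewed"], main = "Skewed")
```

With the real data, the same apply() call could be run on mood_data directly.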
Estimate the time-varying VAR model
We specify a VAR model with one lag via lags = 1 and select the regularization parameter λ by cross-validation via lambdaSel = "CV". Finally, with the argument scale = TRUE, we specify that all variables should be standardized before the model is fit. This is recommended when ℓ1 regularization is used, because otherwise the strength of the penalty on a parameter depends on the variance of the predictor variable. Since the cross-validation scheme is defined using random draws, we set a seed to ensure reproducibility.
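To see why standardization matters for a penalty on coefficient size, consider the same predictor expressed on two different scales. The following is a minimal base R sketch with simulated data (all variable names are made up for illustration); it uses ordinary least squares rather than mgm, but the same scale dependence carries over to penalized estimates.

```r
set.seed(1)
n <- 200
x_small <- rnorm(n)        # predictor with roughly unit variance
x_large <- 10 * x_small    # same predictor on a 10 times larger scale
y <- 0.5 * x_small + rnorm(n)

# The slopes differ by exactly the factor 10, so a penalty on the absolute
# coefficient would hit the first version ten times harder than the second
b_small <- unname(coef(lm(y ~ x_small))[2])
b_large <- unname(coef(lm(y ~ x_large))[2])
print(c(b_small = b_small, b_large = b_large))

# After standardizing, both versions of the predictor give the same slope
b_small_z <- unname(coef(lm(y ~ scale(x_small)))[2])
b_large_z <- unname(coef(lm(y ~ scale(x_large)))[2])
print(c(b_small_z = b_small_z, b_large_z = b_large_z))
```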
Before looking at the results, we check how many of the 1476 time points were used for estimation. This is shown in the summary printed when calling the output object:
> tvvar_obj
mgm fit-object

Model class: Time-varying mixed Vector Autoregressive (tv-mVAR) model
Lags: 1
Rows included in VAR design matrix: 876 / 1475 (59.39%)
Nodes: 12
Estimation points: 20
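With one lag, the first measurement has no predecessor, which leaves 1475 candidate rows. Fewer rows are included because a row can only be used when the preceding measurement is actually available. The sketch below illustrates one plausible rule on the six rows of time_data printed earlier: a row counts as usable only if the previous row is the directly preceding beep on the same day. This rule is an assumption made for illustration; the bookkeeping inside mgm may differ in detail.

```r
# First six rows of time_data as printed above
dayno  <- c(226, 227, 227, 227, 227, 227)
beepno <- c(1, 5, 6, 8, 9, 10)

n <- length(dayno)
# Row t is usable if row t - 1 is the directly preceding beep on the same day;
# the first row never has a predecessor
usable <- c(FALSE,
            dayno[-1] == dayno[-n] & beepno[-1] == beepno[-n] + 1)

print(usable)
print(sum(usable))            # number of usable rows
print(sum(usable) / (n - 1))  # share of the n - 1 candidate rows
```

In this small excerpt, rows 3, 5, and 6 qualify, while row 2 follows a night and row 4 follows a missed beep.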
The absolute values of the estimated VAR coefficients are stored in the object tvvar_obj$wadj, which is an array of dimensions p × p × lags × estpoints.
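A four-dimensional array of this shape can be sliced with ordinary R indexing. The sketch below uses a mock array filled with random numbers in place of tvvar_obj$wadj, so the values are meaningless and only the indexing pattern matters. Which of the first two indices refers to the predicted variable and which to the predictor should be checked in the mgm documentation.

```r
p <- 12; n_lags <- 1; n_est <- 20

set.seed(1)
# Mock stand-in for tvvar_obj$wadj: absolute coefficients, p x p x lags x estpoints
wadj_mock <- array(abs(rnorm(p * p * n_lags * n_est, sd = 0.1)),
                   dim = c(p, p, n_lags, n_est))

# Time course of one lag-1 coefficient across the 20 estimation points
coef_path <- wadj_mock[1, 3, 1, ]
# Full p x p coefficient matrix at a single estimation point
coef_mat <- wadj_mock[, , 1, 10]
# Coefficient matrix averaged over all estimation points
mean_mat <- apply(wadj_mock[, , 1, ], 1:2, mean)

print(length(coef_path))
print(dim(coef_mat))
print(dim(mean_mat))
```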
Reliability of parameter estimation
res_obj <- resample(object = tvvar_obj,
data = mood_data,
nB = 50,
blocks = 10, seeds = 1:50,
quantiles = c(.05, .95))
The object res_obj$bootParameters contains the empirical sampling distribution of each parameter.
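Bootstrap intervals can be read off such a sampling distribution by taking quantiles across the bootstrap samples. The sketch below assumes the array is laid out as p × p × lags × estpoints × nB, which matches the way it is indexed in the plotting code of this tutorial, and uses a mock array of random numbers in place of res_obj$bootParameters.

```r
p <- 12; n_est <- 20; nB <- 50

set.seed(1)
# Mock stand-in for res_obj$bootParameters: p x p x lags x estpoints x nB
boot_mock <- array(rnorm(p * p * 1 * n_est * nB, sd = 0.1),
                   dim = c(p, p, 1, n_est, nB))

# Bootstrap samples of one parameter: 20 estimation points x 50 samples
samples <- boot_mock[1, 3, 1, , ]

# 5% and 95% quantiles at every estimation point (a 2 x 20 matrix)
CIs <- apply(samples, 1, quantile, probs = c(.05, .95))

print(dim(CIs))
print(round(CIs[, 1:3], 3))
```

The first row of CIs holds the lower bounds and the second row the upper bounds, one column per estimation point.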
Compute time-varying prediction error
The function predict() computes predictions and prediction errors for a given mgm model object.
The predictions are stored in pred_obj$predicted, and the prediction errors combined over all time-varying models are stored in pred_obj$errors:
> pred_obj$errors
       Variable error.rmse error.r2
1       Relaxed      0.939    0.155
2          Down      0.825    0.297
3     Irritated      0.942    0.119
...
7  Enthusiastic      0.922    0.169
8    Suspicious      0.818    0.247
9      Cheerful      0.889    0.200
10       Guilty      0.928    0.175
11        Doubt      0.871    0.268
12       Strong      0.896    0.195
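As a reminder of what the two error measures express, the following base R sketch computes a root mean squared error and a proportion of explained variance for one variable. The helper prediction_errors and the toy numbers are hypothetical; the exact definitions used inside mgm may differ, for example in how the variables are scaled.

```r
# Hypothetical helper: RMSE and R^2 of predictions for a single variable
prediction_errors <- function(observed, predicted) {
  residuals <- observed - predicted
  rmse <- sqrt(mean(residuals^2))
  r2 <- 1 - sum(residuals^2) / sum((observed - mean(observed))^2)
  c(rmse = rmse, r2 = r2)
}

# Toy data, not taken from the mood data set
observed  <- c(1, 2, 3, 4, 5)
predicted <- c(1.5, 2, 2.5, 4, 5.5)

err <- prediction_errors(observed, predicted)
print(round(err, 3))
```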
Visualize time-varying VAR models
Visualize a portion of the VAR parameters estimated above over time:
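library(qgraph)

# Mean coefficient matrix over all estimation points, used to obtain a stable layout
mean_wadj <- apply(tvvar_obj$wadj[, , 1, ], 1:2, mean)

# Compute the layout of the mean graph without plotting it
Q <- qgraph(t(mean_wadj), DoNotPlot = TRUE)
saveRDS(Q$layout, file = "layout.RDS") # file name assumed; the original path was lost

# Estimation points to plot
tpSelect <- c(2, 10, 18)

# Colorblind-friendly scheme: recolor positive edges from dark green to dark blue
tvvar_obj$edgecolor[, , , ][tvvar_obj$edgecolor[, , , ] == "darkgreen"] <- c("darkblue")
# Solid lines for positive edges, dashed lines for all others
lty_array <- array(1, dim = c(12, 12, 1, 20))
lty_array[tvvar_obj$edgecolor[, , , ] != "darkblue"] <- 2

for(tp in tpSelect) {
  qgraph(t(tvvar_obj$wadj[, , 1, tp]),
         layout = Q$layout,
         edge.color = t(tvvar_obj$edgecolor[, , 1, tp]),
         labels = mood_labels,
         vsize = 13, esize = 10, asize = 10,
         mar = rep(5, 4),
         minimum = 0, maximum = .5,
         lty = t(lty_array[, , 1, tp]),
         pie = pred_obj$tverrors[[tp]][, 3])
}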
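# Fragment from inside the loop over the displayed parameters:
# par_row, cols, and i are defined by that loop; alpha() is available e.g. from the scales package
# Compute the 5% and 95% bootstrap quantiles at each estimation point
CIs <- apply(res_obj$bootParameters[par_row[1], par_row[2], 1, , ], 1, function(x) {
  quantile(x, probs = c(.05, .95))
})
# Draw the shaded band between the lower and upper quantiles
polygon(x = c(1:20, 20:1),
        y = c(CIs[1, ], rev(CIs[2, ])),
        col = alpha(colour = cols[i], alpha = .3),
        border = FALSE)
} # end of loop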
The figure shows a portion of the time-varying VAR parameters estimated above. The top row shows visualizations of the VAR parameters at the estimation points 8, 15, and 18. Solid blue arrows indicate positive relationships and dashed red arrows indicate negative relationships. The width of each arrow is proportional to the absolute value of the corresponding parameter.
If you have any questions, please comment below.