Original link: tecdat.cn/?p=9390

Original source: Tuo End Data Tribe official WeChat account

 


Introduction

A general drawback of vector autoregressive (VAR) models is that the number of estimated coefficients increases disproportionately with the number of lags, so that less information is available per parameter as more lags are added. In the Bayesian VAR literature, one approach to mitigating this so-called curse of dimensionality is stochastic search variable selection (SSVS), proposed by George et al. (2008). The basic idea of SSVS is to assign the commonly used prior variances to parameters that should be included in the model, and prior variances close to zero to irrelevant parameters. In this way the relevant parameters are estimated in the usual way, while the posterior draws of the irrelevant variables are close to zero, so that they have no significant effect on forecasts and impulse responses. This is achieved by adding a hierarchical prior to the model, in which the relevance of each variable is reassessed at every step of the sampling algorithm.
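Formally, SSVS amounts to a mixture ("spike-and-slab") prior on each coefficient. A sketch of the standard formulation (the notation below is generic and not taken from the original post):

$$
a_j \mid \gamma_j \sim (1 - \gamma_j)\, N(0, \tau_{0,j}^2) + \gamma_j\, N(0, \tau_{1,j}^2),
\qquad \gamma_j \sim \text{Bernoulli}(p_j),
$$

where $\tau_{0,j}$ is chosen close to zero (the coefficient is effectively excluded), $\tau_{1,j}$ is large (the coefficient is estimated essentially unrestricted), and $p_j$ is the prior inclusion probability. These quantities correspond to the objects tau0, tau1 and prob_prior in the code below; the indicators $\gamma_j$ are redrawn in every iteration of the Gibbs sampler, which is how the relevance of a variable is reassessed at each step.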

This post introduces the estimation of a Bayesian vector autoregressive (BVAR) model with SSVS. It uses dataset E1 from Lütkepohl (2007), which contains data on German fixed investment, disposable income and consumer spending from Q1 1960 to Q4 1982. Load and transform the data and generate the model matrices:


# Load the bvartools package, which provides gen_var() and the other functions used below
library(bvartools)

# Load and transform the data
data("e1")
e1 <- diff(log(e1))

# Generate the VAR model matrices
data <- gen_var(e1, p = 4, deterministic = "const")

# Get the data matrices (first 71 columns)
y <- data$Y[, 1:71]
x <- data$Z[, 1:71]

Estimation

The prior variances of the parameters are set according to the semi-automatic approach described by George et al. (2008). For all variables, the prior inclusion probability is set to 0.5. The prior for the error variance-covariance matrix is uninformative.

# Reset the random number generator for reproducibility
set.seed(1234567)

t <- ncol(y) # Number of observations
k <- nrow(y) # Number of endogenous variables
m <- k * nrow(x) # Number of estimated coefficients

# Coefficient priors
a_mu_prior <- matrix(0, m) # Vector of prior means

# SSVS prior (semi-automatic approach)
ols <- tcrossprod(y, x) %*% solve(tcrossprod(x)) # OLS estimates
sigma_ols <- tcrossprod(y - ols %*% x) / (t - nrow(x)) # OLS error covariance matrix
cov_ols <- kronecker(solve(tcrossprod(x)), sigma_ols)
se_ols <- matrix(sqrt(diag(cov_ols))) # OLS standard errors
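# Prior standard deviations of the semi-automatic approach of George et al. (2008):
# a small value if a variable is excluded and a large value if it is included.
# The multipliers 0.1 and 10 are the commonly used choices for this approach.
tau0 <- 0.1 * se_ols # Prior standard deviation for excluded variables
tau1 <- 10 * se_ols  # Prior standard deviation for included variables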
 

# Prior inclusion probabilities
prob_prior <- matrix(0.5, m)

# Error variance-covariance matrix
u_sigma_df_prior <- 0 # Prior degrees of freedom
u_sigma_scale_prior <- diag(0, k) # Prior scale matrix
u_sigma_df_post <- t + u_sigma_df_prior # Posterior degrees of freedom

The initial parameter values are set to zero, which means that all parameters can be estimated relatively freely in the first step of the Gibbs sampler.
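A sketch of this initialisation (the number of iterations, the burn-in length and the layout of the storage matrices are illustrative choices; draws_lambda stores the inclusion indicators drawn in each iteration):

# Initial values: all coefficients start at zero and get the wide "slab" prior variance,
# so that they are estimated relatively freely in the first step
a <- matrix(0, m)
a_v_i_prior <- diag(1 / c(tau1)^2, m) # Initial inverse prior covariance matrix

# Sampler settings (illustrative values)
burnin <- 5000       # Number of burn-in draws
iterations <- 15000  # Number of retained draws
draws <- burnin + iterations

# Storage for the posterior draws
draws_a <- matrix(NA, m, iterations)
draws_lambda <- matrix(NA, m, iterations) # Inclusion indicators
draws_sigma <- matrix(NA, k^2, iterations)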

SSVS can be added directly to a standard Gibbs sampler algorithm for the VAR model. In this example the constant terms are excluded from SSVS, which can be implemented by specifying include = 1:36. The output of a Gibbs sampler with SSVS can then be analysed in the usual way; point estimates, for example, can be obtained by computing the means of the parameter draws:
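A sketch of such a sampler under the priors defined above. It uses the post_normal() helper from the bvartools package to draw the coefficients, while the SSVS step itself is written out in base R; fixing the indicators of the constants (positions 37 to 39 of the coefficient vector) to one plays the role of the include = 1:36 argument mentioned above. The point estimates shown below are then the means of the stored draws.

# Gibbs sampler with SSVS
for (draw in 1:draws) {
  # Draw the inverse error variance-covariance matrix (Wishart posterior)
  u <- y - matrix(a, k) %*% x # Residuals
  u_sigma_scale_post <- solve(u_sigma_scale_prior + tcrossprod(u))
  u_sigma_i <- matrix(rWishart(1, u_sigma_df_post, u_sigma_scale_post)[, , 1], k)

  # Draw the coefficients from their conditional normal posterior
  a <- post_normal(y, x, u_sigma_i, a_mu_prior, a_v_i_prior)

  # SSVS step: draw the inclusion indicator of each coefficient and
  # update its prior variance accordingly
  u0 <- dnorm(c(a), 0, c(tau0)) * (1 - c(prob_prior)) # "Spike" component
  u1 <- dnorm(c(a), 0, c(tau1)) * c(prob_prior)       # "Slab" component
  lambda <- as.numeric(runif(m) < u1 / (u0 + u1))
  lambda[37:39] <- 1 # Constants are exempt from SSVS
  a_v_i_prior <- diag(1 / (lambda * c(tau1)^2 + (1 - lambda) * c(tau0)^2), m)

  # Store the draws after the burn-in phase
  if (draw > burnin) {
    draws_a[, draw - burnin] <- a
    draws_lambda[, draw - burnin] <- lambda
    draws_sigma[, draw - burnin] <- solve(u_sigma_i)
  }
}

# Point estimates: means of the coefficient draws, arranged with regressors in rows
A <- matrix(rowMeans(draws_a), k)
dimnames(A) <- list(rownames(y), rownames(x))
t(round(A, 3))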

##           invest income   cons
## invest.1  -0.102 -0.011 -0.002
## income.1   0.044  0.031  0.168
## cons.1     0.074  0.140  0.287
## invest.2  -0.013  0.002  0.004
## income.2   0.015  0.004  0.315
## cons.2     0.027 -0.001  0.006
## invest.3   0.033  0.000  0.000
## ...
## invest.4   0.250  0.001 -0.005
## income.4  -0.064 -0.010    ...
## cons.4    -0.023  0.001  0.000
## const      0.014  0.017  0.014

A posterior inclusion probability for each variable can also be obtained by computing the means of the inclusion indicator draws. As the output below shows, only a few variables appear to be relevant in the VAR(4) model. The constant terms have a probability of 100% because they were excluded from SSVS.
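A sketch of this calculation, using the inclusion indicators stored by the sampler above:

# Posterior inclusion probabilities: means of the inclusion indicator draws
lambda_post <- matrix(rowMeans(draws_lambda), k)
dimnames(lambda_post) <- list(rownames(y), rownames(x))
t(round(lambda_post, 2))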

##           invest income cons
## invest.1    0.43   0.23 0.10
## income.1    0.18    ... 0.67
## cons.1      0.11   0.40 0.77
## ...
## invest.3   0.019   0.07 0.06
## income.3    0.06   0.13 0.10
## cons.3      0.09   0.07 0.12
## invest.4    0.78   0.09 0.16
## income.4    0.13   0.09 0.18
## cons.4      0.09   0.07 0.06
## const       1.00   1.00 1.00

Given these values, the researcher can proceed in the usual way and obtain forecasts and impulse responses based on the output of the Gibbs sampler. The advantage of this approach is that it accounts not only for parameter uncertainty but also for model uncertainty. This can be illustrated by a histogram of the draws of the coefficient that describes the relationship between the first lag of income and the current value of consumption.

# Histogram of the draws of the coefficient of the first lag of income in the consumption equation
hist(draws_a[6, ])

 

Model uncertainty is reflected in the two peaks of the histogram, and parameter uncertainty in the dispersion of the draws around the right, non-zero peak.

However, if the researcher does not want to work with a model in which the relevance of a variable can change from one step of the sampling algorithm to the next, an alternative is to work only with a highly probable model. This can be done in a further simulation run in which very tight priors are used for the irrelevant variables and relatively uninformative priors for the relevant parameters.
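A sketch of how the priors for such a second run could be set up, based on the inclusion probabilities obtained above; the 50% threshold and the two variance values are illustrative choices:

# Select the variables with a posterior inclusion probability of at least 50%
include_var <- rowMeans(draws_lambda) >= 0.5

# Very tight prior variances for the remaining variables,
# relatively uninformative ones for the selected variables
diag_a_prior <- matrix(0.00001, m)
diag_a_prior[include_var] <- 9

# New inverse prior covariance matrix for a second run of the Gibbs sampler
a_v_i_prior <- diag(c(1 / diag_a_prior), m)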

The posterior means of the coefficient draws are then similar to the OLS estimates in Lütkepohl (2007, Section 5.2.10):

##           invest income   cons
## ...
## income.1   0.000  0.000  0.262
## cons.1     0.000  0.238 -0.334
## invest.2   0.000  0.000  0.001
## income.2   0.000  0.000  0.329
## cons.2     0.000  0.000  0.000
## invest.3   0.000  0.000  0.000
## ...
## invest.4   0.328  0.000 -0.001
## income.4   0.000  0.000  0.000
## cons.4     0.000  0.000  0.000
## const      0.015  0.015  0.014

Evaluation

The bvar() function can be used to collect the relevant output of the Gibbs sampler into a standardised object, which can then be used by further functions such as predict() to obtain forecasts or irf() for impulse response analysis.
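A sketch of how the draws could be collected, assuming the bvar() function of the bvartools package and the coefficient ordering used above (lag coefficients in positions 1 to 36, constants in positions 37 to 39):

bvar_est <- bvar(y = y, x = x,
                 A = draws_a[1:36, ],   # Draws of the lag coefficients
                 C = draws_a[37:39, ],  # Draws of the constants
                 Sigma = draws_sigma)   # Draws of the error covariance matrix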

 

bvar_est <- thin(bvar_est, thin = 5) # Thin the stored posterior draws

Prediction

Forecasts with credible bands can be obtained with the predict() function.
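A sketch of the forecast step; the horizon of 10 periods is an illustrative choice, and depending on the bvartools version, future values of the deterministic term may also need to be supplied:

bvar_pred <- predict(bvar_est, n.ahead = 10) # Forecasts with credible bands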

plot(bvar_pred)

 

Impulse response analysis
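The plot below can be produced with the irf() function of bvartools, for example for the response of consumption to an income impulse (the impulse, response, horizon and the orthogonalised type "oir" are illustrative choices):

OIR <- irf(bvar_est, impulse = "income", response = "cons", n.ahead = 8, type = "oir")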


plot(OIR) # Plot the orthogonalised impulse response


Most popular insights

1. Bayesian optimization for deep learning in Matlab

2. Implementation of a Bayesian hidden Markov model (HMM) in Matlab

3. Simple Bayesian linear regression simulation with Gibbs sampling in R

4. Block Gibbs sampling for Bayesian multiple linear regression in R

5. Bayesian models with MCMC sampling in Stan probabilistic programming from R

6. Bayesian linear regression with PyMC3 in Python

7. Spatial data analysis with Bayesian hierarchical models in R

8. Stochastic search variable selection (SSVS) estimation of Bayesian vector autoregression (BVAR) models in R
