Original link: tecdat.cn/?p=6193

Original source: Tecdat Data Tribe (拓端数据部落) official WeChat account

 

A copula is a function that couples a multivariate distribution function to its marginal distribution functions, usually just called the margins. Copulas are an excellent tool for modeling and simulating correlated random variables. Their main appeal is that they let you model the dependence structure and the margins (the distribution of each random variable) separately.
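In standard notation, this separation is the content of Sklar's theorem. The symbols H, C and F_i below are introduced here for illustration and do not appear in the original text:

```latex
H(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)
```

Here H is the joint distribution function, F_1, ..., F_d are the margins, and C is the copula, which is itself a distribution function on the unit cube with uniform margins.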

How copulas work

First, let's look at how a copula works. We start by drawing a sample from a three-dimensional multivariate normal distribution.

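library(MASS)  # provides mvrnorm()
set.seed(100)
m <- 3
n <- 2000
# sigma is the 3 x 3 covariance matrix of the sample and must be defined beforehand
z <- mvrnorm(n, mu = rep(0, m), Sigma = sigma, empirical = TRUE)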
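The covariance matrix passed as Sigma is not shown above. As a minimal sketch, the block below fills it in with a hypothetical correlation matrix (the values 0.4, 0.2 and -0.8 are assumptions chosen for illustration, not taken from the text) and checks that empirical = TRUE makes the sample reproduce that matrix.

```r
library(MASS)  # mvrnorm()

set.seed(100)
m <- 3      # number of variables
n <- 2000   # sample size

# Hypothetical correlation matrix (assumed values, for illustration only)
sigma <- matrix(c(1.0,  0.4,  0.2,
                  0.4,  1.0, -0.8,
                  0.2, -0.8,  1.0), nrow = 3)

# A valid covariance matrix must be positive definite
min_eig <- min(eigen(sigma, symmetric = TRUE)$values)

z <- mvrnorm(n, mu = rep(0, m), Sigma = sigma, empirical = TRUE)

# With empirical = TRUE the sample covariance matches sigma up to rounding
max_dev <- max(abs(cov(z) - sigma))
```

Without empirical = TRUE, sigma would only be the population covariance and the sample covariance would fluctuate around it.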

We use cor() and a scatterplot matrix to check the sample correlations.

library(psych)  # provides pairs.panels()
pairs.panels(z)
cor(z, method = "spearman")

          [,1]       [,2]      [,3]
[3,] 0.1937548 -0.7890814 1.0000000
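The variable u plotted next is not defined in the text above. Since each column of z is normal, a natural way to obtain it is the probability integral transform u <- pnorm(z), which is what this sketch assumes. It presumes the columns of z have zero mean and unit variance, and the matrix sigma is again a hypothetical placeholder.

```r
library(MASS)

set.seed(100)
sigma <- matrix(c(1.0,  0.4,  0.2,
                  0.4,  1.0, -0.8,
                  0.2, -0.8,  1.0), nrow = 3)  # hypothetical values
z <- mvrnorm(2000, mu = rep(0, 3), Sigma = sigma, empirical = TRUE)

# Probability integral transform: standard normal -> uniform on (0, 1)
u <- pnorm(z)

# pnorm() is strictly increasing, so ranks (and Spearman correlations) survive
same_ranks <- all.equal(cor(u, method = "spearman"),
                        cor(z, method = "spearman"))

# Each column of u should look uniform: mean near 1/2, variance near 1/12
col_means <- colMeans(u)
col_vars  <- apply(u, 2, var)
```

The margins of u are now uniform, but the dependence between the columns is inherited from z. That inherited dependence is the copula.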


pairs.panels(u)

This is the scatterplot matrix of the new random variables u.



We can also draw a 3D plot of the vectors in u.

 

Now, as the final step, we only need to choose the margins and apply them. I chose Gamma, Beta, and Student t margins, with the parameters specified below.

x1 <- qgamma(u[, 1], shape = 2, scale = 1)
x2 <- qbeta(u[, 2], 2, 2)
x3 <- qt(u[, 3], df = 5)

Below is a 3D plot of our simulated data.

 

df <- cbind(x1, x2, x3)
pairs.panels(df)
cor(df, method = "spearman")

          x1         x2         x3
x1 1.0000000  0.3812244  0.1937548
x2 0.3812244  1.0000000 -0.7890814
x3 0.1937548 -0.7890814  1.0000000

Here is the scatterplot matrix of the random variables:

 

Using the copula package

Let's replicate the procedure above using the copula package.

Once the dependence structure has been specified through the copula (a normal copula here) and the margins have been set, the mvdc() function builds the desired distribution. We can then use the rMvdc() function to generate a random sample.
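The code for this step is missing above. A minimal sketch with the copula package might look like the following. The correlation parameters 0.4, 0.2 and -0.8 and the object names myCop and myMvd are assumptions for illustration, while the margins reuse the Gamma, Beta and Student t parameters from the manual procedure.

```r
library(copula)

set.seed(100)

# Normal (Gaussian) copula in 3 dimensions with an unstructured correlation
# matrix; param lists the pairs (1,2), (1,3), (2,3). Values are assumptions.
myCop <- normalCopula(param = c(0.4, 0.2, -0.8), dim = 3, dispstr = "un")

# Attach the margins: Gamma(shape = 2, scale = 1), Beta(2, 2), Student t(5)
myMvd <- mvdc(copula = myCop,
              margins = c("gamma", "beta", "t"),
              paramMargins = list(list(shape = 2, scale = 1),
                                  list(shape1 = 2, shape2 = 2),
                                  list(df = 5)))

# Draw 2000 observations from the resulting multivariate distribution
Z2 <- rMvdc(2000, myMvd)
```

Each entry of margins is the suffix of the usual d/p/q/r function family (dgamma, pgamma, ...), and each entry of paramMargins names the arguments of those functions.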

colnames(Z2) <- c("x1", "x2", "x3")
pairs.panels(Z2)

The simulated data is of course very close to the previous data, as shown in the scatterplot matrix below:

 

Simple application example

Now for a real-world example. We will fit a copula to the returns of two stocks and then use it to simulate them.

Let's load the returns into R:

cree <- read.csv('cree_r.csv', header = F)$V2
yahoo <- read.csv('yahoo_r.csv', header = F)$V2

Before jumping straight into fitting the copula, let's examine the correlation between the two stock returns and plot the regression line:
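The code for this step is not shown. A minimal sketch could look like the block below. The CSV files are not available here, so simulated stand-in returns are used. They are hypothetical and only make the example runnable.

```r
set.seed(1)

# Stand-in data: the real series would come from cree_r.csv and yahoo_r.csv.
# These simulated returns are hypothetical.
n <- 500
common <- rnorm(n, sd = 0.02)              # shared market component
cree   <- common + rnorm(n, sd = 0.02)
yahoo  <- common + rnorm(n, sd = 0.02)

# Linear and rank correlation between the two return series
pearson  <- cor(cree, yahoo, method = "pearson")
spearman <- cor(cree, yahoo, method = "spearman")

# Scatter plot with the fitted regression line
reg <- lm(yahoo ~ cree)
plot(cree, yahoo, pch = ".", main = "Returns (simulated stand-in)")
abline(reg, col = "red", lwd = 2)
```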

We can see the positive correlation:

 

In the first example above, I chose a normal copula, but when applying these models to real data you should think carefully about which family suits the data best. For example, some copulas are better suited to modeling asymmetric dependence, others emphasize tail dependence, and so on. For stock returns, my guess is that a t-copula should be fine, but a guess is certainly not enough. Fortunately, copula selection can be automated with a function that compares candidate families by AIC and BIC:

u <- pobs(as.matrix(cbind(cree, yahoo)))[, 1]
selectedCopula

$par
[1] 0.4356302

$par2
[1] 3.844534
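The call that produced selectedCopula is not shown above. One way to sketch the same idea with the copula package alone is to fit several candidate families to the pseudo-observations and compare their AIC and BIC. This is a hand-rolled comparison, not necessarily the function used in the original analysis, and the data are a simulated stand-in (a t-copula with assumed parameters) because the return files are not available here.

```r
library(copula)

set.seed(500)

# Stand-in for the two return series: a sample with t-copula dependence.
# The parameters 0.44 and 4 are assumptions for illustration.
x <- rCopula(2000, tCopula(0.44, dim = 2, df = 4))

# Pseudo-observations: ranks rescaled to (0, 1)
m <- pobs(x)

# Candidate families, fitted by maximum pseudo-likelihood
candidates <- list(normal = normalCopula(dim = 2),
                   t      = tCopula(dim = 2),
                   gumbel = gumbelCopula(dim = 2))
fits <- lapply(candidates, function(cop) fitCopula(cop, m, method = "mpl"))

# Lower AIC / BIC means a better trade-off between fit and complexity
aic <- sapply(fits, AIC)
bic <- sapply(fits, BIC)
best <- names(which.min(aic))

coef(fits$t)  # estimates named rho.1 and df
```

The candidate list is easy to extend with other families, for example claytonCopula() or frankCopula().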

The selection algorithm does pick the t-copula and estimates its parameters for us. Let's fit the proposed model ourselves and check the parameter estimates.

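library(copula)  # pobs(), tCopula(), fitCopula()
set.seed(500)
m <- pobs(as.matrix(cbind(cree, yahoo)))
fit <- fitCopula(tCopula(dim = 2), m)  # fit a bivariate t-copula to the pseudo-observations
coef(fit)

  rho.1      df
0.43563 3.84453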

Let's look at the density of the copula we just estimated:

rho <- coef(fit)[1]
df <- coef(fit)[2]
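The plotting call itself is not shown. Assuming the copula package, the estimated density can be drawn with persp() and dCopula. The sketch plugs in the estimates reported by coef(fit) (0.43563 and 3.84453) so that it runs without the fit object. The name tc is introduced here for illustration.

```r
library(copula)

# Estimates reported by coef(fit)
rho <- 0.43563
df  <- 3.84453

tc <- tCopula(rho, dim = 2, df = df)

# Perspective plot of the copula density over the unit square
persp(tc, dCopula)

# The density piles up in the joint tails: compare a corner on the diagonal
# with a corner off the diagonal
d_same <- dCopula(c(0.02, 0.02), tc)
d_opp  <- dCopula(c(0.02, 0.98), tc)
```

With a positive rho, the corners where both variables are extreme in the same direction carry the most mass.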

Now we just need to build the copula and draw 3,965 random samples from it.

u <- rCopula(3965, tCopula(rho, dim = 2, df = df))
cor(u, method = "spearman")

          [,1]      [,2]
[1,] 1.0000000 0.3972454
[2,] 0.3972454 1.0000000

Here is a plot of the resulting sample:

 

The t-copula is usually a good fit for phenomena with strong dependence in the extreme values (the tails of the distribution). Now we face the harder problem of modeling the margins. For simplicity, we will assume they are normally distributed, so we only need to estimate the mean and standard deviation of each margin.
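As a minimal sketch of this step for one margin, the block below estimates the two parameters and overlays the fitted normal density on a histogram. The series is a simulated stand-in with assumed mean and standard deviation, since the return files are not available here, and the names cree_mu and cree_sd are hypothetical.

```r
set.seed(1)

# Hypothetical stand-in for one return series (the real data are in the CSVs)
cree <- rnorm(1000, mean = 0.0005, sd = 0.03)

# Normal margin: the parameters are the sample mean and standard deviation
cree_mu <- mean(cree)
cree_sd <- sd(cree)

# Histogram of the returns with the fitted normal density on top
hist(cree, breaks = 80, freq = FALSE, main = "Stand-in returns", xlab = "return")
curve(dnorm(x, mean = cree_mu, sd = cree_sd), add = TRUE, col = "red", lwd = 2)
```

Real return series tend to have heavier tails than the normal distribution, so the overlay is also a quick visual check of how rough the simplification is.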

The histogram appears as follows:

Now we combine the copula with the margins to obtain simulated observations from the resulting multivariate distribution. Finally, we compare the simulation results with the original data.
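A minimal sketch of this step with mvdc() and rMvdc() follows. The copula parameters are the estimates reported by coef(fit). The margin means and standard deviations are hypothetical values standing in for the ones estimated from the data, and the names copula_dist and sim are introduced here for illustration.

```r
library(copula)

set.seed(4258)

# Copula parameters: the estimates reported by coef(fit)
rho <- 0.43563
df  <- 3.84453

# Margin parameters: hypothetical values standing in for the sample
# means and standard deviations of the two return series
cree_mu  <- 0.0005; cree_sd  <- 0.03
yahoo_mu <- 0.0003; yahoo_sd <- 0.02

# t-copula for the dependence, normal distributions for both margins
copula_dist <- mvdc(copula = tCopula(rho, dim = 2, df = df),
                    margins = c("norm", "norm"),
                    paramMargins = list(list(mean = cree_mu,  sd = cree_sd),
                                        list(mean = yahoo_mu, sd = yahoo_sd)))

# Simulate 3,965 observations, the same number drawn from the copula earlier
sim <- rMvdc(3965, copula_dist)

plot(sim[, 1], sim[, 2], pch = ".", xlab = "cree (simulated)",
     ylab = "yahoo (simulated)")
```

Plotting the observed returns on the same axes, for example with points() in a different color, gives the comparison described above.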

This is the final scatter plot of the data, assuming normal margins and a t-copula for the dependence structure:

 

As you can see, the t-copula produces results that are close to the actual observations.

Let’s try df=1 and df=8:

Clearly, the parameter df plays an important role in determining the shape of the distribution. As df increases, the t-copula converges to the normal copula.
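One way to quantify this is the tail dependence coefficient of a bivariate t-copula, which has the closed form 2 * t_{df+1}(-sqrt((df + 1)(1 - rho) / (1 + rho))), where t_{df+1} is the Student t distribution function with df + 1 degrees of freedom. This is a standard result rather than something stated in the original text. The sketch evaluates it in base R for the rho reported by coef(fit). The helper name t_tail_dep is hypothetical.

```r
# Tail dependence coefficient of a bivariate t-copula with correlation rho
# and nu degrees of freedom (hypothetical helper, base R only)
t_tail_dep <- function(rho, nu) {
  2 * pt(-sqrt((nu + 1) * (1 - rho) / (1 + rho)), df = nu + 1)
}

rho <- 0.43563                     # estimate reported by coef(fit)
dfs <- c(1, 3.84453, 8, 30, 1000)  # includes df = 1 and df = 8

lambda <- sapply(dfs, function(nu) t_tail_dep(rho, nu))
names(lambda) <- dfs

# lambda shrinks towards 0, the value for the normal copula, as df grows
round(lambda, 4)
```

A small df therefore means joint extreme moves are much more likely than under a normal copula with the same rho.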

Thank you very much for reading this article, please leave a comment below if you have any questions!

