Original link: tecdat.cn/?p=21425

Original source: Tuoduan Data Tribe (拓端数据部落) official account

 

Extreme value theory focuses on the tail of a loss distribution and is typically used to analyze rare events. Even when the overall distribution is unknown, it can describe the behavior of extreme values from a small sample and extrapolate beyond the range of the observed data. Models based on the generalized Pareto distribution (GPD) therefore make more effective use of limited catastrophe loss data, and have become a mainstream tool of extreme value theory.

Catastrophe losses are low in frequency, high in severity, heavy-tailed and scarce in data. In view of this, a GPD model is used here for the statistical modeling of fire loss data, and its shape and scale parameters are estimated. Model checks indicate that the GPD fits the heavy tail of catastrophe risk well, which provides a basis for catastrophe risk estimation and for the pricing of catastrophe bonds.

Fire loss data

The data used in this article were collected at reinsurance companies and comprise 2,167 fire losses between 1980 and 1990. The losses have been adjusted for inflation. Each total claim is broken down into loss of buildings and loss of profits.

base1=read.table( "dataunivar.txt",
 header=TRUE)
base2=read.table( "datamultiva.txt",
 header=TRUE)

Consider the first data set (so far, we have only dealt with univariate extremes). Here X denotes the vector of loss amounts and D the corresponding dates,


> D=as.Date(as.character(base1$Date),"%m/%d/%Y")
> plot(D,X,type="h")

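(Figure: fire losses plotted against claim date.)

A natural next step is to plot the log of the empirical survival probability against the log of the ordered losses. With Xs the losses sorted in increasing order and n the sample size,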


> plot(log(Xs),log((n:1)/(n+1)))
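To see why this plot is informative, a minimal sketch on simulated data may help. It assumes an exact Pareto sample with tail index `alpha_true` (a hypothetical stand-in for the fire losses, which are not reproduced here). For such data the points should fall close to a line with slope `-alpha_true`.

```r
# Simulated stand-in for the loss data: exact Pareto with tail index alpha_true
set.seed(123)
n <- 2000
alpha_true <- 1.5
X <- runif(n)^(-1 / alpha_true)     # P(X > x) = x^(-alpha_true) for x >= 1

Xs <- sort(X)                       # order statistics, increasing
surv <- (n:1) / (n + 1)             # empirical survival probabilities

# Log-log plot of the empirical survival function
plot(log(Xs), log(surv), xlab = "log(x)", ylab = "log P(X > x)")
abline(a = 0, b = -alpha_true, col = "red")   # theoretical line for this sample
```

On real losses only the right-hand part of the plot is expected to look linear, since the power-law behavior is a property of the tail.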

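Linear regression

The points lie approximately on a straight line. Its slope can be estimated by linear regression, using either the whole sample or only the largest observations (below, roughly the top 500 and the top 100). Here B is a data frame holding the two plotted coordinates as X and Y,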


lm(Y ~ X, data = B)
lm(Y ~ X, data = B[(n - 500):n, ])
lm(Y ~ X, data = B[(n - 100):n, ])
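The regressions above can be sketched end to end on simulated data. The Pareto sample, `alpha_true` and the helper `tail_slope` are assumptions introduced for illustration only; the data frame `B` is built the way the plot coordinates suggest.

```r
# Simulated stand-in for the loss data: exact Pareto with tail index alpha_true
set.seed(123)
n <- 2000
alpha_true <- 1.5
X <- runif(n)^(-1 / alpha_true)
Xs <- sort(X)

# Coordinates of the log-log plot
B <- data.frame(X = log(Xs), Y = log((n:1) / (n + 1)))

# Least-squares slope fitted on the k largest observations (hypothetical helper)
tail_slope <- function(k) {
  fit <- lm(Y ~ X, data = B[(n - k + 1):n, ])
  unname(coef(fit)["X"])
}

slopes <- sapply(c(n, 500, 100), tail_slope)
names(slopes) <- c("all", "top 500", "top 100")
print(-slopes)   # each value is an estimate of the tail index
```

Using fewer observations focuses on the tail but makes the estimate noisier, which is the usual trade-off when choosing how many upper order statistics to keep.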

Heavy-tailed distributions

The slope is related to the tail index of the distribution. Consider a heavy-tailed distribution, that is, one whose survival function decays roughly like a power function.

Since the natural estimators of quantiles are order statistics, the slope of the line is the negative of the tail index. It is estimated by least squares, using only the largest observations.
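In standard notation (the symbols below are introduced here and are not taken from the original text), a heavy-tailed survival function and the resulting log-log relationship can be written as:

```latex
\begin{align*}
\bar F(x) &= \mathbb{P}(X > x) = x^{-\alpha}\, L(x), \qquad \alpha > 0,
  \quad L \text{ slowly varying}, \\
\log \bar F(x) &= -\alpha \log x + \log L(x)
  \;\approx\; -\alpha \log x + c \quad \text{for large } x, \\
\text{plotted points:}\quad &
  \left( \log X_{i:n},\; \log \frac{n - i + 1}{n + 1} \right),
  \qquad i = 1, \dots, n,
\end{align*}
```

where $X_{1:n} \le \dots \le X_{n:n}$ are the order statistics. The second line is why the upper part of the plot is roughly linear with slope $-\alpha$.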

 

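Hill estimator

The Hill estimator is obtained by assuming that the denominator of the slope estimate above is approximately 1.

Under suitable assumptions the estimator is consistent. Under a further assumption on the rate of convergence, it is asymptotically normal, with a standard deviation of about xi divided by the square root of the number of upper order statistics used.

Based on this limiting distribution, an asymptotic confidence interval can be obtained. With logXs the log-losses sorted in decreasing order,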

> xi = 1/(1:n) * cumsum(logXs) - logXs
> xise = 1.96/sqrt(1:n) * xi
> polygon(c(1:n, n:1), c(xi + xise, rev(xi - xise)),
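A self-contained sketch of the Hill plot on simulated data follows. The Pareto sample and `xi_true` are assumptions for illustration. The sketch uses the textbook form of the estimator, which subtracts the (k+1)-th largest log-loss, whereas the snippet above subtracts the k-th largest; for large k the difference is small.

```r
# Simulated stand-in for the loss data: Pareto with xi_true = 1 / alpha
set.seed(42)
n <- 5000
xi_true <- 0.5
X <- runif(n)^(-xi_true)

logXs <- sort(log(X), decreasing = TRUE)   # log-losses, largest first
k <- 1:(n - 1)                             # number of upper order statistics

# Hill estimate for every k: mean of the k largest log-losses
# minus the (k+1)-th largest log-loss
xi_hill <- cumsum(logXs)[k] / k - logXs[k + 1]
half_width <- 1.96 * xi_hill / sqrt(k)     # asymptotic 95% half-width

plot(k, xi_hill, type = "l", ylim = c(0, 1),
     xlab = "k", ylab = "Hill estimate of xi")
lines(k, xi_hill + half_width, lty = 2)
lines(k, xi_hill - half_width, lty = 2)
abline(h = xi_true, col = "red")           # true value for the simulated data
```

For an exact Pareto sample the curve is stable over almost all k. On real data it typically drifts as k grows, because observations from the body of the distribution start to bias the estimate.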

Delta method

Similarly (again under additional assumptions on the rate of convergence), the estimator of the tail index alpha = 1/xi is asymptotically normal. This result is obtained using the delta method. Again, it can be used to get (asymptotic) confidence intervals

> alphase = 1.96/sqrt(1:n)/xi
> polygon(c(1:n, n:1), c(alpha + alphase, rev(alpha - alphase)),
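The delta-method step behind this interval can be written out as follows, in standard notation that is introduced here: $\hat\xi_k$ is the Hill estimate based on the $k$ largest observations and $g(\xi) = 1/\xi$.

```latex
\begin{align*}
\sqrt{k}\,\bigl(\hat\xi_k - \xi\bigr)
  &\xrightarrow{d} \mathcal{N}\bigl(0,\ \xi^2\bigr), \\
\hat\alpha_k = g(\hat\xi_k) = \frac{1}{\hat\xi_k},
  \qquad g'(\xi) &= -\frac{1}{\xi^2}, \\
\sqrt{k}\,\bigl(\hat\alpha_k - \alpha\bigr)
  &\xrightarrow{d} \mathcal{N}\bigl(0,\ g'(\xi)^2\,\xi^2\bigr)
  = \mathcal{N}\Bigl(0,\ \frac{1}{\xi^2}\Bigr)
  = \mathcal{N}\bigl(0,\ \alpha^2\bigr), \\
\text{95\% interval:}\quad
  \hat\alpha_k &\pm \frac{1.96}{\sqrt{k}\,\hat\xi_k}.
\end{align*}
```

The last line matches the half-width `1.96/sqrt(1:n)/xi` used in the code.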

 

Dekkers-Einmahl-de Haan estimator

Under a similar condition on the rate of convergence, this estimator is also asymptotically normal.
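As a rough sketch, the moment estimator of Dekkers, Einmahl and de Haan can be computed from the first two empirical moments of the log-excesses. The simulated Pareto sample, `xi_true`, the choice `k = 500` and the helper name `moment_estimator` are assumptions for illustration. The standard error uses the asymptotic variance 1 + xi^2, which applies when xi is non-negative.

```r
# Simulated stand-in for the loss data: Pareto with xi_true = 1 / alpha
set.seed(42)
n <- 5000
xi_true <- 0.5
X <- runif(n)^(-xi_true)

logXs <- sort(log(X), decreasing = TRUE)   # log-losses, largest first

# Moment estimator based on the k largest observations (hypothetical helper)
moment_estimator <- function(k) {
  excess <- logXs[1:k] - logXs[k + 1]      # log-excesses over the (k+1)-th largest
  M1 <- mean(excess)                       # first moment (the Hill estimate)
  M2 <- mean(excess^2)                     # second moment
  M1 + 1 - 0.5 / (1 - M1^2 / M2)
}

k <- 500
xi_hat <- moment_estimator(k)
se <- sqrt(1 + xi_hat^2) / sqrt(k)         # asymptotic standard error, xi >= 0
cat("estimate:", xi_hat, " 95% CI:", xi_hat - 1.96 * se, xi_hat + 1.96 * se, "\n")
```

Unlike the Hill estimator, this estimator is also defined for light-tailed and bounded distributions, at the price of a larger variance when the tail is heavy.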

 

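Pickands estimator

 

The Pickands estimator is built from three upper order statistics. It is also asymptotically normal, which again gives confidence intervals.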
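In standard notation (introduced here), with $X_{1:n} \le \dots \le X_{n:n}$ the order statistics and $k \le n/4$, the Pickands estimator and its limiting distribution are usually written as:

```latex
\begin{align*}
\hat\xi_k^{\,\mathrm{Pickands}}
  &= \frac{1}{\log 2}\,
     \log\!\left(
       \frac{X_{n-k+1:n} - X_{n-2k+1:n}}{X_{n-2k+1:n} - X_{n-4k+1:n}}
     \right), \\
\sqrt{k}\,\bigl(\hat\xi_k^{\,\mathrm{Pickands}} - \xi\bigr)
  &\xrightarrow{d}
  \mathcal{N}\!\left(0,\
    \frac{\xi^2\,\bigl(2^{2\xi+1} + 1\bigr)}
         {\bigl(2\,(2^{\xi} - 1)\,\log 2\bigr)^2}
  \right).
\end{align*}
```

The limit holds under conditions on the rate of convergence, as for the previous estimators.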

 

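In R (here Xs must be sorted in decreasing order),

> xi = 1/log(2) * log((Xs[seq(1, length = trunc(n/4), by = 1)] -
+   Xs[seq(2, length = trunc(n/4), by = 2)]) /
+   # denominator reconstructed from the Pickands definition
+   (Xs[seq(2, length = trunc(n/4), by = 2)] -
+   Xs[seq(4, length = trunc(n/4), by = 4)]))
> # asymptotic variance uses 2^(2 * xi + 1), as in the Pickands limit
> xise = 1.96/sqrt(seq(1, length = trunc(n/4), by = 1)) *
+   sqrt(xi^2 * (2^(2 * xi + 1) + 1)/((2 * (2^xi - 1) * log(2))^2))
> polygon(c(seq(1, length = trunc(n/4), by = 1), rev(seq(1,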
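The same computation can be sketched in a self-contained form on simulated data. The Pareto sample and `xi_true` are assumptions for illustration, and the index vector `k` replaces the `seq(...)` calls above.

```r
# Simulated stand-in for the loss data: Pareto with xi_true = 1 / alpha
set.seed(42)
n <- 8000
xi_true <- 0.5
X <- runif(n)^(-xi_true)

Xs <- sort(X, decreasing = TRUE)           # losses, largest first
k <- 1:trunc(n / 4)                        # 4 * k must not exceed n

# Pickands estimate for every k, from the k-th, 2k-th and 4k-th largest values
xi_pick <- log((Xs[k] - Xs[2 * k]) / (Xs[2 * k] - Xs[4 * k])) / log(2)

# Asymptotic standard error
se <- sqrt(xi_pick^2 * (2^(2 * xi_pick + 1) + 1) /
             (2 * (2^xi_pick - 1) * log(2))^2) / sqrt(k)

plot(k, xi_pick, type = "l", ylim = c(-1, 2),
     xlab = "k", ylab = "Pickands estimate of xi")
lines(k, xi_pick + 1.96 * se, lty = 2)
lines(k, xi_pick - 1.96 * se, lty = 2)
abline(h = xi_true, col = "red")           # true value for the simulated data
```

The Pickands estimate is much more volatile than the Hill estimate for small k, which is visible in the width of the band.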

 

Fitting the GPD

Maximum likelihood can also be used to fit a GPD to the exceedances over a high threshold (here with the gpd function from the evir package).

> gpd(X, 5)
$n
[1] 2167

$threshold
[1] 5

$p.less.thresh
[1] 0.8827873

$n.exceed
[1] 254

$method
[1] "ml"

$par.ests
xi beta

$par.ses
       xi      beta
0.1117143 0.4637270

$varcov
            [,1]        [,2]
[1,]  0.01248007 -0.03203283
[2,] -0.03203283

$information
[1] "observed"

$converged
[1] 0

$nllh.final
[1] 754.1115

attr(,"class")
[1] "gpd"

Or, equivalently (with gpd.fit from the ismev package),

> gpd.fit(X, 5)
$threshold
[1] 5

$nexc
[1] 254

$conv
[1] 0

$nllh
[1] 754.1115

$mle
[1] 3.8078632 0.6315749

$rate
[1] 0.1172127

$se
[1] 0.4636270 0.1116136
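To make the fitting step concrete, here is a minimal base-R sketch of GPD maximum likelihood. It is not the implementation used by either package. The simulated exceedances, `xi_true`, `beta_true` and the function `gpd_nll` are assumptions for illustration.

```r
# Simulated exceedances over a threshold: GPD with shape xi_true, scale beta_true
set.seed(7)
n_exc <- 2000
xi_true <- 0.5
beta_true <- 2
y <- beta_true / xi_true * (runif(n_exc)^(-xi_true) - 1)   # inverse-CDF sampling

# Negative log-likelihood; par = c(xi, log(beta)) keeps the scale positive
gpd_nll <- function(par, y) {
  xi <- par[1]
  beta <- exp(par[2])
  z <- 1 + xi * y / beta
  # Return a large penalty outside the support (the xi = 0 case is not handled)
  if (abs(xi) < 1e-8 || any(z <= 0)) return(1e10)
  length(y) * log(beta) + (1 + 1 / xi) * sum(log(z))
}

# Nelder-Mead search from a rough starting point
fit <- optim(c(0.1, log(mean(y))), gpd_nll, y = y)
est <- c(xi = fit$par[1], beta = exp(fit$par[2]))
print(est)
```

In practice the exceedances would be `X[X > u] - u` for a chosen threshold `u`, and the standard errors would come from the inverse of the observed information matrix.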

The profile likelihood of the tail index can also be visualized,

> gpd.prof

 

or

> gpd.prof
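What a profile likelihood plot computes can be sketched in base R. For each fixed value of the shape parameter, the likelihood is maximized over the scale parameter, and a 95% interval is read off where the profile drops by half the chi-square quantile. The simulated exceedances, the grid, the search interval for the scale and the function names are assumptions for illustration, and only positive shape values are considered.

```r
# Simulated exceedances: GPD with shape xi_true and scale beta_true
set.seed(7)
xi_true <- 0.5
beta_true <- 2
y <- beta_true / xi_true * (runif(500)^(-xi_true) - 1)

# Negative log-likelihood of the GPD, for xi > 0 and beta > 0
nll <- function(xi, beta) {
  length(y) * log(beta) + (1 + 1 / xi) * sum(log(1 + xi * y / beta))
}

# Profile: minimise over the scale for each fixed shape
profile_nll <- function(xi) {
  optimize(function(b) nll(xi, b), interval = c(0.01, 100))$objective
}

xi_grid <- seq(0.05, 1.5, by = 0.01)
prof <- sapply(xi_grid, profile_nll)

# Likelihood-ratio cutoff for an approximate 95% confidence interval
cutoff <- min(prof) + qchisq(0.95, df = 1) / 2
ci <- range(xi_grid[prof <= cutoff])

plot(xi_grid, -prof, type = "l", xlab = "xi", ylab = "profile log-likelihood")
abline(h = -cutoff, lty = 2)
abline(v = ci, col = "red")
```

Profile-likelihood intervals are generally asymmetric around the estimate, which suits the shape parameter better than the symmetric normal approximation.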

 

The maximum likelihood estimate of the tail index can then be plotted as a function of the threshold (with confidence intervals),

Vectorize(function(u){gpd(X, u)$par.ests[1]})
plot(u, XI, ylim = c(0, 2))
segments(u, XI - 1.96 * XIS, u, XI +

 

Finally, the block maxima technique can be used.

> gev.fit
$conv
[1] 0

$nllh
[1] 3392.418

$mle
[1] 1.4833484 0.5930190 0.9168128

$se
[1] 0.01507776 0.01866719 0.03035380

The estimate of the tail exponent is the last coefficient here.
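The block maxima approach can be sketched in base R as follows: split the sample into blocks, keep the maximum of each block, and fit a generalized extreme value (GEV) distribution by maximum likelihood. The simulated data, the block size, the starting values and the function `gev_nll` are assumptions for illustration, not the package's implementation.

```r
# Simulated stand-in for the loss data: Pareto with xi_true = 1 / alpha
set.seed(11)
xi_true <- 0.5
block_size <- 50
n_blocks <- 400
X <- runif(block_size * n_blocks)^(-xi_true)

# Maximum of each consecutive block (one block per matrix column)
maxima <- apply(matrix(X, nrow = block_size), 2, max)

# GEV negative log-likelihood; par = c(mu, log(sigma), xi)
gev_nll <- function(par, maxima) {
  mu <- par[1]
  sigma <- exp(par[2])
  xi <- par[3]
  z <- 1 + xi * (maxima - mu) / sigma
  # Return a large penalty outside the support (the xi = 0 case is not handled)
  if (abs(xi) < 1e-8 || any(z <= 0)) return(1e10)
  length(maxima) * log(sigma) + (1 + 1 / xi) * sum(log(z)) + sum(z^(-1 / xi))
}

# Robust starting values, then Nelder-Mead search
start <- c(median(maxima), log(IQR(maxima)), 0.1)
fit <- optim(start, gev_nll, maxima = maxima, control = list(maxit = 2000))
est <- c(mu = fit$par[1], sigma = exp(fit$par[2]), xi = fit$par[3])
print(est)   # location, scale, shape: the shape is the last coefficient
```

The coefficients are ordered as location, scale and shape, so the tail parameter is the last one.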

