Original link: tecdat.cn/?p=22566

Original source: Tuoduan Data Tribe public account

 

This article is about extreme value inference. We use the maximum likelihood method on the generalized Pareto distribution.

  • Maximum likelihood estimation

In the context of parametric models, the standard technique is to consider the maximum of the likelihood (or of the log-likelihood). Under some technical assumptions (essentially, regularity of the model in some neighborhood of the true parameter $\theta$), we have

$$\sqrt{n}\,\bigl(\widehat{\theta}_n-\theta\bigr)\xrightarrow{\;d\;}\mathcal{N}\bigl(0,\,I(\theta)^{-1}\bigr),$$

where $I(\theta)$ denotes the Fisher information matrix. Consider here a sample from the generalized Pareto distribution with parameters $\theta=(\xi,\sigma)$, so that the density is

$$g_{\xi,\sigma}(x)=\frac{1}{\sigma}\left(1+\frac{\xi\,x}{\sigma}\right)^{-\frac{1}{\xi}-1},\qquad x\geq 0.$$

 

If we solve the first-order conditions of the maximum likelihood problem, we obtain an estimator $\widehat{\theta}_n=(\widehat{\xi}_n,\widehat{\sigma}_n)$ that satisfies

$$\sqrt{n}\left(\begin{pmatrix}\widehat{\xi}_n\\ \widehat{\sigma}_n\end{pmatrix}-\begin{pmatrix}\xi\\ \sigma\end{pmatrix}\right)\xrightarrow{\;d\;}\mathcal{N}\left(0,\;(1+\xi)\begin{pmatrix}1+\xi & -\sigma\\ -\sigma & 2\sigma^{2}\end{pmatrix}\right),\qquad \xi>-\tfrac{1}{2}.$$

This asymptotic normality can be interpreted as follows: if the true distribution of the sample is a GPD with parameters $(\xi,\sigma)$ and $n$ is large enough, then $(\widehat{\xi}_n,\widehat{\sigma}_n)$ has approximately a joint normal distribution. Therefore, if we generate a large number of samples (each large enough, say 200 observations), the scatter plot of the estimators should look like a Gaussian scatter plot.


> library(evir); library(MASS)
> n <- 200; param <- matrix(NA, 1000, 2)
> for(s in 1:1000){
+   x <- rgpd(n, xi = .2, beta = 1)      # simulated GPD sample (illustrative parameters)
+   param[s, ] <- gpd(x, 0)$par.ests }   # MLE of (xi, sigma)
> kd <- kde2d(param[,1], param[,2])      # 2D kernel density of the estimates
> image(kd$x, kd$y, kd$z)

I get a 3D representation


> persp(kd$x, kd$y, kd$z,
+       xlab = "xi", ylab = "sigma")

 

 

With 200 observations, if the true underlying distribution is a GPD, then the joint distribution of $(\widehat{\xi}_n,\widehat{\sigma}_n)$ is approximately normal.

  • The delta method

Another important property is the delta method. The idea is that if $\widehat{\theta}_n$ is asymptotically normal and $h$ is a smooth enough transformation, then $h(\widehat{\theta}_n)$ is also asymptotically Gaussian.
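More precisely (a standard formulation, written here to make the link with the simulations explicit; $\Sigma=I(\theta)^{-1}$ is the asymptotic covariance matrix above),

$$\sqrt{n}\,\bigl(h(\widehat{\theta}_n)-h(\theta)\bigr)\xrightarrow{\;d\;}\mathcal{N}\bigl(0,\;\nabla h(\theta)^{\top}\,\Sigma\,\nabla h(\theta)\bigr).$$

The transformations used in the code below are the GPD quantile (VaR) at level $p=99\%$ and the associated expected shortfall (TVaR),

$$Q(p)=\frac{\sigma}{\xi}\Bigl((1-p)^{-\xi}-1\Bigr),\qquad \mathrm{TVaR}(p)=Q(p)+\frac{\sigma+\xi\,Q(p)}{1-\xi}.$$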

From this property, we can derive the asymptotic normality of these transformed quantities, or of other parameterizations used in extreme value models. Let's run some simulations to check this joint normality again.


> library(mnormt)
> VQ <- matrix(NA, 1000, 2)
> for(s in 1:1000){
+   est  <- gpd(rgpd(n, xi = .2, beta = 1), 0)$par.ests   # n = 200, as above
+   xih  <- est["xi"]; sha <- est["beta"]
+   q    <- sha * (.01^(-xih) - 1)/xih        # 99% quantile (VaR)
+   tvar <- q + (sha + xih * q)/(1 - xih)     # expected shortfall (TVaR)
+   VQ[s, ] <- c(q, tvar) }
> m <- apply(VQ, 2, mean); S <- var(VQ)
> vx <- seq(min(VQ[,1]), max(VQ[,1]), length = 101)
> vy <- seq(min(VQ[,2]), max(VQ[,2]), length = 101)
> z <- outer(vx, vy, function(x, y) dmnorm(cbind(x, y), m, S))   # fitted Gaussian density
> image(vx, vy, z)

As we can see, this asymptotic result cannot really be used with a sample of size 200: it looks like we do not have enough data. But if we run the same code with n = 5000,

 
> n <- 5000
 

we obtain the joint normality of the estimated quantile and expected shortfall. This is exactly what the delta method allows us to derive from the asymptotic result above.
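For comparison, here is a minimal sketch of the analytic delta-method computation. It assumes that evir's gpd() returns the estimated covariance matrix of $(\xi,\sigma)$ in the varcov component of the fitted object, and it uses a numerical Jacobian from the numDeriv package; the parameter values are the illustrative ones used above.

> library(numDeriv)                        # numerical Jacobian (assumed available)
> fit <- gpd(rgpd(n, xi = .2, beta = 1), 0)
> theta <- fit$par.ests                    # estimates of (xi, sigma)
> Sigma <- fit$varcov                      # assumed: estimated covariance of (xi, sigma)
> h <- function(theta){                    # maps (xi, sigma) to (VaR 99%, TVaR 99%)
+   xi <- theta[1]; s <- theta[2]
+   q  <- s * (.01^(-xi) - 1)/xi
+   c(q, q + (s + xi * q)/(1 - xi)) }
> J <- jacobian(h, theta)                  # Jacobian of h at the estimates
> J %*% Sigma %*% t(J)                     # delta-method covariance of (VaR, TVaR)

This matrix can then be compared with the empirical covariance var(VQ) obtained from the simulations.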



  • Profile Likelihood

Another interesting tool is the concept of the profile likelihood. Here the parameter of interest is the tail index $\xi$, while $\sigma$ is treated as an auxiliary (nuisance) parameter.

This can be used to derive confidence intervals. In the case of the GPD, for each value of $\xi$ we have to find the optimal $\sigma(\xi)$; we then compute the profile likelihood $\ell_p(\xi)=\max_{\sigma}\ell(\xi,\sigma)$, and we can maximize this profile likelihood over $\xi$. In general, this two-stage optimization does not coincide numerically with the (global) maximization of the likelihood; the computation is as follows

 


> x <- rgpd(200, xi = .2, beta = 1)       # a GPD sample (illustrative parameters)
> loglik <- function(xi, beta) sum(log(dgpd(x, xi = xi, mu = 0, beta = beta)))
> XIV <- seq(.05, .95, by = .01); L <- rep(NA, length(XIV))   # grid of xi values
> for(i in 1:length(XIV)){
+   XI <- XIV[i]
+   profilelikelihood <- function(beta){ -loglik(XI, beta) }
+   L[i] <- -optim(par = 1, fn = profilelikelihood)$value }
> plot(XIV, L, type = "l", xlab = "xi", ylab = "profile log-likelihood")

 

 

If we want to compute the maximum of the profile likelihood (and not just its value on a grid, as above), we use

> PL <- function(XI){
+   profilelikelihood <- function(beta){ -loglik(XI, beta) }
+   optim(par = 1, fn = profilelikelihood)$value }
> (OPT <- optimize(f = PL, interval = c(0, 3)))

We obtain a value similar to the maximum likelihood estimate of $\xi$. We can compute confidence intervals in this way, and visualize them on a graph.


> up <- OPT$objective                              # minimized negative log-likelihood
> abline(h = -up - qchisq(p = .95, df = 1)/2, col = "red")
> I <- which(L >= -up - qchisq(p = .95, df = 1)/2)
> lines(XIV[I], rep(-up - qchisq(p = .95, df = 1)/2, length(I)), lwd = 3, col = "red")
> abline(v = range(XIV[I]), lty = 2)               # bounds of the 95% confidence interval

 

The vertical lines are the lower and upper bounds of the 95% confidence interval for the parameter $\xi$.
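For completeness, the horizontal cutoff used in the code above comes from the standard likelihood-ratio argument: $2\{\ell_p(\widehat{\xi})-\ell_p(\xi)\}$ is approximately $\chi^2_1$ distributed, so a 95% confidence interval is

$$\Bigl\{\xi\;:\;\ell_p(\xi)\;\geq\;\ell_p(\widehat{\xi})-\tfrac{1}{2}\,q_{\chi^2_1}(0.95)\Bigr\},$$

which is why the threshold in the code is -up - qchisq(p = .95, df = 1)/2 (recall that up is the minimized negative log-likelihood, so that $\ell_p(\widehat{\xi})=-\texttt{up}$).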


Most popular insights

1. Analysis of the POT peaks-over-threshold model and extreme value theory in R

2. Extreme value theory (EVT) in R: analysis of a fire loss distribution with the GPD model

3. Flood extreme value analysis in R using Markov chains (MC) with an extreme value (EVT) dependence structure

4. The Hosmer-Lemeshow goodness-of-fit test for regression in R

5. Markov-switching ARMA-GARCH model estimation via MCMC in MATLAB

6. Regression analysis of interval data in R

7. Wald test vs. likelihood ratio test in R

8. Predicting stock prices with linear regression in Python

9. How to compute the IDI and NRI indices for survival analysis and Cox regression in R