Original link:tecdat.cn/?p=6166

Original source: the Tuoduan Data Tribe public account

 

Whenever possible, we should check that a model is correctly specified before relying on it to draw conclusions or predict future results. That is, we should check that the data do not conflict with the assumptions made by the model. For binary outcomes, logistic regression is the most popular modelling approach. In this article we will look at the Hosmer-Lemeshow goodness of fit test for logistic regression.

 

The Hosmer-Lemeshow goodness of fit test

The Hosmer-Lemeshow goodness of fit test is based on dividing the sample according to the model's predicted probabilities, or risks. Specifically, using the estimated parameter values, a predicted probability is calculated for each observation in the sample from that observation's covariate values.
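In symbols, this is the standard logistic regression prediction: for observation i with covariate vector x_i and estimated coefficients beta-hat, the predicted probability is

```latex
\hat{\pi}_i = \frac{\exp(x_i^{T}\hat{\beta})}{1 + \exp(x_i^{T}\hat{\beta})}
```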

 

The observations in the sample are then divided into G groups (we return below to the choice of G) according to their predicted probabilities. Suppose, as is commonly done, that G = 10. The first group then consists of the observations with the lowest 10% of predicted probabilities, the second group consists of the next lowest 10%, and so on.

In practice, as long as some of the model's covariates are continuous, each observation will have a different predicted probability, so the predicted probabilities vary within each group we form. To calculate the expected number of events in a group, the Hosmer-Lemeshow test takes the average of the predicted probabilities in the group and multiplies it by the number of observations in the group. The same calculation is performed for the expected number of non-events, and a Pearson goodness-of-fit statistic is then computed from the observed and expected counts.
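Writing O and E for the observed and expected counts of outcome j (j = 0 or 1) in group g, the resulting Pearson statistic is

```latex
X^{2}_{HL} = \sum_{g=1}^{G} \sum_{j=0}^{1} \frac{(O_{gj} - E_{gj})^{2}}{E_{gj}},
\qquad
E_{g1} = n_g \bar{\pi}_g, \quad E_{g0} = n_g (1 - \bar{\pi}_g)
```

where n_g is the number of observations in group g and pi-bar_g is the group's mean predicted probability. Hosmer and Lemeshow showed that, when the model is correctly specified, this statistic is approximately chi-squared distributed with G - 2 degrees of freedom.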

 

   

Select the number of groups

As far as I can see, there is very little firm guidance on how to choose the number of groups G. Hosmer and Lemeshow's simulation conclusions were based on using G greater than p + 1, where p is the number of covariates in the model; so if we have 10 covariates, we should choose G greater than 11.

Intuitively, using a smaller value of G gives the test less ability to detect misspecification.

 

R 

First, we will simulate some data from a logistic regression model with a single covariate x, and then fit the correct logistic regression model to the data.

n <- 100
x <- rnorm(n)
xb <- x
pr <- exp(xb) / (1 + exp(xb))
y <- 1 * (runif(n) < pr)
mod <- glm(y ~ x, family = binomial)

Next, we pass the outcome y and the model's fitted probabilities to the hoslem.test function from the ResourceSelection package, choosing g = 10 groups:

library(ResourceSelection)
hl <- hoslem.test(mod$y, fitted(mod), g = 10)
hl

	Hosmer and Lemeshow goodness of fit (GOF) test

data:  mod$y, fitted(mod)
X-squared = 7.4866, df = 8, p-value = 0.4851

This gives p = 0.49, so we have no significant evidence of poor fit. We can also obtain a table of observed and expected counts from the hl object:

cbind(hl$observed, hl$expected)

               y0 y1    yhat0    yhat1
[0.0868,0.219]  8  2 8.259898 1.740102
(0.219,0.287]   7  3 7.485661 2.514339
(0.287,0.329]   7  3 6.968185 3.031815
(0.329,0.421]   8  2 6.194245 3.805755
(0.421,0.469]   5  5 5.510363 4.489637
(0.469,0.528]   4  6 4.983951 5.016049
(0.528,0.589]   5  5 4.521086 5.478914
(0.589,0.644]   2  8 3.833244 6.166756
(0.644,0.713]   6  4 3.285271 6.714729
(0.713,0.913]   1  9 1.958095 8.041905

To help us understand the calculations, let's now run the test ourselves manually. First, we calculate the model's predicted probabilities, and then categorize the observations according to deciles of the predicted probabilities:

pihat <- mod$fitted
pihatcat <- cut(pihat, breaks = c(0, quantile(pihat, probs = seq(0.1, 0.9, 0.1)), 1), labels = FALSE)

Next, we loop through groups 1 to 10, calculating for each group the observed number of ones and zeros, and the expected numbers. To calculate the latter, we find the mean of the predicted probabilities in each group and multiply it by the group size, which is 10 here:

meanprobs <- array(0, dim = c(10, 2))
expevents <- array(0, dim = c(10, 2))
obsevents <- array(0, dim = c(10, 2))

for (i in 1:10) {
  meanprobs[i, 1] <- mean(pihat[pihatcat == i])
  expevents[i, 1] <- sum(pihatcat == i) * meanprobs[i, 1]
  obsevents[i, 1] <- sum(y[pihatcat == i])
  meanprobs[i, 2] <- mean(1 - pihat[pihatcat == i])
  expevents[i, 2] <- sum(pihatcat == i) * meanprobs[i, 2]
  obsevents[i, 2] <- sum(1 - y[pihatcat == i])
}

Finally, we can calculate the Hosmer-Lemeshow test statistic by summing (observed - expected)^2 / expected over the 10 x 2 cells of the table:

hosmerlemeshow <- sum((obsevents - expevents)^2 / expevents)
hosmerlemeshow
[1] 7.486643

This matches the test statistic reported by the hoslem.test function.

Change the number of groups

Next, let's see how the test's p-value changes as we choose g = 5, g = 6, up to g = 15. We can do this with a simple for loop:

for (i in 5:15) {
  print(hoslem.test(mod$y, fitted(mod), g = i)$p.value)
}
[1] 0.4683388
[1] 0.9216374
[1] 0.996425
[1] 0.9018581
[1] 0.933084
[1] 0.4851488
[1] 0.9374381
[1] 0.9717069
[1] 0.5115724
[1] 0.4085544
[1] 0.8686347

Although the p-values vary, none is anywhere near significance, so every choice leads to the same conclusion: there is no evidence of poor fit. For this data set, then, the choice of G does not affect the substantive conclusion.

Checking the Hosmer-Lemeshow test by simulation

To finish, let's run some simulations to check how the Hosmer-Lemeshow test performs in repeated samples. First, we will repeatedly generate data from the same model used previously, fit the same (correct) model, and calculate the Hosmer-Lemeshow p-value using g = 10. We will do this 1,000 times and store the p-values in an array:

pvalues <- array(0, 1000)
for (i in 1:1000) {
  n <- 100
  x <- rnorm(n)
  xb <- x
  pr <- exp(xb) / (1 + exp(xb))
  y <- 1 * (runif(n) < pr)
  mod <- glm(y ~ x, family = binomial)
  pvalues[i] <- hoslem.test(mod$y, fitted(mod), g = 10)$p.value
}

After completion, we can calculate the proportion of p-values less than 0.05. Since the model is correctly specified here, we expect this so-called type 1 error rate to be no more than 5%:

mean(pvalues < 0.05)
[1] 0.04

Thus, in 1,000 simulations, the Hosmer-Lemeshow test gave a significant p-value, suggesting poor fit, 4% of the time. The test incorrectly indicated poor fit in close to the 5% of cases we would expect, so it appears to be behaving as it should.

Now let's change the simulation so that the model we fit is misspecified, and hence should fit the data poorly. Hopefully, we will find that the Hosmer-Lemeshow test correctly finds evidence of poor fit in more than 5% of cases. Specifically, we will now generate the outcome from a logistic model in which it depends quadratically on the covariate, but we will continue to fit a model with only a linear covariate effect, so that our fitted model is misspecified.

 
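The code for this modified simulation was lost from the original post. A minimal sketch, assuming the true linear predictor is x^2 (an assumption; the exact form used originally is not shown) and adding a seed for reproducibility, is:

```r
library(ResourceSelection)  # for hoslem.test

set.seed(123)  # assumption: seed added here for reproducibility
pvalues <- array(0, 1000)
for (i in 1:1000) {
  n <- 100
  x <- rnorm(n)
  xb <- x^2                        # true model: quadratic in x (assumed form)
  pr <- exp(xb) / (1 + exp(xb))
  y <- 1 * (runif(n) < pr)
  mod <- glm(y ~ x, family = binomial)  # fitted model wrongly assumes a linear effect
  pvalues[i] <- hoslem.test(mod$y, fitted(mod), g = 10)$p.value
}
mean(pvalues < 0.05)
```

Because the outcome data are random, the exact rejection proportion will vary from run to run and with the assumed form of the misspecification.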

Running this and calculating the proportion of p-values less than 0.05, we find:

mean(pvalues < 0.05)
[1] 0.648

Thus, the Hosmer-Lemeshow test gave significant evidence of poor fit in 65% of the simulated datasets; that is, its power to detect this particular misspecification is around 65%.

 

