Original link: tecdat.cn/?p=10207

Original source: Tuoduan Data Tribe WeChat public account


 

When OLS estimation is applied to regression models in the social sciences, one of its assumptions is homoscedasticity, that is, constant error variance. This means there is no systematic pattern in the error variances; the model is equally wrong at all levels of the predicted values.

Heteroscedasticity is the violation of homoscedasticity. It does not bias the OLS coefficient estimates, but it does distort their standard errors and therefore the p-values. If you don’t care about p-values as much as most people in the social sciences do, heteroscedasticity may not be a problem for you.

Econometricians have developed a wide variety of heteroscedasticity-consistent standard errors, so that they can continue to apply OLS while adjusting for non-constant error variances. The Wikipedia page for these corrections lists many of the names these alternative standard errors go by.

Here we take a different approach: we specify the likelihood function directly, and the estimation routine finds the parameter values that maximize it. Below I use the mle2() function from the bbmle package.

Let’s look at a simple example:

First, I draw 500 observations from a normal distribution with mean 3 and standard deviation 1.5 and save them in a data frame:

dat <- data.frame(y = rnorm(n = 500, mean = 3, sd = 1.5))

The sample mean and standard deviation are:

mean(dat$y)
[1] 2.999048

sd(dat$y)
[1] 1.462059

I could also ask: what parameters of a normal distribution, a mean and a standard deviation, maximize the likelihood of the observed data?

# mle2() comes from the bbmle package
library(bbmle)

m.sd <- mle2(y ~ dnorm(mean = a, sd = exp(b)), data = dat,
             start = list(a = rnorm(1), b = rnorm(1)))

In the syntax above, the mean of the variable y is a constant a, and the standard deviation of y is exp(b). I exponentiate the standard deviation parameter so that it can never be negative. We supply starting values so that the optimizer has somewhere to begin before it converges on the values that maximize the likelihood; random numbers are good enough as starting values here.

m.sd

Call:
mle2(minuslogl = y ~ dnorm(mean = a, sd = exp(b)), start = list(a = rnorm(1),
    b = rnorm(1)), data = dat)

Coefficients:
        a         b
2.9990478 0.3788449

Log-likelihood: -898.89

The coefficient a is very close to the mean of the data. The coefficient b must be exponentiated to obtain the standard deviation:

exp(coef(m.sd)[2])
       b
1.460596

This is close to the standard deviation we obtained above. Another convenience of the syntax demonstrated here is that helper functions familiar from lm(), such as coef() and summary(), also work on mle2() objects.

The maximum likelihood estimation we performed above is equivalent to an intercept-only regression model estimated with OLS:

coef(lm(y ~ 1, dat))
(Intercept)
   2.999048

sigma(lm(y ~ 1, dat))
[1] 1.462059

The intercept is the mean of the data, and the residual standard deviation is its standard deviation.

Heteroscedastic regression model

Consider the following study. We assign participants to two groups, a treatment group of 30 people and a control group of 100, with the groups matched on the covariates that determine the outcome. We are interested in the treatment effect, and let us assume that a simple difference in means is adequate. As it happens, the treatment is not only effective but also homogenizing: outcomes in the treated group end up much less variable. The following data set conforms to this scenario:

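The data-generating code did not survive in this copy. A minimal reconstruction that matches the description below (group sizes, means, and standard deviations as stated) is:

dat <- data.frame(treat = rep(c(0, 1), times = c(100, 30)))  # 100 controls, 30 treated
dat$y <- c(rnorm(100, mean = 0,   sd = 1),     # control group: mean 0, sd 1
           rnorm(30,  mean = 0.3, sd = 0.25))  # treatment group: mean 0.3, sd 0.25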

There are 100 participants with treatment status 0 (the control group), drawn from a distribution with mean 0 and standard deviation 1, and 30 participants with treatment status 1 (the treatment group), drawn from a distribution with mean 0.3 and standard deviation 0.25.

This situation clearly violates the homoscedasticity assumption; nevertheless, we proceed with the OLS estimate of the treatment effect:

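The model-fitting code is missing from this copy; since the object is referred to as m.ols later in the post, it was presumably:

m.ols <- lm(y ~ treat, data = dat)  # simple difference-in-means regression
summary(m.ols)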

Call:
lm(formula = y ~ treat, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8734 -0.5055 -0.0287  0.4231  3.4097

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03386    0.09298   0.364    0.716
treat        0.21733    0.19355   1.123    0.264

Residual standard error: 0.9298 on 128 degrees of freedom
Multiple R-squared:  0.009754,	Adjusted R-squared:  0.002018
F-statistic: 1.261 on 1 and 128 DF,  p-value: 0.2636

The estimated treatment effect is 0.22, which is not statistically significant at the .05 level (p = .26). But we know the variance is not homoscedastic, because we created the data, and a simple diagnostic plot of the residuals against the fitted values confirms it:

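The plotting code is not preserved either; a basic residuals-versus-fitted plot would be:

plot(fitted(m.ols), resid(m.ols),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)  # the spread of residuals differs sharply between groups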

First, I recreate the OLS model with maximum likelihood:

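The code block is missing here; judging from the model labels printed by the likelihood ratio test further below, it was along these lines:

m.mle <- mle2(y ~ dnorm(mean = b_int + b_treat * treat, sd = exp(s1)),
              data = dat,
              start = list(b_int = rnorm(1), b_treat = rnorm(1), s1 = rnorm(1)))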

In this model, the mean of the outcome is a function of an intercept, b_int, and a coefficient on the treatment predictor, b_treat. The standard deviation is still a single constant wrapped in exp(). This model is equivalent to the linear model above.

But we know that the variance is not constant; the two groups differ. We can specify the standard deviation as a function of group membership:

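Again, the code itself was lost; the specification just described would be:

m.het <- mle2(y ~ dnorm(mean = b_int + b_treat * treat,
                        sd = exp(s_int + s_treat * treat)),
              data = dat,
              start = list(b_int = rnorm(1), b_treat = rnorm(1),
                           s_int = rnorm(1), s_treat = rnorm(1)))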

Here, the model for the standard deviation has an intercept, s_int, representing the control group, and s_treat, the treated group's deviation from that intercept.

We can do better still: we can use the coefficients from the OLS model as starting values for b_int and b_treat. Running the model:


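The refit is not shown in this copy, but the Call line in the output below implies:

m.het <- mle2(y ~ dnorm(mean = b_int + b_treat * treat,
                        sd = exp(s_int + s_treat * treat)),
              data = dat,
              start = list(b_int = coef(m.ols)[1], b_treat = coef(m.ols)[2],
                           s_int = rnorm(1), s_treat = rnorm(1)))
summary(m.het)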

Maximum likelihood estimation

Call:
mle2(minuslogl = y ~ dnorm(mean = b_int + b_treat * treat, sd = exp(s_int +
    s_treat * treat)), start = list(b_int = coef(m.ols)[1], b_treat = coef(m.ols)[2],
    s_int = rnorm(1), s_treat = rnorm(1)))

Coefficients:
         Estimate Std. Error  z value   Pr(z)    
b_int    0.033862   0.104470   0.3241 0.74584    
b_treat  0.217334   0.112249   1.9362 0.05285 .  
s_int    0.043731   0.070711   0.6184 0.53628    
s_treat -1.535894   0.147196 -10.4344 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

-2 log L: 288.1408

The treatment effect estimate is about the same, but now p = .053, far smaller than the .26 from the analysis that assumed constant variance. The b_treat estimate is also much more precise: its standard error is .11 here versus .19 before.

According to the model for the standard deviation, the control group's standard deviation is:

exp(coef(m.het)[3])

   s_int
1.044701

which is about 1.045. The treatment group's standard deviation is:

exp(coef(m.het)[3] + coef(m.het)[4])

    s_int
0.2248858

which is about 0.22. Both values are close to the ones we used to simulate the data. We can confirm them with the sample statistics:


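The code is missing; output of this shape is what aggregate(), for example, produces:

aggregate(y ~ treat, data = dat, FUN = sd)  # sample sd by group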
  treat         y
1     0 1.0499657
2     1 0.2287307

We can also easily compare the model that assumes homoscedasticity with the one that allows heteroscedasticity:

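The comparison code is not preserved; for mle2 fits, the "Likelihood Ratio Tests" output below is what anova() prints:

anova(m.mle, m.het)  # LRT: constant sd vs. group-specific sd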

Likelihood Ratio Tests
Model 1: m.mle, y~dnorm(mean=b_int+b_treat*treat,sd=exp(s1))
Model 2: m.het, y~dnorm(mean=b_int+b_treat*treat,sd=exp(s_int+s_treat*treat))
  Tot Df Deviance  Chisq Df Pr(>Chisq)
1      3   347.98
2      4   288.14 59.841  1  1.028e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The likelihood ratio test shows that allowing heteroscedasticity improves the model, χ²(1) = 59.84, p < .001.

So, in this single example, modeling the heteroscedasticity directly improved the precision of the estimate. It is easy to write simulation code comparing the heteroscedastic MLE with OLS estimation when the true effect is zero and heteroscedasticity is present.

I modified the code from above, giving the treatment group a mean of zero so that there is no mean difference between the two groups. I repeated the process 500 times, saving the treatment effect and its p-value from OLS, and the treatment effect and its p-value from the heteroscedastic MLE.

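The simulation code is likewise missing; a sketch of the design just described (my own reconstruction, with a hypothetical seed and starting values) is:

set.seed(1)  # hypothetical seed; the original is unknown
sim <- t(replicate(500, {
  d <- data.frame(treat = rep(c(0, 1), times = c(100, 30)))
  d$y <- rnorm(130, mean = 0, sd = ifelse(d$treat == 1, 0.25, 1))  # null is true
  ols <- lm(y ~ treat, data = d)
  het <- mle2(y ~ dnorm(mean = b_int + b_treat * treat,
                        sd = exp(s_int + s_treat * treat)),
              data = d,
              start = list(b_int = coef(ols)[1], b_treat = coef(ols)[2],
                           s_int = 0, s_treat = 0))
  c(ols_b = unname(coef(ols)[2]), ols_p = summary(ols)$coefficients[2, 4],
    mle_b = unname(coef(het)["b_treat"]), mle_p = summary(het)@coef["b_treat", 4])
}))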
Then I plot the results:


par(mfrow = c(1, 1))
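The remaining plotting code was lost; histograms of the saved p-values (my reconstruction, using the sim matrix from the sketch above) show the pattern described below:

par(mfrow = c(1, 2))
hist(sim[, "ols_p"], main = "OLS", xlab = "p-value")
hist(sim[, "mle_p"], main = "Heteroscedastic MLE", xlab = "p-value")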

OLS and the heteroscedastic MLE produce similar distributions of the estimated treatment effect. However, when the null is true, the p-values from the heteroscedastic MLE behave better: under a true null, p-values should be uniformly distributed, yet the OLS p-values pile up at the high end.

This time, I repeated the process with a treatment-group mean of 0.15, so that the null hypothesis of zero effect is false.

Again, the treatment effect estimates have similar distributions. However, the p-values from the heteroscedastic MLE are much smaller than those from OLS, giving it greater statistical power to detect the treatment effect.


Alternatively, instead of the formula interface, you can write the negative log-likelihood as a function yourself and pass that function to mle2():

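The function definition is missing here; based on the Call in the output below, it presumably looked like this:

# negative log-likelihood written out by hand (reconstruction)
ll <- function(b_int, b_treat, s_int, s_treat) {
  mu    <- b_int + b_treat * dat$treat       # model for the mean
  sigma <- exp(s_int + s_treat * dat$treat)  # model for the sd
  -sum(dnorm(dat$y, mean = mu, sd = sigma, log = TRUE))
}

m.ll <- mle2(minuslogl = ll,
             start = list(b_int = rnorm(1), b_treat = rnorm(1),
                          s_int = rnorm(1), s_treat = rnorm(1)))
summary(m.ll)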

Maximum likelihood estimation

Call:
mle2(minuslogl = ll, start = list(b_int = rnorm(1), b_treat = rnorm(1),
    s_int = rnorm(1), s_treat = rnorm(1)))

Coefficients:
         Estimate Std. Error  z value   Pr(z)    
b_int    0.033862   0.104470   0.3241 0.74584    
b_treat  0.217334   0.112249   1.9362 0.05285 .  
s_int    0.043733   0.070711   0.6185 0.53626    
s_treat -1.535893   0.147196 -10.4343 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

-2 log L: 288.1408

 

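The same model can also be fit with a package that lets you put a formula on the dispersion. The output below matches the format of glmmTMB (the package name is inferred from that format, since the code was not preserved):

library(glmmTMB)
m.tmb <- glmmTMB(y ~ treat, dispformula = ~ treat, data = dat)
summary(m.tmb)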

Family: gaussian  ( identity )
Formula:          y ~ treat
Dispersion:         ~treat
Data: dat

    AIC      BIC   logLik deviance df.resid
  296.1    307.6   -144.1    288.1      126


Conditional model:
           Estimate Std. Error z value Pr(>|z|)  
(Intercept)  0.03386    0.10447   0.324   0.7458  
treat        0.21733    0.11225   1.936   0.0528 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Dispersion model:
           Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.08746    0.14142   0.618    0.536    
treat       -3.07179    0.29439 -10.434   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In this case, the dispersion model is on the log-variance scale, so to recover the group standard deviations reported above you exponentiate the log variance and then take the square root.
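For example, using the dispersion estimates above:

sqrt(exp(0.08746))            # control group sd, ~1.045
sqrt(exp(0.08746 - 3.07179))  # treatment group sd, ~0.225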