Original link: tecdat.cn/?p=10134


I ran a small simulation to test Little’s MCAR test at different sample sizes. I can study heteroscedasticity in linear regression. I was able to find examples of researchers using Little’s MCAR test on small samples, so I ran simulations.

library(BaylorEdPsych)  # provides LittleMCAR()
library(simglm)
library(ggplot2)
library(dplyr)
library(mice)

# Fixed-effects formula and coefficients (intercept, age, income)
fixed <- ~1 + age + income
fixed_param <- c(2, 0.3, 1.3)

# Both covariates are drawn from normal distributions
cov_param <- list(dist_fun = c('rnorm', 'rnorm'),
                  var_type = c("single", "single"),
                  opts = list(list(mean = 0, sd = 4),
                              list(mean = 0, sd = 3)))

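As a minimal sketch of what this setup describes, the following base-R code draws one data set from the same design (intercept 2, slopes 0.3 and 1.3, age drawn with sd 4, income with sd 3) and then deletes values completely at random. The residual standard deviation, the 20% missingness rate, and the helper names `simulate_data` and `make_mcar` are assumptions for illustration; they are not taken from the original simulation, which relies on simglm.

```r
# Hypothetical helper: one data set from y = 2 + 0.3*age + 1.3*income + e.
simulate_data <- function(n, error_sd = 1) {  # error_sd is an assumed value
  age    <- rnorm(n, mean = 0, sd = 4)
  income <- rnorm(n, mean = 0, sd = 3)
  y      <- 2 + 0.3 * age + 1.3 * income + rnorm(n, mean = 0, sd = error_sd)
  data.frame(y = y, age = age, income = income)
}

# Hypothetical helper: set each cell to NA with the same probability,
# independently of all data values (MCAR).
make_mcar <- function(data, prop = 0.2) {  # prop is an assumed missingness rate
  for (col in names(data)) {
    data[[col]][runif(nrow(data)) < prop] <- NA
  }
  data
}

set.seed(1)
dat <- make_mcar(simulate_data(n = 200))
colMeans(is.na(dat))  # observed share of missing values per column
```

Repeating these two steps many times for each sample size, and storing the p-value of the test each time, would give a data frame of the shape the plots expect (one row per replication, with columns `n` and `p`).
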
ggplot(little.mcar.p, aes(x = n, y = p)) + geom_boxplot() +
  geom_crossbar(aes(ymin = q025, y = q05, ymax = q075), data = summarise(
    group_by(little.mcar.p, n), q025 = quantile(p, .025, na.rm = TRUE),
    q05 = quantile(p, .05, na.rm = TRUE), q075 = quantile(p, .075, na.rm = TRUE)
  )) +
  geom_hline(yintercept = .05) +
  scale_y_continuous(breaks = seq(0, 1, .05), limits = c(0, 1)) +
  labs(x = "Sample size", y = "p-value",
       title = "Little's MCAR test for data that are MCAR",
       subtitle = "2000 replications",
       caption = paste(paste("For the narrow boxes, going from top to bottom, lines",
                             "represent 7.5th, 5th and 2.5th percentiles of p-values."),
                       "Test maintains nominal error rate across wide range of sample sizes.",
                       sep = "\n"))
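The claim that the test keeps its nominal error rate can be checked numerically: when the data are MCAR, the share of replications with p < .05 should be close to .05 at every sample size. Here is a minimal sketch, assuming a data frame with a column `n` and a column `p` like `little.mcar.p`. The stand-in data frame `p_values` and its sample sizes are hypothetical, filled with uniform random numbers purely so the snippet runs; they are not the simulation output.

```r
# Stand-in for little.mcar.p (placeholder values, NOT the real results):
# uniform p-values are what a well-calibrated test produces under the null.
set.seed(1)
p_values <- data.frame(
  n = factor(rep(c(30, 60, 120), each = 2000)),  # hypothetical sample sizes
  p = runif(3 * 2000)
)

# Empirical rejection rate at alpha = .05 for each sample size.
rejection_rate <- tapply(p_values$p, p_values$n,
                         function(p) mean(p < .05, na.rm = TRUE))
rejection_rate
```

Applied to p-values from data that are not MCAR, the same quantity estimates the power of the test rather than its error rate.
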

 

ggplot(little.mcar.p.mar, aes(x = n, y = p)) + geom_boxplot() +
  geom_crossbar(aes(ymin = q925, y = q95, ymax = q975), data = summarise(
    group_by(little.mcar.p.mar, n), q925 = quantile(p, .925, na.rm = TRUE),
    q95 = quantile(p, .95, na.rm = TRUE), q975 = quantile(p, .975, na.rm = TRUE)
  ), linetype = 2) +
  geom_hline(yintercept = .05) +
  scale_y_continuous(breaks = seq(0, 1, .05), limits = c(0, 1)) +
  labs(x = "Sample size", y = "p-value",
       title = "Little's MCAR test for data that are MAR",
       subtitle = "2000 replications",
       caption = paste(paste("For the dashed boxes, going from top to bottom, lines",
                             "represent 97.5th, 95th and 92.5th percentiles of p-values."),
                       "Test only maintains nominal error rate around sample size of 120.",
                       sep = "\n"))
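For the MAR condition, missingness has to depend on observed values rather than on chance alone. One way this could be sketched in base R is to make the probability that `income` is missing increase with `age`, which stays fully observed. The logistic form, the slope of 0.5, and the helper name `make_mar` are assumptions for illustration, not the mechanism used in the original simulation.

```r
# Hypothetical helper: income is missing with a probability that depends
# only on the fully observed age variable (MAR).
make_mar <- function(data, slope = 0.5) {  # slope is an assumed value
  p_miss <- plogis(slope * data$age)       # logistic function, values in (0, 1)
  data$income[runif(nrow(data)) < p_miss] <- NA
  data
}

set.seed(1)
dat <- data.frame(age = rnorm(500, sd = 4), income = rnorm(500, sd = 3))
dat_mar <- make_mar(dat)

# Mean age by missingness status of income (FALSE = observed, TRUE = missing).
mean_age <- tapply(dat_mar$age, is.na(dat_mar$income), mean)
mean_age
```

Because older cases are more likely to lose their income value, the mean age of the rows with missing income should be higher than that of the complete rows, which is the kind of pattern Little's test is meant to detect.
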

The regression is near perfect (no multicollinearity).