Use Little's (1988) test statistic to assess whether data is missing completely at random (MCAR). The null hypothesis in this test is that the data is MCAR, and the test statistic is a chi-squared value. The example below shows the output of mcar_test(airquality). Given the high statistic value and low p-value, we reject the null hypothesis and conclude that the airquality data is not missing completely at random.

mcar_test(data)

Arguments

data

A data frame

Value

A tibble::tibble() with one row and four columns:

statistic

Chi-squared statistic for Little's test

df

Degrees of freedom used for the chi-squared statistic

p.value

P-value for the chi-squared statistic

missing.patterns

Number of missing data patterns in the data
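
As a rough illustration of how these columns relate, the sketch below uses only base R and the built-in airquality data. It is not the package's implementation. It assumes that missing.patterns counts every distinct combination of missing and observed columns, including the fully observed one, and that p.value is the upper-tail chi-squared probability of statistic on df degrees of freedom. The statistic and df values are copied from the rounded output printed in the Examples, so the recomputed p-value should land near, not exactly on, the printed p.value.

```r
# airquality ships with base R (datasets package), so no extra packages are needed.

# 1. Count distinct missingness patterns.
#    Each row of is.na() is one pattern of missing (TRUE) / observed (FALSE).
na_indicator <- is.na(airquality)
n_patterns <- nrow(unique(na_indicator))
n_patterns

# 2. Recompute an approximate p-value from the printed (rounded) statistic and df.
#    These two numbers are copied from the Examples output, not computed here.
statistic <- 35.1
df <- 14
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
p_value
```

The pattern count can be compared with the missing.patterns column reported by mcar_test(airquality).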

Note

Code is adapted from LittleMCAR() in the now-orphaned BaylorEdPsych package: https://rdrr.io/cran/BaylorEdPsych/man/LittleMCAR.html. Some of the code is adapted from Eric Stemmler (https://stats-bayes.com/post/2020/08/14/r-function-for-little-s-test-for-data-missing-completely-at-random/), using maximum likelihood estimation from the norm package.

References

Little, Roderick J. A. 1988. "A Test of Missing Completely at Random for Multivariate Data with Missing Values." Journal of the American Statistical Association 83 (404): 1198--1202. https://doi.org/10.1080/01621459.1988.10478722.

Examples

mcar_test(airquality)
#> # A tibble: 1 x 4
#>   statistic    df p.value missing.patterns
#>       <dbl> <dbl>   <dbl>            <int>
#> 1      35.1    14 0.00142                4
mcar_test(oceanbuoys)
#> # A tibble: 1 x 4
#>   statistic    df p.value missing.patterns
#>       <dbl> <dbl>   <dbl>            <int>
#> 1      747.    31       0                6
# If there are non-numeric columns, there will be a warning
mcar_test(riskfactors)
#> Warning: NAs introduced by coercion to integer range
#> # A tibble: 1 x 4
#>   statistic    df  p.value missing.patterns
#>       <dbl> <dbl>    <dbl>            <int>
#> 1     1741.  1319 3.32e-14               48
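
The decision rule described at the top of this page can be sketched as follows. This is a minimal example, assuming the naniar package is installed; the 0.05 significance level and the variable names result and alpha are illustrative choices, not something the function prescribes.

```r
library(naniar)

# Run Little's test on the built-in airquality data
result <- mcar_test(airquality)

# Illustrative significance level (an assumption; choose one that suits your analysis)
alpha <- 0.05

# The null hypothesis is that the data is MCAR, so a small p-value is evidence against MCAR
if (result$p.value < alpha) {
  message("Reject the null hypothesis: evidence that the data is not MCAR")
} else {
  message("Fail to reject the null hypothesis: no evidence against MCAR")
}
```

With the p.value of 0.00142 shown for airquality above, the first branch is taken. Failing to reject the null hypothesis would not prove that the data is MCAR; it would only mean the test found no evidence against it.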