Provide a summary for each variable of the number and percent of missing values and, optionally, the cumulative sum of missings taken in the order of the variables. By default, the result is ordered by the number of missings in each variable, with the most missings first.
Arguments
- data
a data.frame
- order
a logical indicating whether to order the result by n_miss. Defaults to TRUE. If FALSE, the variables are kept in the order in which they appear in the input.
- add_cumsum
logical indicating whether or not to add the cumulative sum of missings to the data. This can be useful when exploring patterns of nonresponse. These are calculated as the cumulative sum of the missings in the variables as they are first presented to the function.
- digits
how many digits to display in the pct_miss column. Useful when you are working with small amounts of missing data.
- ...
extra arguments
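The effect of the order argument can be approximated in base R. The sketch below is an illustration only, not the package's implementation: summarise_missing() is a hypothetical helper and toy is made-up data, both introduced here for the example.

```r
# Hypothetical helper: a base R approximation of the n_miss and pct_miss
# columns. This is NOT the package's actual implementation.
summarise_missing <- function(data, order = TRUE) {
  n_miss <- colSums(is.na(data))            # missings per variable
  out <- data.frame(
    variable = names(n_miss),
    n_miss = as.integer(n_miss),
    pct_miss = 100 * n_miss / nrow(data),   # percent of rows missing
    row.names = NULL,
    stringsAsFactors = FALSE
  )
  if (order) {
    # most missings first; ties keep their input order
    out <- out[base::order(-out$n_miss), , drop = FALSE]
    rownames(out) <- NULL
  }
  out
}

# Made-up data: a has 1 missing, b has 2, c has none
toy <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 3), c = 1:3)

sorted <- summarise_missing(toy)                   # rows: b, a, c
unsorted <- summarise_missing(toy, order = FALSE)  # rows: a, b, c
```

With order = FALSE the rows follow the column order of the input, which is the ordering the cumulative sum relies on.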
Note
n_miss_cumsum is calculated as the cumulative sum of missings in the variables, in the order that they are given in the data when entering the function.
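To make the note concrete, here is a minimal base R sketch of a cumulative sum taken in column order. It assumes nothing about the package internals, and toy is a made-up data frame used only for illustration.

```r
# Made-up data with columns deliberately not sorted by missingness
toy <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 3), c = c(1, 2, NA))

n_miss <- colSums(is.na(toy))     # a = 1, b = 2, c = 1
n_miss_cumsum <- cumsum(n_miss)   # running total in column order: 1, 3, 4

# Presenting the columns in a different order changes the running total
reordered <- cumsum(colSums(is.na(toy[c("b", "a", "c")])))  # 2, 3, 4
```

The final total is the same either way; only the intermediate values depend on the order in which the variables are given.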
See also
pct_miss_case()
prop_miss_case()
pct_miss_var()
prop_miss_var()
pct_complete_case()
prop_complete_case()
pct_complete_var()
prop_complete_var()
miss_prop_summary()
miss_case_summary()
miss_case_table()
miss_summary()
miss_var_prop()
miss_var_run()
miss_var_span()
miss_var_summary()
miss_var_table()
n_complete()
n_complete_row()
n_miss()
n_miss_row()
pct_complete()
pct_miss()
prop_complete()
prop_complete_row()
prop_miss()
Examples
miss_var_summary(airquality)
#> # A tibble: 6 × 3
#> variable n_miss pct_miss
#> <chr> <int> <num>
#> 1 Ozone 37 24.2
#> 2 Solar.R 7 4.58
#> 3 Wind 0 0
#> 4 Temp 0 0
#> 5 Month 0 0
#> 6 Day 0 0
miss_var_summary(oceanbuoys, order = TRUE)
#> # A tibble: 8 × 3
#> variable n_miss pct_miss
#> <chr> <int> <num>
#> 1 humidity 93 12.6
#> 2 air_temp_c 81 11.0
#> 3 sea_temp_c 3 0.408
#> 4 year 0 0
#> 5 latitude 0 0
#> 6 longitude 0 0
#> 7 wind_ew 0 0
#> 8 wind_ns 0 0
if (FALSE) {
  # works with group_by from dplyr
  library(dplyr)
  airquality %>%
    group_by(Month) %>%
    miss_var_summary()
}