Data structures for missing data

Creation and Manipulation of Shadow Matrices

as_shadow()

Create shadows

as_shadow_upset()

Convert data into shadow format for doing an upset plot

bind_shadow()

Bind a shadow dataframe to original data

nabular()

Convert data into nabular form by binding shade to it

gather_shadow()

Long form representation of a shadow matrix

shade()

Create new levels of missing

shadow_long()

Reshape shadow data into a long format

unbind_shadow() unbind_data()

Unbind (remove) shadow from data, and vice versa

shadow_shift()

Shift missing values to facilitate missing data exploration/visualisation

shadow_shift(<numeric>)

Shift (impute) numeric values for graphical exploration
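
As a minimal sketch of how the shadow functions above fit together, the following builds a shadow matrix and binds it to the data. The data frame `df` is made up for illustration and is not part of the package.

```r
library(naniar)

# Hypothetical toy data with one missing value per column
df <- data.frame(
  x = c(1, NA, 3),
  y = c("a", "b", NA),
  stringsAsFactors = FALSE
)

# The shadow matrix has one "_NA"-suffixed column per variable,
# recording whether each value is missing ("NA") or not ("!NA")
shadow <- as_shadow(df)
names(shadow)

# bind_shadow() places the shadow columns next to the original data
nab <- bind_shadow(df)
names(nab)

# unbind_shadow() drops the shadow columns again
names(unbind_shadow(nab))
```

Keeping data and shadow side by side means missingness can be used as an ordinary grouping or plotting variable.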

Create special missing values

Create special missing values so that they don’t get lost!

recode_shadow()

Add special missing values to the shadow matrix
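
A short sketch of recording a special missing value in the shadow matrix. The sentinel value `-99` and the label `"bad_reading"` are assumptions chosen for illustration only.

```r
library(naniar)

# Hypothetical data where -99 encodes a faulty instrument reading
df <- data.frame(wind = c(-99, 45, NA, 68))

# recode_shadow() operates on data that already carries shadow columns
shadow <- bind_shadow(df)

# Mark the rows where wind == -99 as a special kind of missing
recoded <- recode_shadow(shadow, wind = .where(wind == -99 ~ "bad_reading"))

# The shadow column should now carry an extra level for the special missing
levels(recoded$wind_NA)
```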

Visualisation

Visualise missing data

geom_miss_point()

Add a ggplot2 point layer that also displays missing values, shifted below the observed range

gg_miss_case()

Plot the number of missings per case (row)

gg_miss_case_cumsum()

Plot of cumulative sum of missing for cases

gg_miss_fct()

Plot the number of missings for each variable, broken down by a factor

gg_miss_span()

Plot the number of missings in a given repeating span

gg_miss_upset()

Plot the pattern of missingness using an upset plot.

gg_miss_var()

Plot the number of missings for each variable

gg_miss_var_cumsum()

Plot of cumulative sum of missing value for each variable

gg_miss_which()

Plot which variables contain a missing value

reexports

Objects exported from other packages
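
The plotting functions in this section return ggplot objects, so they can be stored and extended like any other ggplot. A brief sketch using the built-in `airquality` data (the choice of dataset and variables is only for illustration):

```r
library(ggplot2)
library(naniar)

# Scatterplot that keeps rows with missing values visible
p_point <- ggplot(airquality, aes(x = Ozone, y = Solar.R)) +
  geom_miss_point()

# Number of missings in each variable
p_var <- gg_miss_var(airquality)

# Number of missings in each variable, broken down by month
p_fct <- gg_miss_fct(airquality, fct = Month)
```

Printing any of these objects (for example `p_var`) draws the plot.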

Numerical Summaries

Provide tidy data frame summaries of missingness

miss_var_prop() complete_var_prop() miss_var_pct() complete_var_pct() miss_case_prop() complete_case_prop() miss_case_pct() complete_case_pct()

Proportion or percentage of variables or cases containing missings or complete values

miss_case_cumsum()

Cumulative sum of the number of missings in each case

miss_case_summary()

Summarise the missingness in each case

miss_case_table()

Tabulate missings in cases.

miss_prop_summary()

Proportions of missings in data, variables, and cases.

miss_scan_count()

Search and present different kinds of missing values

miss_summary()

Collate summary measures from naniar into one tibble

miss_var_cumsum()

Cumulative sum of the number of missings in each variable

miss_var_run()

Find the number of missing and complete values in a single run

miss_var_span()

Summarise the number of missings for a given repeating span on a variable

miss_var_summary()

Summarise the missingness in each variable

miss_var_table()

Tabulate the missings in the variables

miss_var_which()

Which variables contain missing values?
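
A minimal sketch of the summary functions above, again using the built-in `airquality` data as an illustrative example. Each call returns a data frame, so the results can be filtered or joined like any other table.

```r
library(naniar)

# One row per variable, ordered by number of missings
var_summary <- miss_var_summary(airquality)
var_summary

# One row per case (row of the data)
case_summary <- miss_case_summary(airquality)

# How many cases have 0, 1, 2, ... missings
case_table <- miss_case_table(airquality)
case_table

# Names of the variables that contain at least one missing value
miss_var_which(airquality)
```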

Handy helpers

Handy helpers

n_var_complete() n_case_complete()

The number of variables or cases with complete values

n_var_miss() n_case_miss()

The number of variables or cases with missing values

n_complete()

Return the number of complete values

n_complete_row()

Return a vector of the number of complete values in each row

n_miss()

Return the number of missing values

n_miss_row()

Return a vector of the number of missing values in each row

prop_miss_case() prop_complete_case()

Proportion of cases that contain missing or complete values.

prop_miss_var() prop_complete_var()

Proportion of variables containing missings or complete values

prop_complete()

Return the proportion of complete values

prop_complete_row()

Return a vector of the proportion of complete values in each row

prop_miss()

Return the proportion of missing values

prop_miss_row()

Return a vector of the proportion of missing values in each row

pct_miss_case() pct_complete_case()

Percentage of cases that contain missing or complete values.

pct_miss_var() pct_complete_var()

Percentage of variables containing missings or complete values

pct_complete()

Return the percent of complete values

pct_miss()

Return the percent of missing values

all_na() all_miss() all_complete()

Identify if all values are missing or complete

any_na() any_miss() any_complete()

Identify if there are any missing or complete values

any_row_miss()

Helper function to determine whether there are any missings

is_shade() are_shade() any_shade()

Detect if this is a shade

which_are_shade()

Which variables are shades?

common_na_numbers

Common number values for NA

common_na_strings

Common string values for NA
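
The helpers above return single numbers or logical values, which makes them convenient inside other code. A small sketch on the built-in `airquality` data (153 rows, 6 variables), used here only as an example:

```r
library(naniar)

# Counts over the whole data frame
total_missing  <- n_miss(airquality)       # 44 missing values
total_complete <- n_complete(airquality)   # 874 complete values

# Proportion of all values that are missing
share_missing <- prop_miss(airquality)

# Counts of cases (rows) and variables (columns) containing missings
cases_with_missing <- n_case_miss(airquality)
vars_with_missing  <- n_var_miss(airquality)

# Logical checks on a single vector
any_na(airquality$Ozone)
all_complete(airquality$Temp)
```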

Add columns

Add missing data summaries/tool columns

add_any_miss()

Add a column describing presence of any missing values

add_label_missings()

Add a column describing if there are any missings in the dataset

add_label_shadow()

Add a column describing whether there is a shadow

add_miss_cluster()

Add a column that tells us which "missingness cluster" a row belongs to

add_n_miss()

Add column containing number of missing data values

add_prop_miss()

Add column containing proportion of missing data values

add_shadow()

Add a shadow column to dataframe

add_shadow_shift()

Add a shadow shifted column to a dataset

add_span_counter()

Add a counter variable for a span of dataframe
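
As a sketch of how the `add_*` functions can be chained, the following appends per-row missingness information to the built-in `airquality` data. The dataset is only an example.

```r
library(naniar)

# Number of missing values in each row
aq <- add_n_miss(airquality)

# Proportion of missing values in each row
aq <- add_prop_miss(aq)

# Text label saying whether each row has any missing value
aq <- add_label_missings(aq)

# The original six columns are kept, with the new columns appended
names(aq)
```

Because each function returns the data with extra columns, the result can go straight into a model or plot that uses missingness as a predictor or grouping variable.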

Replacing values with NA

Functions to help replace certain values with NA, including scoped variants (_at, _if, _all) that take formulas for flexible approaches

replace_with_na()

Replace values with missings

replace_with_na_all()

Replace all values with NA where a certain condition is met

replace_with_na_at()

Replace specified variables with NA where a certain condition is met

replace_with_na_if()

Replace values with NA based on some condition, for variables that meet some predicate
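
A minimal sketch of replacing sentinel values with NA. The data frame and the codes `-99` and `"N/A"` are invented for illustration.

```r
library(naniar)

# Hypothetical data using -99 and "N/A" as stand-ins for missing values
df <- data.frame(
  x = c(1, -99, 3),
  y = c("a", "N/A", "c"),
  stringsAsFactors = FALSE
)

# Name the values to replace, variable by variable
cleaned <- replace_with_na(df, replace = list(x = -99, y = "N/A"))
cleaned

# Scoped variant: apply a formula condition to selected variables only
cleaned_x <- replace_with_na_at(df, .vars = "x", condition = ~ .x == -99)
cleaned_x
```

The named-list form is explicit about which value is replaced in which variable, while the scoped variants are more convenient when the same rule applies to many columns.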

Imputation helpers

Simple imputation methods for exploring and visualising missingness structure

impute_below()

Impute data with values shifted 10 percent below range.

impute_below_all()

Impute all variables in a data frame with values shifted 10 percent below range.

impute_below_at()

Scoped variants of impute_below

impute_below_if()

Scoped variants of impute_below

impute_mean()

Impute the mean value into a vector with missing values

impute_median()

Impute the median value into a vector with missing values

impute_mean_all() impute_mean_at() impute_mean_if()

Scoped variants of impute_mean

impute_median_all() impute_median_at() impute_median_if()

Scoped variants of impute_median
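
A short sketch of the vector forms of the imputation helpers. The vector `x` is made up for illustration. These imputations are meant for exploration and plotting rather than for final analyses.

```r
library(naniar)

# Hypothetical numeric vector with one missing value
x <- c(1, 2, NA, 4)

# Replace the missing value with the mean or median of the observed values
x_mean   <- impute_mean(x)
x_median <- impute_median(x)

# Replace the missing value with one placed below the observed minimum,
# so that it is visible on the same axis in a plot
x_below <- impute_below(x)

x_mean
x_median
x_below
```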

Cast Shadows

Add shadow information to the dataframe while reducing it to the variables of interest

cast_shadow()

Add a shadow column to a dataset

cast_shadow_shift()

Add a shadow and a shadow_shift column to a dataset

cast_shadow_shift_label()

Add a shadow column, a shadow shifted column, and a missingness label column to a dataset
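
A minimal sketch of casting a shadow for selected variables only, using the built-in `airquality` data as an example. Unlike `bind_shadow()`, the result keeps just the named variables and their shadow columns.

```r
library(naniar)

# Keep Ozone and Solar.R together with their shadow columns
casted <- cast_shadow(airquality, Ozone, Solar.R)

names(casted)
head(casted)
```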

Misc helpers

Misc helpers

label_miss_1d()

Label a missing from one column

label_miss_2d()

Label whether a value is missing in either of two columns

label_missings()

Is there a missing value in the row of a dataframe?

where_na()

Which rows and cols contain missings?

which_na()

Which elements contain missings?
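
A brief sketch of the locating and labelling helpers above, on made-up data:

```r
library(naniar)

# Hypothetical vector and data frame for illustration
v <- c(1, NA, 3, NA)
df <- data.frame(
  x = c(1, NA),
  y = c("a", "b"),
  stringsAsFactors = FALSE
)

# Positions of the missing elements in a vector
positions <- which_na(v)
positions

# Row and column indices of each missing value in a data frame
locations <- where_na(df)
locations

# One label per row: does the row contain any missing value?
labels <- label_missings(df)
labels
```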