## Data structures for missing data

Creation and Manipulation of Shadow Matrices

as_shadow()

as_shadow(<data.frame>)

as_shadow_upset()

Convert data into shadow format for use in an upset plot

bind_shadow()

Bind a shadow dataframe to original data

gather_shadow()

Long form representation of a shadow matrix

shadow_shift()

Shift missing values to facilitate missing data exploration/visualisation

shadow_shift(<numeric>)

Shift (impute) numeric values for graphical exploration

unbind_shadow() unbind_data()

Unbind (remove) shadow from data, and vice versa
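As a minimal sketch of how the shadow functions above fit together, the following assumes the naniar package is installed and uses the `airquality` dataset that ships with R (the dataset is an assumption for illustration and is not part of this index). It builds a shadow matrix, then binds it to the original data.

```r
library(naniar)

# Build the shadow matrix: one "_NA"-suffixed column per original column,
# each a factor with levels "!NA" (present) and "NA" (missing)
aq_shadow <- as_shadow(airquality)
names(aq_shadow)

# Bind the shadow columns next to the original data
aq_nabular <- bind_shadow(airquality)
dim(aq_nabular)

# The shadow columns can then be tabulated or used for grouping
table(aq_nabular$Ozone_NA)
```

Keeping the shadow columns beside the data means missingness can be used like any other variable in later grouping or plotting steps.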

## Visualisation

Visualise missing data

geom_miss_point()

Plot missing values in ggplot2 by shifting them 10% below the minimum value

gg_miss_case()

Plot the number of missings per case (row)

gg_miss_case_cumsum()

Plot the cumulative sum of missings across cases

gg_miss_fct()

Plot the number of missings for each variable, broken down by a factor

gg_miss_span()

Plot the number of missings in a given repeating span

gg_miss_upset()

Plot the pattern of missingness using an upset plot.

gg_miss_var()

Plot the number of missings for each variable

gg_miss_var_cumsum()

Plot the cumulative sum of missing values across variables

gg_miss_which()

Plot which variables contain a missing value

reexports

Objects exported from other packages
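The plotting functions in this section return ggplot objects, so they can be stored and extended like any other ggplot. A short sketch, assuming naniar and ggplot2 are installed and again using `airquality` as illustrative data:

```r
library(naniar)
library(ggplot2)

# Number of missings per variable
p_var <- gg_miss_var(airquality)

# Scatterplot that keeps rows with missing values visible
p_point <- ggplot(airquality, aes(x = Ozone, y = Solar.R)) +
  geom_miss_point()

# Standard ggplot2 layers can be added on top
p_var + labs(title = "Missing values in airquality")
```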

## Numerical Summaries

Provide tidy data frame summaries of missingness

miss_case_pct() complete_case_pct()

Percentage of cases that contain missing values, or that are complete.

miss_case_prop() complete_case_prop()

Proportion of cases that contain missing values, or that are complete.

miss_var_pct() complete_var_pct()

Percentage of variables containing missings or complete values

miss_var_prop() complete_var_prop()

Proportion of variables containing missings or complete values

miss_case_summary()

Summarise the missingness in each case

miss_case_table()

Tabulate missings in cases.

miss_prop_summary()

Proportions of missings in data, variables, and cases.

miss_scan_count()

Search and present different kinds of missing values

miss_summary()

Collate summary measures from naniar into one tibble

miss_var_run()

Find the number of missing and complete values in a single run

miss_var_span()

Summarise the number of missings for a given repeating span on a variable

miss_var_summary()

Summarise the missingness in each variable

miss_var_table()

Tabulate the missings in the variables

miss_var_which()

Which variables contain missing values?
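The summaries above can be sketched on a small example. This assumes naniar is installed and uses `airquality` for illustration; the column names referenced in the comments (`variable`, `n_miss`, `n_cases`) reflect the naniar versions I am familiar with and may differ in others.

```r
library(naniar)

# One row per variable, with the count and percentage of missings
var_summary <- miss_var_summary(airquality)
var_summary

# How many cases have 0, 1, 2, ... missing values
case_table <- miss_case_table(airquality)
case_table

# Names of the variables that contain at least one missing value
vars_with_na <- miss_var_which(airquality)
vars_with_na
```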

## Handy helpers

n_var_complete() n_case_complete()

The number of variables or cases with complete values

n_var_miss() n_case_miss()

The number of variables or cases with missing values

n_complete()

Return the number of complete values

n_complete_row()

Return a vector of the number of complete values in each row

n_miss()

Return the number of missing values

n_miss_row()

Return a vector of the number of missing values in each row

prop_complete()

Return the proportion of complete values

prop_complete_row()

Return a vector of the proportion of complete values in each row

prop_miss()

Return the proportion of missing values

prop_miss_row()

Return a vector of the proportion of missing values in each row

pct_complete()

Return the percent of complete values

pct_miss()

Return the percent of missing values

all_na() all_miss() all_complete()

Identify if all values are missing or complete

all_row_complete()

Helper function to determine whether all rows are complete

all_row_miss()

Helper function to determine whether all rows are missing

any_na() any_miss() any_complete()

Identify if there are any missing or complete values

any_row_miss()

Helper function to determine whether there are any missings

is_shadow()

are_shadow()

common_na_numbers

Common number values for NA

common_na_strings

Common string values for NA

add_any_miss()

Add a column describing presence of any missing values

add_label_missings()

Add a column describing whether each row contains any missings

add_label_shadow()

add_miss_cluster()

Add a column that tells us which "missingness cluster" a row belongs to

add_n_miss()

Add column containing number of missing data values

add_prop_miss()

Add column containing proportion of missing data values

add_shadow()

add_shadow_shift()

add_span_counter()

Add a counter variable for a span of a dataframe
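A few of the helpers above in use, as a hedged sketch: it assumes naniar is installed and uses `airquality`, which has 153 rows, 6 columns, and missing values in `Ozone` and `Solar.R`. The `n_miss_all` column name is what `add_n_miss()` produces by default in the versions I know.

```r
library(naniar)

total_missing  <- n_miss(airquality)      # 44
total_complete <- n_complete(airquality)  # 874
share_missing  <- prop_miss(airquality)

# Per-row counts, one element per row of the data
row_missing <- n_miss_row(airquality)

# Logical check on a single vector
ozone_has_na <- any_na(airquality$Ozone)

# Append the per-row count as a new column
aq_counted <- add_n_miss(airquality)
names(aq_counted)
```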

## Replacing values with NA

Functions that replace specified values with NA, including scoped variants (_all, _at, _if) that take formulas for more flexible conditions

replace_with_na()

Replace values with missings

replace_with_na_all()

Replace all values with NA where a certain condition is met

replace_with_na_at()

Replace specified variables with NA where a certain condition is met

replace_with_na_if()

Replace values with NA based on some condition, for variables that meet some predicate
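To illustrate the difference between the named-list form and the formula-based scoped variants, here is a small sketch. It assumes naniar is installed; the data frame `df` and its sentinel values (`-99`, `"N/A"`) are made up for the example.

```r
library(naniar)

# Hypothetical data using -99 and "N/A" as missing-value codes
df <- data.frame(
  x = c(1, -99, 3),
  y = c("a", "N/A", "c"),
  stringsAsFactors = FALSE
)

# Named list: which values to replace in which columns
by_column <- replace_with_na(df, replace = list(x = -99, y = "N/A"))

# Formula condition applied to every column
everywhere <- replace_with_na_all(df, condition = ~ .x == -99)

# Formula condition applied only to columns that satisfy a predicate
numeric_only <- replace_with_na_if(
  df,
  .predicate = is.numeric,
  condition = ~ .x == -99
)
```

The named-list form is explicit about each column, while the scoped variants are more convenient when the same code marks missing values across many columns.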

## Imputation helpers

Simple imputation methods for visualising and exploring missingness structure

impute_below()

Impute data with values shifted 10% below range.

impute_below_at()

Scoped variants of impute_below

impute_below_if()

Scoped variants of impute_below

impute_mean()

Impute the mean value into a vector with missing values

impute_mean_all() impute_mean_at() impute_mean_if()

Scoped variants of impute_mean
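A brief sketch of the two vector-level imputation helpers, assuming naniar is installed. The vector `x` is invented for the example, and `airquality$Ozone` is used as illustrative data. Note that `impute_below()` adds random jitter, so its exact values vary between runs.

```r
library(naniar)

# Mean imputation: the NA is replaced by the mean of the observed values
x <- c(1, 2, NA, 4)
x_mean <- impute_mean(x)

# Shift missings below the observed range so they remain visible in plots
ozone <- as.numeric(airquality$Ozone)
ozone_below <- impute_below(ozone)

# Imputed values all sit below the smallest observed value
range(ozone_below[is.na(ozone)])
min(ozone, na.rm = TRUE)
```

These methods are meant for exploration and plotting rather than for producing imputed data for modelling.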

Add shadow information to the dataframe while reducing it to the variables of interest

cast_shadow()

cast_shadow_shift()

cast_shadow_shift_label()

## Misc helpers

label_miss_1d()

Label a missing from one column

label_miss_2d()

Label whether a value is missing in either of two columns

label_missings()

Is there a missing value in the row of a dataframe?

where_na()

Which rows and cols contain missings?

which_na()

Which elements contain missings?
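The two locating helpers can be sketched as follows, assuming naniar is installed. The vector `x` is invented for the example and `airquality` serves as illustrative data.

```r
library(naniar)

# Positions of the missing elements in a vector
x <- c(1, NA, 3, NA)
na_positions <- which_na(x)  # 2 4

# Row and column indices of every missing cell in a data frame
na_cells <- where_na(airquality)
head(na_cells)
```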