Identify if there are any or all missing or complete values
Source: R/any-na-complete.R
any-all-na-complete.Rd
It is useful when exploring data to search for cases where there are any or all instances of missing or complete values. For example, these can help you identify and potentially remove or keep columns in a data frame that are all missing, or all complete.
For the any case, we provide two functions: any_miss and any_complete. Note that any_miss has an alias, any_na. Both any_miss and any_na call anyNA under the hood. any_complete is the complement to any_miss: it returns TRUE if there are any complete values. Note that for a data frame, any_complete looks for complete cases, which are complete rows; this is different from complete variables.
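The difference between complete cases and complete variables can be seen with base R alone. The following is a minimal sketch rather than the package's own code: the data frame `df` is made up for illustration, and only the standard functions `complete.cases()` and `anyNA()` are used.

```r
# A made-up data frame: every column contains a missing value,
# but the third row has none.
df <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, 2, 3)
)

# Complete cases are rows without any missing values.
complete_rows <- complete.cases(df)
any(complete_rows)

# Complete variables are columns without any missing values.
complete_cols <- vapply(df, function(col) !anyNA(col), logical(1))
any(complete_cols)
```

Here one row is complete while no column is, so a row-wise check and a column-wise check give different answers for the same data.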
For the all case, there are two functions: all_miss (also available as all_na) and all_complete.
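As a rough guide to what the "all" checks mean for a vector, the sketch below expresses them in base R. The helper names `all_missing_sketch` and `all_complete_sketch` are hypothetical and are only assumed to mirror the behaviour described here; they are not the package's implementation.

```r
# Hypothetical helpers, for illustration only.
all_missing_sketch <- function(x) all(is.na(x))  # TRUE when every value is NA
all_complete_sketch <- function(x) !anyNA(x)     # TRUE when no value is NA

# A vector with both missing and observed values.
mixture <- c(NA, 1, NA)
all_missing_sketch(mixture)
all_complete_sketch(mixture)
```

A vector with some missing and some observed values fails both checks, so neither "all" check is simply the negation of the other.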
Examples
# for vectors
misses <- c(NA, NA, NA)
complete <- c(1, 2, 3)
mixture <- c(NA, 1, NA)
all_na(misses)
#> [1] TRUE
all_na(complete)
#> [1] FALSE
all_na(mixture)
#> [1] FALSE
all_complete(misses)
#> [1] FALSE
all_complete(complete)
#> [1] TRUE
all_complete(mixture)
#> [1] FALSE
any_na(misses)
#> [1] TRUE
any_na(complete)
#> [1] FALSE
any_na(mixture)
#> [1] TRUE
# for data frames
all_na(airquality)
#> [1] FALSE
# an alias of all_na
all_miss(airquality)
#> [1] FALSE
all_complete(airquality)
#> [1] FALSE
any_na(airquality)
#> [1] TRUE
any_complete(airquality)
#> [1] TRUE
# use in identifying columns with all missing/complete
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
# for printing
aq <- as_tibble(airquality)
aq
#> # A tibble: 153 × 6
#> Ozone Solar.R Wind Temp Month Day
#> <int> <int> <dbl> <int> <int> <int>
#> 1 41 190 7.4 67 5 1
#> 2 36 118 8 72 5 2
#> 3 12 149 12.6 74 5 3
#> 4 18 313 11.5 62 5 4
#> 5 NA NA 14.3 56 5 5
#> 6 28 NA 14.9 66 5 6
#> 7 23 299 8.6 65 5 7
#> 8 19 99 13.8 59 5 8
#> 9 8 19 20.1 61 5 9
#> 10 NA 194 8.6 69 5 10
#> # ℹ 143 more rows
# select variables with all missing values
aq %>% select(where(all_na))
#> # A tibble: 153 × 0
# there are none!
# select columns with any NA values
aq %>% select(where(any_na))
#> # A tibble: 153 × 2
#> Ozone Solar.R
#> <int> <int>
#> 1 41 190
#> 2 36 118
#> 3 12 149
#> 4 18 313
#> 5 NA NA
#> 6 28 NA
#> 7 23 299
#> 8 19 99
#> 9 8 19
#> 10 NA 194
#> # ℹ 143 more rows
# select only columns with all complete data
aq %>% select(where(all_complete))
#> # A tibble: 153 × 4
#> Wind Temp Month Day
#> <dbl> <int> <int> <int>
#> 1 7.4 67 5 1
#> 2 8 72 5 2
#> 3 12.6 74 5 3
#> 4 11.5 62 5 4
#> 5 14.3 56 5 5
#> 6 14.9 66 5 6
#> 7 8.6 65 5 7
#> 8 13.8 59 5 8
#> 9 20.1 61 5 9
#> 10 8.6 69 5 10
#> # ℹ 143 more rows
# select columns where there are any complete cases (all the data)
aq %>% select(where(any_complete))
#> # A tibble: 153 × 6
#> Ozone Solar.R Wind Temp Month Day
#> <int> <int> <dbl> <int> <int> <int>
#> 1 41 190 7.4 67 5 1
#> 2 36 118 8 72 5 2
#> 3 12 149 12.6 74 5 3
#> 4 18 313 11.5 62 5 4
#> 5 NA NA 14.3 56 5 5
#> 6 28 NA 14.9 66 5 6
#> 7 23 299 8.6 65 5 7
#> 8 19 99 13.8 59 5 8
#> 9 8 19 20.1 61 5 9
#> 10 NA 194 8.6 69 5 10
#> # ℹ 143 more rows
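The examples above keep columns that match a condition. Dropping columns that are entirely missing works the same way in reverse. The following base R sketch uses a made-up data frame `df` (airquality has no all-missing column to drop) and is one possible approach under that assumption, not the package's recommended method.

```r
# Made-up data frame with one column that is entirely missing.
df <- data.frame(
  id = 1:3,
  empty = c(NA, NA, NA),
  score = c(2.5, NA, 4)
)

# Flag columns whose values are all NA, then drop the flagged columns.
is_all_missing <- vapply(df, function(col) all(is.na(col)), logical(1))
kept <- df[, !is_all_missing, drop = FALSE]
names(kept)
```

The `drop = FALSE` argument keeps the result a data frame even if only one column remains.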