This vector contains common values used to denote NA (missing) and is
intended for use inside the naniar functions miss_scan_count() and
replace_with_na(). The current list of strings can be found by printing
common_na_strings. It is a useful way to explore your data for possible
missing values, but I strongly warn against using it to replace values with
NA without very carefully looking at the incidence of each case. Please note
that common_na_strings escapes the "?", "." and "*" characters with \\ to
protect against their wildcard behaviour in grep. Common NA numbers are in
the data object common_na_numbers.
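To illustrate why the escaping matters, here is a minimal base R sketch. It does not call naniar; the vector x is made-up data, and grepl() stands in for whatever matching the package performs internally, which is an assumption based on the grep remark above.

```r
# Made-up illustration data (not from naniar).
x <- c("3.5", "?", "unknown", ".", "N/A")

# An unescaped "." is a regex wildcard: it matches any string that has
# at least one character, so every element is flagged.
wildcard_hits <- grepl(".", x)

# The escaped form "\\." matches only strings containing a literal full stop.
literal_hits <- grepl("\\.", x)

# fixed = TRUE is another way to ask for literal (non-regex) matching.
fixed_hits <- grepl(".", x, fixed = TRUE)

print(wildcard_hits)
print(literal_hits)
print(fixed_hits)
```

Only "3.5" and "." are flagged by the escaped and fixed versions, whereas the unescaped pattern flags all five values.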
Note
Original discussion: https://github.com/njtierney/naniar/issues/168
Examples
library(naniar)

dat_ms <- tibble::tribble(~x,  ~y,    ~z,
                          1,   "A",   -100,
                          3,   "N/A", -99,
                          NA,  NA,    -98,
                          -99, "E",   -101,
                          -98, "F",   -1)
miss_scan_count(dat_ms, -99)
#> # A tibble: 3 × 2
#>   Variable     n
#>   <chr>    <int>
#> 1 x            1
#> 2 y            0
#> 3 z            1
miss_scan_count(dat_ms, c("-99","-98","N/A"))
#> # A tibble: 3 × 2
#>   Variable     n
#>   <chr>    <int>
#> 1 x            2
#> 2 y            1
#> 3 z            2
common_na_strings
#> [1] "missing" "NA" "N A" "N/A" "#N/A" "NA " " NA"
#> [8] "N /A" "N / A" " N / A" "N / A " "na" "n a" "n/a"
#> [15] "na " " na" "n /a" "n / a" " a / a" "n / a " "NULL"
#> [22] "null" "" "\\?" "\\*" "\\."
miss_scan_count(dat_ms, common_na_strings)
#> # A tibble: 3 × 2
#>   Variable     n
#>   <chr>    <int>
#> 1 x            4
#> 2 y            4
#> 3 z            5
replace_with_na(dat_ms, replace = list(y = common_na_strings))
#> # A tibble: 5 × 3
#>       x y         z
#>   <dbl> <chr> <dbl>
#> 1     1 A      -100
#> 2     3 NA      -99
#> 3    NA NA      -98
#> 4   -99 E      -101
#> 5   -98 F        -1
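
The advice to look carefully at the incidence of each case before replacing can be sketched in base R as follows. The helper count_exact_matches() is hypothetical and not part of naniar; it counts exact (non-regex) matches, and the vector y simply mirrors the y column of dat_ms so the snippet runs on its own.

```r
# Hypothetical helper, not part of naniar: for one column, count how many
# values are exactly equal to each candidate string (no regex involved).
count_exact_matches <- function(column, candidates) {
  values <- as.character(column)
  vapply(candidates,
         function(s) sum(values == s, na.rm = TRUE),
         integer(1))
}

# Mirrors the y column of dat_ms above.
y <- c("A", "N/A", NA, "E", "F")

# A few candidate strings chosen for illustration.
hits <- count_exact_matches(y, c("N/A", "na", "NULL", ""))

# Keep only the candidates that actually occur in the column.
print(hits[hits > 0])
```

Reviewing a per-string tally like this before calling replace_with_na() makes it easier to spot candidates that match more, or fewer, values than expected.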