I would like to count the rows of a data frame according to the number of variables that are missing in each row. For example, for the data frame below I would like the code to return:
table(rowSums(is.na(dt)))
#0 1 2 3
#3 5 1 1
If you really need the trailing 0 (rows with four NAs):
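# Shift the NA counts 0..ncol(dt) to bins 1..(ncol(dt) + 1); going through factor()
# would misalign the bins whenever some intermediate count has no rows.
tabulate(rowSums(is.na(dt)) + 1L, nbins = ncol(dt) + 1)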
#[1] 3 5 1 1 0
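If the counts should stay labelled, one option is to fix the factor levels up front so that counts with no rows still show up as 0. A minimal sketch, assuming the same example data (rebuilt here with data.frame() so the snippet runs on its own); na_per_row and counts are illustrative names:

```r
# Rebuild the example data so the snippet is self-contained.
dt <- data.frame(
  v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
  v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
  v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
  v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
)

# Number of NAs in each row.
na_per_row <- rowSums(is.na(dt))

# Explicit levels 0..ncol(dt) keep every possible count in the table,
# including counts that never occur in the data.
counts <- table(factor(na_per_row, levels = 0:ncol(dt)))
counts
#0 1 2 3 4
#3 5 1 1 0
```

Because the levels are set explicitly, each count keeps its own label even when some number of NAs never occurs.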
A more tidyverse-y way of doing this is:
library(tidyverse)
dt <- structure(
  list(
    v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
    v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
    v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
    v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
  ),
  .Names = c("v1", "v2", "v3", "v4"), row.names = c(NA, -10L), class = "data.frame"
)
dt <- as_tibble(dt)
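For the row-counting question itself, a dplyr version might look like the following. This is a sketch rather than the only way to do it: it assumes dplyr is installed, rebuilds the example data so it runs on its own, and the names n_missing and na_counts are made up for illustration.

```r
library(dplyr)

# Rebuild the example data so the snippet is self-contained.
dt <- data.frame(
  v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
  v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
  v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
  v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
)

na_counts <- dt %>%
  mutate(n_missing = rowSums(is.na(dt))) %>%  # number of NAs in each row
  count(n_missing)                            # number of rows per NA count

# n_missing takes the values 0, 1, 2, 3 with n equal to 3, 5, 1, 1.
na_counts
```

Unlike the tabulate() version, count() only reports NA counts that actually occur, so there is no row for four NAs here.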
Using drop_na() from tidyr, the tidyverse way to drop every row that contains an NA:
dt %>%
  drop_na()
Keeping only complete rows with complete.cases() from the stats package:
dt %>%
  filter(complete.cases(v1, v2, v3, v4))
Using na.omit() (from stats, not a tidyverse function):
dt %>%
  na.omit()
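All three approaches should keep the same rows for this data. A quick sketch to check that, assuming dplyr and tidyr are installed; the kept_* names are illustrative only:

```r
library(dplyr)
library(tidyr)

# Rebuild the example data so the snippet is self-contained.
dt <- data.frame(
  v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
  v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
  v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
  v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
)

kept_drop_na  <- dt %>% drop_na()
kept_complete <- dt %>% filter(complete.cases(v1, v2, v3, v4))
kept_na_omit  <- dt %>% na.omit()

# Only rows 1, 4 and 9 have no NA, so each result has 3 rows.
nrow(kept_drop_na)   # 3
nrow(kept_complete)  # 3
nrow(kept_na_omit)   # 3
```

One difference worth knowing: na.omit() records the removed rows in an "na.action" attribute on its result, so the three objects are not necessarily identical() even though they hold the same rows.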