Counting observations according to the number of variables missing


I would like to count the rows of a data frame according to how many variables are missing in each row. So, for example, in the data frame below I would like the code to return


2 Answers

孤城傲影

2021-01-15 06:58

table(rowSums(is.na(dt)))
#0 1 2 3 
#3 5 1 1

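If you also need the trailing 0 (rows with all four variables missing):

# add 1 because tabulate() bins start at 1; going through factor() would shift the bins whenever some count never occurs
tabulate(rowSums(is.na(dt)) + 1L, nbins = ncol(dt) + 1L)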
#[1] 3 5 1 1 0
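If the labels should be kept alongside the zero count, one possible variant is to fix the factor levels before calling table(). This is a minimal sketch rather than part of the original answer; it rebuilds the example data from the second answer with data.frame(), and the names n_missing and counts are chosen here for illustration.

```r
# Example data: 10 rows, 4 variables (same values as in the second answer)
dt <- data.frame(
  v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
  v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
  v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
  v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
)

# Number of missing variables in each row
n_missing <- rowSums(is.na(dt))

# Fixing the levels to 0..ncol(dt) makes table() report counts that are zero
counts <- table(factor(n_missing, levels = 0:ncol(dt)))
counts
# 0 1 2 3 4
# 3 5 1 1 0
```

Because the levels are set explicitly, every possible count from 0 to ncol(dt) gets a labelled entry, even when no row has that many missing values.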


后悔当初

2021-01-15 07:00

A more tidyverse-y way of doing this is:

library(tidyverse)

dt <- structure(list(v1 = c(1, NA, 1 , 1, NA, NA, 1 , NA, 1, 1 ), 
                     v2 = c(1, NA, 1 , 1, 1 , 1 , 1 , 1 , 1, NA), 
                     v3 = c(1, 1 , NA, 1, 1 , 1 , 1 , 1 , 1, NA), 
                     v4 = c(1, 1 , 1 , 1, 1 ,  1, NA, 1 , 1, NA)
                     ),
                 .Names = c("v1", "v2", "v3", "v4"), row.names = c(NA, -10L), class = "data.frame")

dt <- as_tibble(dt)

Using drop_na() from tidyr, the tidyverse way, to keep only the rows with no missing values:

dt %>% 
  drop_na()

Keeping only the complete rows with complete.cases() from the stats package:

dt %>% 
  filter(complete.cases(v1, v2, v3, v4))

Using na.omit() (from stats, not a tidyverse function):

dt %>% 
  na.omit()
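The three snippets above keep only the complete rows. For the tally the question asks about, a dplyr-style sketch might look like the following. It is an illustration rather than the answerer's own code, and the names n_missing and missing_counts are assumptions made for this example.

```r
library(dplyr)

# Same example data as above, built directly as a tibble
dt <- tibble(
  v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
  v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
  v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
  v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
)

missing_counts <- dt %>%
  # `.` is the piped-in tibble; rowSums() counts the NAs in each row
  mutate(n_missing = rowSums(is.na(.))) %>%
  # one output row per distinct number of missing variables
  count(n_missing)

missing_counts
# n_missing: 0 1 2 3
# n:         3 5 1 1
```

Unlike the fixed-level table() approach, count() only lists values that actually occur, so a count of four missing variables does not appear here.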
