Counting observations according to the number of variables missing

鱼传尺愫 2021-01-15 06:43

I would like to count the rows of a data frame according to the number of variables that are missing in each row. So, for example, in the data frame below I would like the code to return

2 Answers
  • 2021-01-15 06:58
    table(rowSums(is.na(dt)))
    #0 1 2 3 
    #3 5 1 1 
    

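    If you also need the trailing 0 (no row has all four variables missing):

    # Shift each count by 1 so that 0 NAs lands in bin 1, 1 NA in bin 2, and so on;
    # unlike factor(), this stays aligned even when some count never occurs
    tabulate(rowSums(is.na(dt)) + 1, nbins = ncol(dt) + 1)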
    #[1] 3 5 1 1 0
    
  • 2021-01-15 07:00

    A more tidyverse-y way of doing this is:

    library(tidyverse)
    
    dt <- structure(list(v1 = c(1, NA, 1 , 1, NA, NA, 1 , NA, 1, 1 ), 
                         v2 = c(1, NA, 1 , 1, 1 , 1 , 1 , 1 , 1, NA), 
                         v3 = c(1, 1 , NA, 1, 1 , 1 , 1 , 1 , 1, NA), 
                         v4 = c(1, 1 , 1 , 1, 1 ,  1, NA, 1 , 1, NA)
                         ),
                     .Names = c("v1", "v2", "v3", "v4"), row.names = c(NA, -10L), class = "data.frame")
    
    dt <- as_tibble(dt)
    

    Using drop_na() (from tidyr) to keep only the rows with no missing values:

    dt %>% 
      drop_na()
    

    Keeping only the complete cases (rows) with complete.cases() from the stats package:

    dt %>% 
      filter(complete.cases(v1, v2, v3, v4))
    

    Using na.omit() (from the stats package, not a tidyverse function):

    dt %>% 
      na.omit()
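
    The three snippets above keep only the complete rows; they do not yet count rows by the number of missing values. A minimal sketch of one way to get that count with dplyr, assuming the same example data. The column names n_missing and n_rows are illustrative choices, not part of the original answer:

    ```r
    library(dplyr)

    # Same example data as above, rebuilt so the snippet runs on its own
    dt <- tibble(
      v1 = c(1, NA, 1, 1, NA, NA, 1, NA, 1, 1),
      v2 = c(1, NA, 1, 1, 1, 1, 1, 1, 1, NA),
      v3 = c(1, 1, NA, 1, 1, 1, 1, 1, 1, NA),
      v4 = c(1, 1, 1, 1, 1, 1, NA, 1, 1, NA)
    )

    na_counts <- dt %>%
      # Number of NA values in each row (hypothetical column name)
      mutate(n_missing = rowSums(is.na(.))) %>%
      # Number of rows sharing each NA count
      count(n_missing, name = "n_rows")

    na_counts
    ```

    The counts should agree with the table() result in the first answer: 3, 5, 1 and 1 rows for 0 to 3 missing values. A count of zero, such as for four missing values, does not appear, because count() only reports values that actually occur.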
    