Remove lines with only NAs from data.table

梦谈多话 2020-12-22 06:36

I want to remove the rows of a data.table that contain only NAs.

> tab = data.table(A = c(1, NA, 3), B = c(NA, NA, 3))
> tab
    A  B
1:  1 NA
2: NA NA
3:  3  3

3 Answers
  • 2020-12-22 07:19

    I quite like

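    # Keep row i unless every value in it is NA.
    # seq_len() and vapply() also behave sensibly when tab has zero rows.
    tab <- tab[vapply(seq_len(nrow(tab)), function(i) !all(is.na(tab[i, ])), logical(1))]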
    

    It is intuitive to me, but I'm not sure it is the fastest approach.

    HTH
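
Whether the row-by-row test is the fastest can be checked with a rough timing. The sketch below is only an illustration: the synthetic table `big`, its size, and the share of all-NA rows are assumptions, not data from the question, and the timings will differ between machines.

```r
library(data.table)

# Hypothetical synthetic table: 2,000 rows, roughly a quarter of them all-NA.
set.seed(1)
n <- 2000L
keep <- runif(n) > 0.25
big <- data.table(
  A = ifelse(keep, rnorm(n), NA_real_),
  B = ifelse(keep, rnorm(n), NA_real_)
)

# Row-by-row test, as in the answer above.
t_rowwise <- system.time(
  res_rowwise <- big[vapply(seq_len(nrow(big)),
                            function(i) !all(is.na(big[i, ])),
                            logical(1))]
)

# Vectorised test: keep a row if it has at least one non-NA value.
t_vector <- system.time(
  res_vector <- big[rowSums(!is.na(big)) > 0L]
)

# Both approaches should keep exactly the same rows.
stopifnot(isTRUE(all.equal(res_rowwise, res_vector)))

# Elapsed seconds for each approach (machine dependent).
print(c(rowwise = t_rowwise[["elapsed"]], vectorised = t_vector[["elapsed"]]))
```

The row-by-row version subsets the table once per row, so it is reasonable to expect it to fall behind the vectorised one as the table grows.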

  • 2020-12-22 07:21

    We can use Reduce with is.na and & to flag the rows where every column is NA, then negate that flag:

    tab[!Reduce(`&`, lapply(tab, is.na))]
    #   A  B
    #1: 1 NA
    #2: 3  3
    

    Or a more compact, though less efficient, approach would be:

    tab[rowSums(!is.na(tab)) != 0L]
    

    Also, as suggested in a comment by @Frank, a join-based approach:

    tab[!tab[NA_integer_], on = names(tab)]
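
To see why the Reduce version works, it may help to look at the intermediate values on the example table. This is a minimal sketch; the names `na_by_col`, `all_na`, and the helper `drop_all_na_rows` are made up for illustration and are not part of data.table.

```r
library(data.table)

tab <- data.table(A = c(1, NA, 3), B = c(NA, NA, 3))

# One logical vector per column: TRUE where that column is NA.
na_by_col <- lapply(tab, is.na)
# na_by_col$A is FALSE TRUE FALSE
# na_by_col$B is TRUE  TRUE FALSE

# Combine the columns pairwise with `&`: TRUE only where every column is NA.
all_na <- Reduce(`&`, na_by_col)
# all_na is FALSE TRUE FALSE, so only row 2 gets dropped.

# Hypothetical helper wrapping the same idea for any data.table
# that has at least one column.
drop_all_na_rows <- function(dt) {
  dt[!Reduce(`&`, lapply(dt, is.na))]
}

res <- drop_all_na_rows(tab)
print(res)
```

Because the columns are combined one at a time, no intermediate matrix is built, which is presumably why it is listed ahead of the rowSums variant.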
    
  • 2020-12-22 07:39

    Another idea:

    library(dplyr)
    tab %>% 
      filter(rowSums(is.na(.)) < length(.))
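
The filter condition compares the number of NAs in each row with the number of columns. A small base R sketch of the same test, using a plain data.frame copy of the example (the names `df`, `n_na`, and `keep` are only for illustration):

```r
# Plain data.frame version of the example table from the question.
df <- data.frame(A = c(1, NA, 3), B = c(NA, NA, 3))

# Number of NA values in each row.
n_na <- rowSums(is.na(df))
# n_na is 1 2 0

# length() of a data.frame is its number of columns (2 here),
# so a row is kept when it has fewer NAs than columns.
keep <- n_na < length(df)
# keep is TRUE FALSE TRUE

res <- df[keep, ]
print(res)
```

A row survives as soon as one of its values is not NA, which is the same rule the other answers apply.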
    