Remove columns from a data frame where some of the values are NA

小鲜肉 2020-11-28 08:32

I have a data frame in which some of the values are NA. I would like to remove every column that contains an NA.

My data.frame looks like this:

    v1   v2
1    1   NA
2    1    1
3    2    2
4    1    1
5    2    2
6    1   NA
        
7 answers
  • 2020-11-28 09:10

    A base R method related to the apply answers is

    Itun[!unlist(vapply(Itun, anyNA, logical(1)))]
      v1
    1  1
    2  1
    3  2
    4  1
    5  2
    6  1
    

    Here, vapply is used because we are operating on a list (a data frame is a list of columns) and, unlike apply, it does not coerce the object into a matrix. Also, since we know that each result is a logical vector of length 1, we can declare this to vapply and potentially get a small speed boost. For the same reason, I used anyNA instead of any(is.na()).
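
    To illustrate the point about coercion, here is a minimal sketch on a mixed-type data frame. The data frame `df` and its column names are hypothetical and not part of the question; the idea is only that vapply inspects each column in its own type and that list-style indexing keeps the result a data frame.

    ```r
    # Hypothetical mixed-type data (assumption, not from the question)
    df <- data.frame(id = 1:3,
                     name = c("a", NA, "c"),
                     score = c(1.5, 2.5, 3.5))

    # One logical per column; vapply fails loudly if a result is not logical(1)
    has_na <- vapply(df, anyNA, logical(1))

    # List-style indexing (no comma) always returns a data frame
    clean <- df[!has_na]
    names(clean)
    ```

    Because no comma is used in `df[!has_na]`, the result stays a data frame even when only one column survives.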

  • 2020-11-28 09:12

    You can use transpose twice:

    newdf <- t(na.omit(t(df)))
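
    One thing to keep in mind with this approach: t() returns a matrix, so the result is no longer a data frame, and a data frame with mixed column types would be coerced to a single type on the way. A small sketch, assuming the Itun data from the question and converting back with as.data.frame():

    ```r
    # Data as given in the question
    Itun <- data.frame(v1 = c(1, 1, 2, 1, 2, 1),
                       v2 = c(NA, 1, 2, 1, 2, NA))

    # Transposing twice drops the NA columns but yields a matrix
    m <- t(na.omit(t(Itun)))
    is.matrix(m)   # TRUE

    # Convert back if a data frame is needed downstream
    newdf <- as.data.frame(m)
    ```

    For all-numeric data this round trip is harmless; with character or factor columns one of the column-wise approaches is probably the safer choice.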
    
  • 2020-11-28 09:14

    Here's a convenient way to do it with the dplyr function select_if(). Combining not (!), any() and is.na() selects every column that contains no NA values.

    library(dplyr)
    Itun %>%
        select_if(~ !any(is.na(.)))
    
  • 2020-11-28 09:17

    Alternatively, select(where(~FUNCTION)) can be used (where() requires dplyr 1.0.0 or later):

    library(dplyr)
    
    (df <- data.frame(x = letters[1:5], y = NA, z = c(1:4, NA)))
    #>   x  y  z
    #> 1 a NA  1
    #> 2 b NA  2
    #> 3 c NA  3
    #> 4 d NA  4
    #> 5 e NA NA
    
    # Remove columns where all values are NA
    df %>% 
      select(where(~!all(is.na(.))))
    #>   x  z
    #> 1 a  1
    #> 2 b  2
    #> 3 c  3
    #> 4 d  4
    #> 5 e NA
      
    # Remove columns with at least one NA  
    df %>% 
      select(where(~!any(is.na(.))))
    #>   x
    #> 1 a
    #> 2 b
    #> 3 c
    #> 4 d
    #> 5 e
    
  • 2020-11-28 09:18

    The data:

    Itun <- data.frame(v1 = c(1,1,2,1,2,1), v2 = c(NA, 1, 2, 1, 2, NA)) 
    

    This will remove all columns containing at least one NA:

    Itun[ , colSums(is.na(Itun)) == 0, drop = FALSE]
    

    An alternative way is to use apply:

    Itun[ , apply(Itun, 2, function(x) !any(is.na(x))), drop = FALSE]
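
    A caveat with matrix-style indexing (the form with a comma): when only one column survives, as happens with this data, `[` simplifies the result to a plain vector unless drop = FALSE is given. A short sketch of the difference, using the same Itun data; the names `keep`, `as_vector` and `as_df` are only illustrative:

    ```r
    Itun <- data.frame(v1 = c(1, 1, 2, 1, 2, 1),
                       v2 = c(NA, 1, 2, 1, 2, NA))

    # TRUE for columns without any NA
    keep <- colSums(is.na(Itun)) == 0

    # A single surviving column is simplified to a vector
    as_vector <- Itun[, keep]
    is.data.frame(as_vector)   # FALSE

    # drop = FALSE keeps the data frame structure
    as_df <- Itun[, keep, drop = FALSE]
    is.data.frame(as_df)       # TRUE
    ```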
    
  • 2020-11-28 09:26

    Another base R alternative is the Filter function, which keeps the list elements (here, columns) for which the predicate is TRUE:

    Filter(function(x) !any(is.na(x)), Itun)
    

    With data.table, it is a little more cumbersome:

    library(data.table)
    setDT(Itun)[, .SD, .SDcols = setdiff(seq_len(ncol(Itun)),
                                         which(colSums(is.na(Itun)) > 0))]
    