Filtering a data frame based on NA in multiple columns

南笙 2020-12-17 20:21

I have the following data frame, let's call it df, with these observations:

id   type   company
1    NA      NA
2    NA      ADM
3    Nort         


        
4 Answers
  • 2020-12-17 20:54

    You need the AND operator (&), not OR (|). I also strongly suggest the tidyverse approach: use the dplyr function filter() together with the pipe operator %>% (which dplyr re-exports from magrittr):

    library(dplyr)
    df_not_na <- df %>% filter(!is.na(company) & !is.na(type))
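
    To see why & is needed rather than |, here is a minimal base R sketch. The sample data frame is hypothetical (the question's full data is not shown), and the names keep_and and keep_or are introduced only for illustration.

    ```r
    # Hypothetical sample data; the question's full data frame is not shown.
    df <- data.frame(
      id      = c(1, 2, 3, 4, NA),
      type    = c(NA, NA, "North", "South", "North"),
      company = c(NA, "ADM", "Alex", NA, "BDA"),
      stringsAsFactors = FALSE
    )

    # TRUE only where BOTH columns are non-NA.
    keep_and <- !is.na(df$type) & !is.na(df$company)

    # TRUE where AT LEAST ONE column is non-NA, so partly missing rows survive.
    keep_or <- !is.na(df$type) | !is.na(df$company)

    both_present   <- df[keep_and, ]  # rows 3 and 5
    either_present <- df[keep_or, ]   # rows 2, 3, 4 and 5
    ```

    With |, a row is dropped only when both columns are NA, which is why rows with a single missing value still appear in the result.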
    
  • 2020-12-17 20:57

    With dplyr, you can also use the filter_at() function:

    library(dplyr)
    df_non_na <- df %>% filter_at(vars(type,company),all_vars(!is.na(.)))
    

    all_vars(!is.na(.)) means that all the variables listed need to be not NA.

    If you want to keep rows that have a non-NA value in at least one of those columns, you could do:

    df_non_na <- df %>% filter_at(vars(type,company),any_vars(!is.na(.)))
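
    In recent dplyr releases the scoped verbs such as filter_at() are superseded. A sketch of the same two filters using if_all() and if_any(), assuming dplyr 1.0.4 or later is installed; the sample data is hypothetical:

    ```r
    library(dplyr)

    # Hypothetical sample data; the question's full data frame is not shown.
    df <- data.frame(
      id      = 1:4,
      type    = c(NA, NA, "North", "South"),
      company = c(NA, "ADM", "Alex", NA),
      stringsAsFactors = FALSE
    )

    # Counterpart of all_vars(): every listed column must be non-NA.
    all_present <- df %>% filter(if_all(c(type, company), ~ !is.na(.x)))

    # Counterpart of any_vars(): at least one listed column must be non-NA.
    any_present <- df %>% filter(if_any(c(type, company), ~ !is.na(.x)))
    ```

    On this data, all_present keeps only id 3, while any_present keeps ids 2, 3 and 4.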
    
  • 2020-12-17 20:59

    We can compute a logical index for each column, combine the two with &, and use the result to subset the rows.

    df1[!is.na(df1$type) & !is.na(df1$company),]
    # id  type company
    #3  3 North    Alex
    #5 NA North     BDA
    

    Or use rowSums on the logical matrix is.na(df1[-1]), where df1[-1] drops the first column (id), and keep the rows whose NA count is zero.

    df1[!rowSums(is.na(df1[-1])),]
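
    The rowSums one-liner can be unpacked step by step. This is a sketch on a hypothetical df1 chosen to be consistent with the output shown above; na_matrix and na_count are illustrative names.

    ```r
    # Hypothetical data, consistent with the output printed above.
    df1 <- data.frame(
      id      = c(1, 2, 3, 4, NA),
      type    = c(NA, NA, "North", "South", "North"),
      company = c(NA, "ADM", "Alex", NA, "BDA"),
      stringsAsFactors = FALSE
    )

    # Logical matrix: TRUE where type or company is NA (id column dropped).
    na_matrix <- is.na(df1[-1])

    # Number of NAs per row across those two columns.
    na_count <- rowSums(na_matrix)

    # !rowSums(...) is TRUE exactly where the count is 0.
    result <- df1[na_count == 0, ]
    ```

    Because the id column is excluded from the check, the row whose id is NA is still kept.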
    
  • 2020-12-17 20:59

    You can use na.omit(), which drops every row that has an NA in any column, not only in type and company:

    na.omit(data_frame_name)
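
    A short sketch of how na.omit() differs from checking only the two columns of interest. The data is hypothetical, and complete.cases() is a base R alternative brought in here for comparison, not something the answer itself proposes.

    ```r
    # Hypothetical data; note the last row has an NA only in id.
    df1 <- data.frame(
      id      = c(1, 2, 3, 4, NA),
      type    = c(NA, NA, "North", "South", "North"),
      company = c(NA, "ADM", "Alex", NA, "BDA"),
      stringsAsFactors = FALSE
    )

    # Drops any row with an NA anywhere, including the NA in id.
    dropped_any <- na.omit(df1)

    # Checks only type and company, so the row with a missing id is kept.
    dropped_sel <- df1[complete.cases(df1[c("type", "company")]), ]
    ```

    Here dropped_any keeps a single row, while dropped_sel keeps two, so na.omit() matches the other answers only when the remaining columns contain no NA.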
    