Efficient method to subset drop rows with NA values in R

一整个雨季 2021-02-03 12:51

Background: Before running a stepwise model selection, I need to remove missing values for any of my model terms. With quite a few terms in my model, there are t

3 Answers
  •  情话喂你
    2021-02-03 13:53

    Edit: I completely overlooked subset(), the built-in function made for subsetting:

    my.df <- subset(my.df, 
      !(is.na(termA) |
        is.na(termB) |
        is.na(termC) )
      )
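To see what that call keeps, here is a small sketch on made-up data. Only the column names termA, termB and termC come from the answer; the values and the extra column y are assumptions for illustration.

```r
# Hypothetical toy data; only the names termA/termB/termC come from the answer.
my.df <- data.frame(
  y     = 1:5,
  termA = c(1, NA, 3, 4, 5),
  termB = c("a", "b", NA, "d", "e"),
  termC = c(TRUE, FALSE, TRUE, NA, TRUE)
)

# Keep only rows where none of the three terms is missing.
kept <- subset(my.df, !(is.na(termA) | is.na(termB) | is.na(termC)))

kept$y      # rows with y = 1 and y = 5 survive
nrow(kept)  # 2
```

Rows 2, 3 and 4 each have one missing term, so they are dropped; the columns themselves are left untouched.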
    

    I tend to use with() for things like this. Don't use attach(); you're bound to cut yourself.

    my.df <- my.df[with(my.df, {
      !(is.na(termA) |
        is.na(termB) |
        is.na(termC) )
    }), ]
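When the model has many terms, typing one is.na() per column gets tedious. A possible alternative, not part of the answer itself, is base R's complete.cases() applied to just the model columns. The vector name `terms` and the toy data are assumptions.

```r
# Hypothetical toy data with the three terms from the answer.
my.df <- data.frame(
  y     = 1:4,
  termA = c(1, NA, 3, 4),
  termB = c(NA, 2, 3, 4),
  termC = c(1, 2, 3, NA)
)

# Hypothetical character vector naming the model terms to check.
terms <- c("termA", "termB", "termC")

# complete.cases() is TRUE for rows with no NA in the selected columns.
keep <- complete.cases(my.df[, terms, drop = FALSE])
my.df.complete <- my.df[keep, ]

# The mask matches the with() version above, term by term.
same <- all(keep == with(my.df, !(is.na(termA) | is.na(termB) | is.na(termC))))
```

Restricting to `terms` matters: calling complete.cases() on the whole data frame would also drop rows that are missing only in columns the model never uses.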
    

    If you do this often, you might also want a helper function, is_any(), which returns TRUE for each value that is not NA:

    is_any <- function(x){
      # TRUE where x is present (not NA)
      !is.na(x)
    }
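A sketch of how the helper might be used. Because it returns TRUE for values that are present, the per-column checks are combined with & rather than |. The toy data and the `terms` vector are assumptions.

```r
# Helper as defined in the answer: TRUE where a value is present.
is_any <- function(x){
  !is.na(x)
}

# Hypothetical toy data.
my.df <- data.frame(
  termA = c(1, NA, 3),
  termB = c("a", "b", NA),
  termC = c(TRUE, FALSE, TRUE)
)

# A row is kept only if every term is present.
kept <- my.df[with(my.df, is_any(termA) & is_any(termB) & is_any(termC)), ]

# For many terms: apply the helper to each column and AND the results.
terms <- c("termA", "termB", "termC")  # hypothetical list of model terms
mask  <- Reduce(`&`, lapply(my.df[terms], is_any))
```

Here only the first row survives, and `mask` can be reused as `my.df[mask, ]`.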
    

    If you end up doing a lot of this sort of thing, SQL is often a nicer way to work with subsets of data. dplyr may also prove useful.
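For the dplyr route, one way it could look is sketched here. This assumes dplyr 1.0.4 or later is installed (for if_all()); the toy data and the `terms` vector are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data.
my.df <- data.frame(
  y     = 1:4,
  termA = c(1, NA, 3, 4),
  termB = c("a", "b", NA, "d"),
  termC = c(TRUE, FALSE, TRUE, TRUE)
)

# Hypothetical character vector naming the model terms to check.
terms <- c("termA", "termB", "termC")

# Keep rows where every listed term is non-missing.
kept <- my.df %>%
  filter(if_all(all_of(terms), ~ !is.na(.x)))
```

With only a few terms, plain `filter(my.df, !is.na(termA), !is.na(termB), !is.na(termC))` does the same job without the helper verbs.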
