R data.table remove rows where one column is duplicated if another column is NA

伪装坚强ぢ 2021-01-26 20:30

Here is an example data.table:

library(data.table)

dt <- data.table(col1 = c('A', 'A', 'B', 'C', 'C', 'D'), col2 = c(NA, 'dog', 'cat', 'jeep', 'porsch', NA))

         


        
3 Answers
  •  余生分开走
    2021-01-26 20:53

    You are missing a parenthesis (probably a typo); I assume it should be length(col1) > 1. You also used ifelse on a scalar condition, which will not work as you expect: the result has the same length as the condition, so only the first element of the vector is picked up. If you want to remove the NA values from a group whenever the group also contains non-NA values, use if/else instead:

    dt[, .(col2 = if(all(is.na(col2))) NA_character_ else na.omit(col2)), by = col1]
    
    #   col1   col2
    #1:    A    dog
    #2:    B    cat
    #3:    C   jeep
    #4:    C porsch
    #5:    D     NA
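
To make the ifelse point concrete, here is a minimal sketch in base R. The vector and the variable names (has_only_na, res_ifelse, res_if) are made up for illustration and stand in for the col2 values of group A; they are not from the question.

```r
# Hypothetical stand-in for the col2 values of one group (group A above).
col2 <- c(NA, "dog")

# all() collapses the check to a single TRUE/FALSE value.
has_only_na <- all(is.na(col2))

# ifelse() returns a result as long as its test, so a length-1 test
# yields a length-1 result built from the first element of col2 only.
res_ifelse <- ifelse(has_only_na, NA_character_, col2)

# if/else evaluates the whole branch, so every non-NA value is kept.
res_if <- if (has_only_na) NA_character_ else col2[!is.na(col2)]

print(res_ifelse)
print(res_if)
```

Here res_ifelse has length 1 and holds the leading NA, while res_if holds "dog", which is why the if/else form is the one to use inside j.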
    
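If the real table has more columns than col1 and col2, one possible variant is to filter rows by index rather than rebuild col2. This is only a sketch under that assumption; keep and result are illustrative names. Note one difference from the if/else version: a group whose col2 values are all NA keeps every one of its rows here instead of being collapsed to a single row.

```r
library(data.table)

dt <- data.table(
  col1 = c("A", "A", "B", "C", "C", "D"),
  col2 = c(NA, "dog", "cat", "jeep", "porsch", NA)
)

# .I holds the row numbers of the current group within dt.
# A row is kept when its col2 is present, or when the whole group is NA.
keep <- dt[, .I[!is.na(col2) | all(is.na(col2))], by = col1]$V1

# Subset the original table, so any other columns are carried along.
result <- dt[keep]
print(result)
```

On the example data this keeps the same five rows as the if/else answer.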
