Why does dplyr recode generate error when recoding to NA but not NaN

Posted by 风格不统一 on 2020-12-12 14:37:51

Question


I'm recoding with dplyr. I'm getting an error when I recode a value to NA, but not NaN. Here's an example:

df <- df %>% mutate(var=recode(var,`2`=0,`3`=NaN))

Works fine, whereas

df <- df %>% mutate(var=recode(var,`2`=0,`3`=NA))

gives me the following error:

Error: Vector 2 must be a double vector, not a logical vector

Answer 1:


Running the code reproduces the error:

library(dplyr)

tibble(var = rep(2:3, 4)) %>%
  mutate(var = recode(var, `2` = 0, `3` = NA))
# Error: Vector 2 must be a double vector, not a logical vector

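This is because NA on its own is logical, while recode expects a double here to match the other replacement, 0. NaN is already a double, which is why the first version works: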

class(NA)
# [1] "logical"

You can use NA_real_ instead, since that's a double

class(NA_real_)
# [1] "numeric"
is.double(NA_real_)
# [1] TRUE

tibble(var = rep(2:3, 4)) %>% 
 mutate(var=recode(var,`2`=0,`3`=NA_real_)) 
#     var
#   <dbl>
# 1     0
# 2    NA
# 3     0
# 4    NA
# 5     0
# 6    NA
# 7     0
# 8    NA
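To see why NaN goes through while a bare NA does not, it can help to compare the types directly. The following is a minimal sketch using base R only; the names candidates and na_types are made up for illustration.

```r
# Values that might be used as replacements in recode()
candidates <- list(
  "NA"            = NA,
  "NaN"           = NaN,
  "0"             = 0,
  "NA_real_"      = NA_real_,
  "NA_integer_"   = NA_integer_,
  "NA_character_" = NA_character_
)

# Look up the storage type of each candidate
na_types <- vapply(candidates, typeof, character(1))
print(na_types)
# NA is "logical"; NaN, 0 and NA_real_ are all "double";
# NA_integer_ is "integer"; NA_character_ is "character"
```

Because NaN and 0 share the type double, they satisfy the "same type" rule, whereas a bare NA does not. The other typed constants are the ones to reach for when the replacements are integers or strings.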

As for why it expects a double, see ?recode:

All replacements must be the same type, and must have either length one or the same length as .x.

I think this is unexpected because base functions like c don't require their elements to have the same type; they simply coerce everything up to the highest type present. So this works:

c(1, NA, 3)

Because for the c function:

The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression
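The coercion hierarchy quoted above can be checked with a few base R calls. This is only an illustrative sketch; the variable names are arbitrary.

```r
# A logical NA combined with doubles is promoted to double
mixed <- c(1, NA, 3)
mixed_type <- typeof(mixed)        # "double"
mixed_na   <- is.na(mixed)         # FALSE TRUE FALSE

# logical < integer: TRUE becomes 1L
int_type <- typeof(c(TRUE, 2L))    # "integer"

# double < character: 1 becomes "1"
chr_type <- typeof(c(1, "a"))      # "character"

print(c(mixed_type, int_type, chr_type))
```

recode does not perform this kind of silent promotion on its replacements, which is why the typed constant has to be supplied explicitly.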




Answer 2:


An option to change a specific value to NA is na_if

library(dplyr)
df %>% 
   mutate(var = na_if(var, 3))

With recode, @IceCreamToucan's answer is great. If you want the NA to adapt automatically to an integer or numeric column, you can rely on arithmetic with NA: multiplying NA by an element of var still returns NA, but with the type of var.

df %>% 
    mutate(var = recode(var,`2`=0,`3`=NA* var[!is.na(var)][1]))
#    var
#1   0
#2  NA
#3   4
#4   5
#5  NA

It can be other functions as well which return NA

df %>%
      mutate(var = recode(var,`2`=0,`3`= max(var[1], NA)))
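Why these tricks work can be seen outside of recode. The sketch below uses base R only and assumes two hypothetical vectors, int_var and dbl_var, standing in for an integer and a numeric column.

```r
int_var <- c(2L, 3L, 4L)   # stand-in for an integer column
dbl_var <- c(2, 3, 4)      # stand-in for a numeric (double) column

# Multiplying the logical NA by the first non-missing element
# gives an NA that takes on the type of that element
na_from_int <- NA * int_var[!is.na(int_var)][1]
na_from_dbl <- NA * dbl_var[!is.na(dbl_var)][1]

# max() with an NA argument also returns an NA of the matching type
na_max_int <- max(int_var[1], NA)
na_max_dbl <- max(dbl_var[1], NA)

print(c(typeof(na_from_int), typeof(na_from_dbl),
        typeof(na_max_int), typeof(na_max_dbl)))
# "integer" "double" "integer" "double"
```

Note that the NA only follows the type of var; the other replacements passed to recode (such as 0, a double) still need to agree with it.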

data

df <- data.frame(var = c(2, 3, 4, 5, 3))


Source: https://stackoverflow.com/questions/57512107/why-does-dplyr-recode-generate-error-when-recoding-to-na-but-not-nan
