Recode multiple columns using dplyr

人走茶凉 提交于 2019-12-23 13:08:52

问题


I had a dataframe where I recoded several columns so that 999 was set to NA

dfB <-dfA %>%
  mutate(adhere = if_else(adhere==999, as.numeric(NA), adhere)) %>%
  mutate(engage = if_else(engage==999, as.numeric(NA), engage)) %>%
  mutate(quality = if_else(quality==999, as.numeric(NA), quality)) %>%
  mutate(undrstnd = if_else(undrstnd==999, as.numeric(NA), undrstnd)) %>%
  mutate(sesspart = if_else(sesspart==999, as.numeric(NA), sesspart)) %>%
  mutate(attended = if_else(attended>=9, as.integer(NA), attended))

I want to use mutate_at() and a range of columns and recode() instead of if_else(), but I am stuck on how to give it the condition. I think something like 999 = NA based on some mutate_all examples -- but I also need the NA to match the type of .x and I am unsure how to get it to be type sensitive

I tried:

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))
z <- y %>%
    mutate_at( vars(y1:y2), funs(recode(.,`999` = as.numeric(NA))))

But I get a warning "Unreplaced values treated as NA as .x is not compatible. Please specify replacements exhaustively or supply .default " and I can see that it worded for the numeric column, but not for the integer column y2"

> z
  y1 y2    y3
1  1 NA  TRUE
2  2 NA  TRUE
3 NA NA FALSE
4  3 NA FALSE
5  4 NA  TRUE

回答1:


I'm having trouble understanding exactly what you want to accomplish, so let me know if this isn't quite it.


library(tidyverse)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

y

#>    y1  y2    y3
#> 1   1   1  TRUE
#> 2   2   2  TRUE
#> 3 999 999 FALSE
#> 4   3   3 FALSE
#> 5   4   4  TRUE

z <- y %>%
  mutate_at(.vars = vars(y1:y2), 
            .funs = ~ ifelse(. == 999, NA, .))

z

#>   y1 y2    y3
#> 1  1  1  TRUE
#> 2  2  2  TRUE
#> 3 NA NA FALSE
#> 4  3  3 FALSE
#> 5  4  4  TRUE



回答2:


I think it is related the column type. I added mutate_if to convert all integer columns to numeric, and then set the recode value to be NA_real_. It seems working.

library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), funs(recode(.,`999` = NA_real_)))
z
#   y1 y2    y3
# 1  1  1  TRUE
# 2  2  2  TRUE
# 3 NA NA FALSE
# 4  3  3 FALSE
# 5  4  4  TRUE



回答3:


Now that funs has been depreciated in dplyr, here's the new way to go:

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), list(~recode(.,`999` = NA_real_)))

Replace funs with list and insert a ~ before recode.



来源:https://stackoverflow.com/questions/47521920/recode-multiple-columns-using-dplyr

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!