Access the column names in `mutate_at` to use them for subsetting a list

拟墨画扇 submitted on 2021-02-09 13:57:29

Question


I am trying to recode several variables, each with a different recoding scheme. The schemes are saved in a list where each element is a named vector of the form old = new. Each element is the recoding scheme for one variable in the data frame.

I am using the mutate_at function together with recode.

I think the problem is that I cannot extract the variable name, which I need in order to get the correct recoding scheme from the list.

I tried deparse(substitute(.)), as suggested in another answer, but that didn't help either.

I also saw that the name of the column being passed can be extracted with tidy evaluation, but again I failed to implement it (that example also uses the deprecated `funs`).

Lastly, I am hoping that this is the correct approach to recoding the variables (i.e. using this recode list inside the mutate). If there is a totally different way to approach this multiple recoding, please let me know.

library(dplyr)
# dplyr version 0.8.5

df <- 
  tibble(
    var1 = c("A", "A", "B", "C"),
    var2 = c("X", "Y", "Z", "Z")
  )

recode_list <- 
  list(

    var1 = c(A = 1, B = 2, C = 3),
    var2 = c(X = 0, Y = -1, Z = 1)
  )

recode_list
#> $var1
#> A B C 
#> 1 2 3 
#> 
#> $var2
#>  X  Y  Z 
#>  0 -1  1

I am using the dplyr::recode function.


# recoding works fine when doing it one variable at a time
df %>% 
  mutate(
    var1 = recode(var1, !!!recode_list[["var1"]]),
    var2 = recode(var2, !!!recode_list[["var2"]])
  )
#> # A tibble: 4 x 2
#>    var1  var2
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1    -1
#> 3     2     1
#> 4     3     1

When I try to apply a function to do this for all variables, it seems to fail

# this does not work.
df %>%
  mutate_at(vars(var1, var2), ~{

    var_name <- rlang::quo_name(quo(.))

    recode(., !!!recode_list[[var_name]])
  }
  )
#> Error in expr_interp(f): object 'var_name' not found

I also tried rlang::as_name and rlang::as_label but I think I cannot really capture the name of the variable as a string to use it to subset the recode_list.


df %>%
  mutate_at(vars(var1, var2), ~ {
    var_name <- rlang::as_name(quo(.))
    print(var_name)
    #recode(., !!!recode_list[["var2"]])
  }
  )
#> [1] "."
#> [1] "."
#> # A tibble: 4 x 2
#>   var1  var2 
#>   <chr> <chr>
#> 1 .     .    
#> 2 .     .    
#> 3 .     .    
#> 4 .     .


Created on 2020-04-30 by the reprex package (v0.3.0)

Answer 1:


Does this work for you?

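library(dplyr)
library(rlang)
library(magrittr)  # provides the %<>% compound assignment pipe

df %>% 
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # enquo(x) captures the column symbol; as_label() turns it into a string
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })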
#> # A tibble: 4 x 2
#>    var1  var2
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1    -1
#> 3     2     1
#> 4     3     1

I suspect this works, whereas placing the subsetted recode_list directly inside recode does not, because enquo delays evaluation of x until the assignment with %<>%. The !!! operator can then splice a value that has already been evaluated.
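
To make the name capture concrete, here is a minimal sketch outside of any data frame. The helper names name_via_enquo, name_via_substitute and name_via_dot are hypothetical and only for illustration. The sketch assumes that mutate_at calls the supplied function with the column symbol as its argument, which is what the output above suggests for dplyr 0.8.5.

```r
library(rlang)

# Hypothetical helpers, not part of dplyr or rlang.
# Both return the expression supplied as `x`, converted to a string.
name_via_enquo <- function(x) as_label(enquo(x))
name_via_substitute <- function(x) as_label(substitute(x))

# `var1` is never evaluated here, so it does not need to exist.
captured_enquo <- name_via_enquo(var1)
captured_substitute <- name_via_substitute(var1)

# By contrast, quoting the literal placeholder `.` only ever yields ".",
# which matches the "." printed in the question.
name_via_dot <- as_label(quo(.))

print(captured_enquo)
print(captured_substitute)
print(name_via_dot)
```

With an ordinary function(x), the argument expression can be inspected; with the ~ shorthand, quo(.) quotes the placeholder itself rather than whatever the placeholder stands for.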

Edit

Your approach with rlang also works with some modifications:

library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
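
For readers on dplyr 1.0.0 or later, where the scoped verbs such as mutate_at are superseded by across(), the following is a minimal sketch of the same idea. It relies on cur_column(), which returns the name of the column currently being processed as a string, so no expression capture is needed. As an assumption of this sketch, and not the approach used above, recode is swapped for plain named-vector indexing; values that are absent from a scheme become NA.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across() and cur_column()

df <- tibble(
  var1 = c("A", "A", "B", "C"),
  var2 = c("X", "Y", "Z", "Z")
)

recode_list <- list(
  var1 = c(A = 1, B = 2, C = 3),
  var2 = c(X = 0, Y = -1, Z = 1)
)

recoded <- df %>%
  mutate(
    across(
      all_of(names(recode_list)),
      # Look up each old value by name in the scheme for the current column,
      # then drop the names so a plain numeric vector is returned.
      function(x) unname(recode_list[[cur_column()]][x])
    )
  )

print(recoded)
```

Selecting columns with all_of(names(recode_list)) keeps the data frame and the list in sync: only columns that have a scheme are touched.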


Source: https://stackoverflow.com/questions/61523202/access-the-column-names-in-the-mutate-at-to-use-it-for-subseting-a-list
