Exclude column in `dplyr` `mutate_at` while using data in this column

前端 未结 2 1146
攒了一身酷
攒了一身酷 2021-01-20 18:30

I want to rescale all variables (but year and gender) in a df by one specific year, grouped by gender:

相关标签:
2条回答
  • 2021-01-20 19:03

    Use position in mutate_at

    library(dplyr)
    
    df %>%
      group_by(gender) %>%
      mutate_at(-c(1, 2), ~ifelse(year == 3, 0, . - .[year == 3]))
    
    #  gender  year var_a var_b
    #   <fct>  <int> <dbl> <dbl>
    # 1 m          1    -2    -2
    # 2 m          2    -1    -1
    # 3 m          3     0     0
    # 4 m          4     1     1
    # 5 m          5     2     2
    # 6 f          1    -2    -2
    # 7 f          2    -1    -1
    # 8 f          3     0     0
    # 9 f          4     1     1
    #10 f          5     2     2
    

    In case, if you do not know the position of columns beforehand you can first find it

    cols <- which(names(df) %in% c("gender", "year"))
    
    df %>%
      group_by(gender) %>%
      mutate_at(-cols, ~ifelse(year == 3, 0, . - .[year == 3]))
    

    Or select columns which starts_with

    df %>%
      group_by(gender) %>%
      mutate_at(vars(starts_with("var")), ~ifelse(year == 3, 0, . - .[year == 3]))
    
    0 讨论(0)
  • 2021-01-20 19:06

    If you add a ~ before the function you should get the wanted output.

    library(dplyr)
    #> 
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #> 
    #>     intersect, setdiff, setequal, union
    set.seed(1)
    df <- data.frame(gender = c(rep("m", 5),
                                rep("f", 5)), 
                     year = rep(1:5, 2), var_a = 1:10, var_b = 0:9)
    df
    #>    gender year var_a var_b
    #> 1       m    1     1     0
    #> 2       m    2     2     1
    #> 3       m    3     3     2
    #> 4       m    4     4     3
    #> 5       m    5     5     4
    #> 6       f    1     6     5
    #> 7       f    2     7     6
    #> 8       f    3     8     7
    #> 9       f    4     9     8
    #> 10      f    5    10     9
    
    df %>%
      group_by(gender) %>% 
      mutate_at(vars(-gender, -year),
                ~ifelse(year == 3, 0, . - .[year == 3]))
    #> # A tibble: 10 x 4
    #> # Groups:   gender [2]
    #>    gender  year var_a var_b
    #>    <fct>  <int> <dbl> <dbl>
    #>  1 m          1    -2    -2
    #>  2 m          2    -1    -1
    #>  3 m          3     0     0
    #>  4 m          4     1     1
    #>  5 m          5     2     2
    #>  6 f          1    -2    -2
    #>  7 f          2    -1    -1
    #>  8 f          3     0     0
    #>  9 f          4     1     1
    #> 10 f          5     2     2
    

    Created on 2019-04-29 by the reprex package (v0.2.1)

    EDIT: In older versions of dplyr you would use funs(), but it is soft deprecated as of dplyr 0.8.0

    df %>%
      group_by(gender) %>% 
      mutate_at(vars(-gender, -year),
                funs(ifelse(year == 3, 0, . - .[year == 3])))
    
    0 讨论(0)
提交回复
热议问题