dplyr::group_by_ with character string input of several variable names

隐瞒了意图╮ 2020-11-28 06:36

I'm writing a function in which the user specifies one or more grouping variables in the function call. The data is then grouped using dplyr, and it works as expected.

2 answers
  • 2020-11-28 07:11

    slice_rows() from the purrrlyr package (https://github.com/hadley/purrrlyr) groups a data.frame by taking a vector of column names (strings) or positions (integers):

    y <- c("cyl", "gear")
    mtcars_grp <- mtcars %>% purrrlyr::slice_rows(y)
    
    class(mtcars_grp)
    #> [1] "grouped_df" "tbl_df"     "tbl"        "data.frame"
    
    group_vars(mtcars_grp)
    #> [1] "cyl"  "gear"
    

    Particularly useful now that group_by_() has been deprecated.
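
    For comparison, here is a minimal sketch of grouping by a character vector with dplyr alone. It is not part of the original answer and assumes dplyr 1.0.0 or later, where across() and all_of() are available:

    ```r
    library(dplyr)

    y <- c("cyl", "gear")

    # across() + all_of() pick the grouping columns named in the character vector;
    # all_of() raises an error if any name in y is missing from the data
    mtcars_grp <- mtcars %>% group_by(across(all_of(y)))

    group_vars(mtcars_grp)
    ```

    This should report the same two grouping variables as the slice_rows() call above.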

  • 2020-11-28 07:25

    No need for interp here; just use as.formula to convert the strings to formulas:

    dots = sapply(y, . %>% {as.formula(paste0('~', .))})
    mtcars %>% group_by_(.dots = dots)
    

    Your interp approach doesn't work because the expression gives you back the following:

    ~list(c("cyl", "gear"))
    

    That is not what you want. You could, of course, sapply interp over y, which would be similar to using as.formula above:

    dots1 = sapply(y, . %>% {interp(~var, var = .)})
    

    But, in fact, you can also directly pass y:

    mtcars %>% group_by_(.dots = y)
    

    The dplyr vignette on non-standard evaluation goes into more detail and explains the difference between these approaches.
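
    Since group_by_() is deprecated in current dplyr, the function described in the question could perhaps be sketched as below. This is an illustration rather than the original answer's method: count_by and group_cols are hypothetical names, and it assumes dplyr 1.0.0 or later plus the rlang package:

    ```r
    library(dplyr)
    library(rlang)

    # Hypothetical helper: count rows per group, with the grouping
    # columns supplied as a character vector of column names
    count_by <- function(data, group_cols) {
      data %>%
        # syms() turns the strings into symbols; !!! splices them into group_by()
        group_by(!!!syms(group_cols)) %>%
        summarise(n = n(), .groups = "drop")
    }

    res <- count_by(mtcars, c("cyl", "gear"))
    res
    ```

    The same helper accepts a single name as well, e.g. count_by(mtcars, "cyl"), because syms() works on a character vector of any length.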
