dplyr::group_by_ with character string input of several variable names


I'm writing a function where the user is asked to define one or more grouping variables in the function call. The data is then grouped using dplyr, and it works as expected.

2 Answers

离开以前

2020-11-28 07:11
slice_rows() from the purrrlyr package (https://github.com/hadley/purrrlyr) groups a data.frame by taking a vector of column names (strings) or positions (integers):
```
library(dplyr)  # provides %>% and group_vars()

y <- c("cyl", "gear")
mtcars_grp <- mtcars %>% purrrlyr::slice_rows(y)

class(mtcars_grp)
#> [1] "grouped_df" "tbl_df"     "tbl"        "data.frame"

group_vars(mtcars_grp)
#> [1] "cyl"  "gear"
```
This is particularly useful now that group_by_() has been deprecated.
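If you would rather stay within dplyr itself, one possible alternative (not part of the answer above, and assuming dplyr 1.0.0 or later) is to select the grouping columns with across() and all_of(). A minimal sketch, where count_by() and group_cols are hypothetical names made up for illustration:

```r
library(dplyr)

# Hypothetical wrapper: `group_cols` is a character vector of column names.
# Assumes dplyr >= 1.0.0, where across() and all_of() are available.
count_by <- function(data, group_cols) {
  data %>%
    group_by(across(all_of(group_cols))) %>%  # group by the named columns
    summarise(n = n(), .groups = "drop")      # count rows per group
}

y <- c("cyl", "gear")
res <- count_by(mtcars, y)
res
```

all_of() throws an error if any name in group_cols is missing from the data, which is usually what you want when the names come from a user.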
旧巷少年郎

2020-11-28 07:25
No need for interp here, just use as.formula to convert the strings to formulas:
```
library(dplyr)
y <- c("cyl", "gear")  # grouping variables supplied as strings
dots = sapply(y, . %>% {as.formula(paste0('~', .))})
mtcars %>% group_by_(.dots = dots)
```
The reason why your interp approach doesn’t work is that the expression gives you back the following:
```
~list(c("cyl", "gear"))
```
That is not what you want. You could, of course, sapply interp over y, which would be similar to using as.formula above:
```
dots1 = sapply(y, . %>% {interp(~var, var = .)})
```
But, in fact, you can also directly pass y:
```
mtcars %>% group_by_(.dots = y)
```
The dplyr vignette on non-standard evaluation goes into more detail and explains the difference between these approaches.