问题
I am trying to make multiple lags using the least amount of code possible in dplyr, while sticking to tidy eval. The following Standard Evaluation (SE) code works:
#if(!require(dplyr)) install.packages("dplyr");
library(dplyr)
a=as_tibble(c(1:100))
lags=3
lag_prefix=paste0("L", 1:lags, ".y")
multi_lag=setNames(paste("lag(.,", 1:lags, ")"), lag_prefix)
a %>% mutate_at(vars(value), funs_(multi_lag)) #final line
# A tibble: 100 x 4
value L1.y L2.y L3.y
<int> <int> <int> <int>
1 1 NA NA NA
2 2 1 NA NA
3 3 2 1 NA
4 4 3 2 1
5 5 4 3 2
6 6 5 4 3
7 7 6 5 4
8 8 7 6 5
9 9 8 7 6
10 10 9 8 7
# ... with 90 more rows
However, you'll notice that the final line does not use tidy eval, but resorts to SE. The package information regarding the funs_ command says it is superfluous due to tidy eval. Thus, I am wondering if it is possible to do this with tidy eval? Any help appreciated, I am a novice to evaluation types.
回答1:
From this blog post: multiple lags with tidy evaluation by Romain François
library(rlang)
library(tidyverse)
a <- as_tibble(c(1:100))
n_lags <- 3
lags <- function(var, n = 3) {
var <- enquo(var)
indices <- seq_len(n)
# create a list of quosures by looping over `indices`
# then give them names for `mutate` to use later
map(indices, ~ quo(lag(!!var, !!.x))) %>%
set_names(sprintf("L_%02d.%s", indices, "y"))
}
# unquote the list of quosures so that they are evaluated by `mutate`
a %>%
mutate_at(vars(value), funs(!!!lags(value, n_lags)))
#> # A tibble: 100 x 4
#> value L_01.y L_02.y L_03.y
#> <int> <int> <int> <int>
#> 1 1 NA NA NA
#> 2 2 1 NA NA
#> 3 3 2 1 NA
#> 4 4 3 2 1
#> 5 5 4 3 2
#> 6 6 5 4 3
#> 7 7 6 5 4
#> 8 8 7 6 5
#> 9 9 8 7 6
#> 10 10 9 8 7
#> # ... with 90 more rows
Created on 2019-02-15 by the reprex package (v0.2.1.9000)
回答2:
Inpired by the answer by @Tung I tried to make more generic function that looks more like the tidyr functions rather than dplyr functions, i.e outside mutate.
# lags function
lags <- function(data, var, nlags) {
var <- enquos(var)
data %>%
bind_cols(
map_dfc(seq_len(n),
function(x) {
new_var <- sprintf("L_%02d.%s", x, "y")
data %>% transmute(new_var := lag(!!!var, x))
}
))
}
# Apply function to data frame
a <- as_tibble(c(1:100))
a %>%
lags(value, 3)
来源:https://stackoverflow.com/questions/54720090/dplyr-multiple-lags-tidy-eval