Question
Given the dplyr workflow:
library(dplyr)
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(grepl(x = model, pattern = "Merc")) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
I'm interested in conditionally applying filter depending on the value of applyFilter.
Solution
For applyFilter <- 1 the rows are filtered using the "Merc" string; without the filter, all rows are returned.
applyFilter <- 1
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(model %in%
if (applyFilter) {
rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
} else
{
rownames(mtcars)
}) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Problem
The suggested solution is inefficient because the if/else expression is always evaluated; a more desirable approach would only evaluate the filter step for applyFilter <- 1.
Attempt
The ideal workflow would look something like this:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
# Only apply filter step if condition is met
if (applyFilter) {
filter(grepl(x = model, pattern = "Merc"))
}
%>%
# Continue
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Naturally, the syntax above is incorrect. It is only an illustration of how the ideal workflow should look.
Desired answer
I'm not interested in creating an interim object; the workflow should resemble:
startingObject %>% ... conditional filter ... final object
Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated or not.
Answer 1:
How about this approach:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(if (applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
This means grepl is only evaluated if applyFilter is 1; otherwise filter simply recycles a TRUE, which keeps every row.
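As a rough check of that behaviour, the sketch below wraps the same pipeline in a function so both flag values can be run side by side. The wrapper name summarise_mpg is hypothetical and not part of the original answer; it only assumes dplyr and tibble are installed.

```r
library(dplyr)

# Hypothetical wrapper (not from the original post): runs the pipeline
# with the filter switched on (1) or off (any other value).
summarise_mpg <- function(applyFilter) {
  mtcars %>%
    tibble::rownames_to_column(var = "model") %>%
    # grepl() is only reached when applyFilter == 1; otherwise the
    # scalar TRUE is recycled and every row is kept.
    filter(if (applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

filtered   <- summarise_mpg(1)  # only models matching "Merc" contribute
unfiltered <- summarise_mpg(0)  # all rows of mtcars contribute
```

With the flag off, the result should match a plain group_by(am) summary of mtcars.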
Another option is to use {}:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
{if (applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
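If the same pattern is needed in several places, the {} trick could be pulled into a small helper. The following is only a sketch: apply_if is a hypothetical name, not a dplyr or magrittr function, and the step is passed as an ordinary function so it is never called when the condition is false.

```r
library(dplyr)

# Hypothetical helper: run `step` on `data` only when `condition` is TRUE,
# otherwise pass `data` through unchanged.
apply_if <- function(data, condition, step) {
  if (isTRUE(condition)) step(data) else data
}

applyFilter <- 1

result <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  apply_if(applyFilter == 1,
           function(d) filter(d, grepl(x = model, pattern = "Merc"))) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```

Because the filter call lives inside the function passed as step, it is skipped entirely when applyFilter is not 1, which is the behaviour the question asks for.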
Another possible approach is simply to break the pipe, apply the filter conditionally, and then continue the pipe (I know the OP didn't ask for this; I just want to give another example for other readers). Note that %<>% comes from magrittr and overwrites mtcars in the current environment:
library(magrittr)  # provides the %<>% assignment pipe
mtcars %<>%
tibble::rownames_to_column(var = "model")
if (applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))
mtcars %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Source: https://stackoverflow.com/questions/44001722/conditionally-apply-pipeline-step-depending-on-external-value