Conditionally apply pipeline step depending on external value

Submitted by 前提是你 on 2020-01-10 14:45:31

Question


Given the dplyr workflow:

library(dplyr)
mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(grepl(x = model, pattern = "Merc")) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

I'm interested in conditionally applying the filter step depending on the value of an external variable, applyFilter.

Solution

With applyFilter <- 1, the rows are filtered using the "Merc" string; with the filter switched off, all rows are returned.

applyFilter <- 1


mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter(model %in%
           if (applyFilter) {
             rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
           } else
           {
             rownames(mtcars)
           }) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))

Problem

The solution above is inefficient because the filter step, including the %in% comparison against every row name, is always evaluated; a more desirable approach would evaluate the filter step only when applyFilter is 1.

Attempt

Ideally, the workflow would look like this:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    # Only apply filter step if condition is met
    if (applyFilter) { 
        filter(grepl(x = model, pattern = "Merc"))
        }
    %>% 
    # Continue 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

Naturally, the syntax above is incorrect. It is only an illustration of how the ideal workflow should look.


Desired answer

  • I'm not interested in creating an interim object; the workflow should resemble:

    startingObject
        %>%
        ...
        conditional filter
        ...
        final object
    
  • Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated at all


Answer 1:


How about this approach:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(if (applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

This way grepl is evaluated only if applyFilter is 1; otherwise filter simply recycles a single TRUE, so every row is kept.
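The lazy-evaluation claim can be checked with a small sketch. It assumes dplyr and tibble are installed; the wrapper noisy_grepl and the objects cars, all_rows and merc_rows are hypothetical names introduced only for this illustration.

```r
library(dplyr)

cars <- tibble::rownames_to_column(mtcars, var = "model")

# Hypothetical wrapper around grepl that records whether it was ever called
called <- FALSE
noisy_grepl <- function(...) {
  called <<- TRUE
  grepl(...)
}

# Flag off: the else branch returns TRUE and noisy_grepl is never reached
applyFilter <- 0
all_rows <- cars %>%
  filter(if (applyFilter == 1) noisy_grepl(x = model, pattern = "Merc") else TRUE)
called_when_off <- called

# Flag on: the pattern match runs and only the "Merc" models remain
applyFilter <- 1
merc_rows <- cars %>%
  filter(if (applyFilter == 1) noisy_grepl(x = model, pattern = "Merc") else TRUE)
called_when_on <- called
```

With the flag off, all_rows should have as many rows as mtcars and called_when_off should still be FALSE; with the flag on, merc_rows should contain only the models whose names match "Merc".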


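Another option is to wrap the step in braces, {}, inside which . refers to the data piped in, so the whole filter call is skipped when the condition is not met: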

mtcars %>% 
  tibble::rownames_to_column(var = "model") %>% 
  {if (applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))
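If the braces pattern is needed in several places, it could be wrapped in a small function. The sketch below is one possible way to do that, not part of dplyr or magrittr: apply_if is a hypothetical helper, and its argument names are assumptions made for this example.

```r
library(dplyr)

# Hypothetical helper: run `step` on `data` only when `condition` is TRUE,
# otherwise pass `data` through unchanged
apply_if <- function(data, condition, step) {
  if (isTRUE(condition)) step(data) else data
}

applyFilter <- 1

result <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  apply_if(applyFilter == 1,
           function(d) filter(d, grepl(x = model, pattern = "Merc"))) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))

# Same pipeline with the flag off: the filter function is never called
unfiltered <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  apply_if(FALSE, function(d) filter(d, grepl(x = model, pattern = "Merc"))) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```

Because the step is passed as a function, it is only called when the condition holds, which keeps the pipe unbroken and avoids an interim object.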

A third approach is simply to break the pipe, apply the filter conditionally, and then continue the pipe. (I know the OP didn't ask for this; it is included as another example for other readers.) Note that %<>% is magrittr's assignment pipe, so magrittr must be attached:

library(magrittr)  # provides the %<>% assignment pipe

mtcars %<>% 
  tibble::rownames_to_column(var = "model")

if (applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))

mtcars %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))


Source: https://stackoverflow.com/questions/44001722/conditionally-apply-pipeline-step-depending-on-external-value
