How to store filter expressions as strings?

天涯浪人 2020-12-21 06:51

For the analysis of a species database, I often need to change lots of criteria, depending on the project's scope etc.

As it is very inconvenient to always change the

2 answers
  • 2020-12-21 07:19

    In addition to @Konrad's methods, if the expression is a string, we can use parse_expr from rlang:

    library(rlang)
    library(dplyr)
    df1 %>% 
        filter(!! parse_expr(expr1))
    #   col_A col_B
    #1     A     1
    

    data

     df1 <- data.frame(col_A = LETTERS[1:10],
               col_B = 1:10,
               stringsAsFactors = FALSE)
    
    expr1 <-  "col_A == 'A' & col_B == 1"
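
    Since the question is about keeping many criteria around, a minimal sketch of storing several filter strings together and applying them on demand might look like the following. The criteria vector and its entry names are hypothetical; only parse_expr and filter come from the answer above. The unname() call is there because filter() rejects named arguments.

    ```r
    library(dplyr)
    library(rlang)

    df1 <- data.frame(col_A = LETTERS[1:10],
                      col_B = 1:10,
                      stringsAsFactors = FALSE)

    # Hypothetical named set of stored criteria; each entry is one filter string
    criteria <- c(
      vowels     = "col_A %in% c('A', 'E', 'I')",
      upper_half = "col_B > 4"
    )

    # Apply a single stored expression, picked by name
    res_one <- df1 %>%
      filter(!!parse_expr(criteria[["vowels"]]))

    # Parse every stored string; drop the names so filter() sees unnamed arguments
    parsed <- lapply(unname(criteria), parse_expr)

    # Splice all expressions into filter(), which combines them with AND
    res_all <- df1 %>%
      filter(!!!parsed)
    ```

    The strings could just as well be read from a text or YAML file, which keeps project-specific criteria out of the analysis script.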
    
  • 2020-12-21 07:37

    filter_

    You can pass your filter expression using filter_ in dplyr:

    mtcars %>%
        filter_("cyl == 4")
    

    Handling strings

    If you want to take this further and compare against string values, you can use single quotes for the string inside the filter:

    data.frame(col_A = LETTERS[1:10],
               col_B = 1:10,
               stringsAsFactors = FALSE) %>%
        filter_("col_A == 'A'")
    

    Handling double quotes

    If you really want to use double quotes inside the filter string, you have to escape them:

    data.frame(col_A = LETTERS[1:10],
               col_B = 1:10,
               stringsAsFactors = FALSE) %>%
        filter_("col_A == \"A\"")
    

    Better approach

    I would suggest that you avoid the approach above: the underscore verbs such as filter_ have been deprecated since dplyr 0.7. The suggestion below lets you pass your column name using the sym function. In a dplyr pipeline, rlang gives you more flexibility in building your filter expressions:

    require(dplyr)
    require(rlang)
    col_nme <- sym("cyl")
    flt_val <- 4
    mtcars %>%
        filter(UQ(col_nme) == UQ(flt_val))
    

    This is equivalent to:

    mtcars %>%
        filter(UQ(col_nme) == flt_val)
    

    You don't have to unquote the second argument, because flt_val is not a column of mtcars and is therefore looked up in the calling environment.
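
    In recent rlang and dplyr releases, !! is the documented spelling of the unquoting operator, and the .data pronoun accepts a column name held as a plain string. A minimal sketch of the same cyl filter written both ways, reusing the variable names from above with col_nme kept as a string:

    ```r
    library(dplyr)
    library(rlang)

    col_nme <- "cyl"   # column name stored as a string
    flt_val <- 4       # value to compare against

    # Turn the string into a symbol and unquote it with !!
    res_bang <- mtcars %>%
      filter(!!sym(col_nme) == flt_val)

    # Index the .data pronoun by the string; no symbol conversion needed
    res_data <- mtcars %>%
      filter(.data[[col_nme]] == flt_val)
    ```

    Both calls select the four-cylinder cars; the .data form avoids unquoting altogether when only the column name varies.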

    Side points

    The syntax of your filter is:

    rlb == "1" | rlb == "2" | rlb== "3" | rlb == "G" | rlb == "R" |
    

    This would be equivalent to:

    rlb %in% c("1", "2", "3" , "G" , "R")
    

    The vector c("1", "2", "3", "G", "R") can easily be passed as a variable, without any additional effort involving quosures or non-standard evaluation. I would start by simplifying the filters and then apply the simplified expressions via rlang features.
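
    A small sketch of passing the allowed values as a variable. The species data frame and the name rlb_keep are made up for illustration; only the rlb column name and the value set come from the filter above.

    ```r
    library(dplyr)

    # Hypothetical stand-in for the species table
    species <- data.frame(name = c("sp1", "sp2", "sp3", "sp4"),
                          rlb  = c("1", "G", "X", "R"),
                          stringsAsFactors = FALSE)

    # Allowed values stored as an ordinary character vector
    rlb_keep <- c("1", "2", "3", "G", "R")

    # rlb is found in the data; rlb_keep is found in the calling environment
    res <- species %>%
      filter(rlb %in% rlb_keep)
    ```

    Changing the criteria for another project then only means assigning a different vector to rlb_keep.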


    Code sharing

    Following the comment on code sharing, it may be good to look at the sqldf package:

    require(sqldf)
    sqldf(x = "SELECT * FROM mtcars WHERE CYL = 4")
    

    This would let you share your filters in SQL, which is usually more familiar than dplyr syntax.
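
    To tie this back to storing filters as strings, one possible sketch keeps only the WHERE clause as a string and pastes it into the query. The variable names where_clause and query are illustrative, and the example assumes the sqldf package is installed.

    ```r
    library(sqldf)

    # Stored filter, written as an SQL condition
    where_clause <- "cyl = 4"

    # Build the full statement from the stored condition
    query <- paste("SELECT * FROM mtcars WHERE", where_clause)

    res_sql <- sqldf(query)
    ```

    Pasting strings into SQL is only reasonable here because the filters are written by the analyst; it should not be used with untrusted input.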
