Dplyr difference between select and group_by with respect to quoted variables?

心在旅途 2021-02-06 09:28

In the current version of dplyr, select arguments can be passed by value:

variable <- "Species"
iris %>% 
    select(variable)

#       Spec         

1 answer
  • 2021-02-06 10:01

    To pass a string as a symbol or as unevaluated code, you first have to convert it to a symbol or an expression. You can use sym or parse_expr from rlang for the conversion and then use !! to unquote the result:

    library(dplyr)
    
    variable <- rlang::sym("Species")
    # variable <- rlang::parse_expr("Species")
    
    iris %>% 
      group_by(!! variable) %>% 
      summarise(Petal.Length = mean(Petal.Length))
    

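    !! is the unquote operator; it was originally a shortcut for UQ(), which is deprecated in current rlang. It substitutes the symbol or expression stored in variable into the group_by call, so group_by sees Species as a column of the data rather than the name variable.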
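
To see what the unquoting does, it can help to build the call without running it. The following is a minimal sketch, assuming rlang is installed; `call_built` is a name introduced here for illustration only. `rlang::expr()` captures the code and performs the `!!` substitution, but does not evaluate `group_by`:

```r
library(rlang)

# A symbol built from a string, as in the answer above.
variable <- sym("Species")

# expr() captures the call and replaces !!variable with the symbol it holds.
# Nothing is evaluated here, so dplyr does not even need to be loaded.
call_built <- expr(group_by(!!variable))

print(call_built)
# group_by(Species)
```

The printed call contains Species rather than variable, which is the form group_by needs in order to find the column.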

    Difference between sym and parse_expr and which one to use when?

    The short answer: it doesn't matter in this case.

    The long answer:

    A symbol is a way to refer to an R object, basically the "name" of an object. So sym is similar to as.name in base R. parse_expr on the other hand transforms some text into R expressions. This is similar to parse in base R.

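    Expressions can be any R code, not just code that references R objects. So parse_expr can handle a string that is simply the name of an object, but sym cannot handle arbitrary code: it always produces a single name, so a string such as "Petal.Length * 2" becomes a symbol with that literal name, and evaluating it fails because no object of that name exists.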

    In general, you will use sym when your string refers to an object (although parse_expr would also work), and use parse_expr when you are trying to parse any other R code for further evaluation.

    For this particular use case, variable is supposed to be referencing an object, so turning it into a sym would work. On the other hand, parsing it as an expression would also work because that is the code that is going to be evaluated inside group_by when being unquoted by !!.
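
The distinction can be checked directly. This is a rough sketch, assuming rlang is installed and using the built-in iris data; the variable names (`by_sym`, `code_expr`, and so on) are made up for illustration:

```r
library(rlang)

# For a string that is just a column name, both functions give the same symbol.
by_sym  <- sym("Species")
by_expr <- parse_expr("Species")
same_result <- identical(by_sym, by_expr)

# For a string containing other code, the results differ.
code_expr <- parse_expr("Petal.Length * 2")  # a call to `*`
code_sym  <- sym("Petal.Length * 2")         # one symbol named "Petal.Length * 2"

# The parsed expression evaluates against the data frame.
doubled <- eval_tidy(code_expr, data = iris)

# The symbol does not: there is no object with that literal name,
# so the lookup errors and the handler returns NULL.
lookup <- tryCatch(
  eval_tidy(code_sym, data = iris),
  error = function(e) NULL
)
```

Here `same_result` is TRUE, `doubled` holds the doubled petal lengths, and `lookup` is NULL because the symbol lookup fails.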
