Moving from deprecated summarize_ to new summarize in dplyr

挽巷 2021-01-23 23:29

I have a function that calculates the mean of one column of a grouped data frame; the column is chosen by the contents of a variable VarName. The current function uses

2 Answers
  • 2021-01-24 00:31

    Turning this into a function and generalizing for arbitrary data, grouping variable, and value variable:

    library(tidyverse)
    
    means <- function(data, group, value) {
    
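      # Capture the unquoted column names as quosures
      group = enquo(group)
      value = enquo(value)
      # Build the output column name, e.g. "mean_Things"; as_name() replaces
      # the old paste0(...)[2] trick, which relied on quosures being formulas
      value_name = paste0("mean_", rlang::as_name(value))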
    
      data %>% group_by(!!group) %>% 
        summarise(!!value_name := mean(!!value, na.rm=TRUE))
    }
    
    means(dat, Grade, Things)
    
      Grade mean_Things
      <dbl>       <dbl>
    1  2.00        90.0
    2  3.00        77.5
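
    With a recent dplyr the same function can probably be written without enquo() and !!. The sketch below assumes dplyr 1.0 or later (and rlang 0.4.3 or later for the "{{ }}" name syntax); means_embrace is a hypothetical name, and mtcars stands in for dat, which is not shown here:

    ```r
    library(dplyr)

    # Hypothetical helper; `group` and `value` are unquoted column names.
    means_embrace <- function(data, group, value) {
      data %>%
        # {{ group }} is shorthand for enquo(group) followed by !!group
        group_by({{ group }}) %>%
        # "mean_{{ value }}" builds the output name from the column passed in
        summarise("mean_{{ value }}" := mean({{ value }}, na.rm = TRUE))
    }

    res <- means_embrace(mtcars, cyl, mpg)
    ```

    Here res should have one row per value of cyl and the columns cyl and mean_mpg.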
    

    If I understand your comment, how about the function below, which takes a string for the value argument:

    means <- function(data, group, value) {
    
      group = enquo(group)
      value_name = paste0("mean_", value)
      value = sym(value)
    
      data %>% group_by(!!group) %>% 
        summarise(!!value_name := mean(!!value, na.rm=TRUE))
    }
    
    VarName = "Things"
    
    means(dat, Grade, VarName)
    
      Grade mean_Things
      <dbl>       <dbl>
    1  2.00        90.0
    2  3.00        77.5
    

    Since the function is generalized, you can do this with any data frame. For example:

    means(mtcars, cyl, "mpg")
    
        cyl mean_mpg
      <dbl>    <dbl>
    1  4.00     26.7
    2  6.00     19.7
    3  8.00     15.1
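
    An alternative for the string case is the .data pronoun, which avoids converting the string to a symbol. This is only a sketch, assuming dplyr 1.0 or later; means_string is a hypothetical name:

    ```r
    library(dplyr)

    # Hypothetical helper; `group` is an unquoted column name,
    # `value` is a column name given as a string.
    means_string <- function(data, group, value) {
      data %>%
        # {{ group }} is shorthand for enquo(group) followed by !!group
        group_by({{ group }}) %>%
        # .data[[value]] looks the column up by name;
        # "mean_{value}" interpolates the string into the output name
        summarise("mean_{value}" := mean(.data[[value]], na.rm = TRUE))
    }

    VarName <- "mpg"
    res <- means_string(mtcars, cyl, VarName)
    ```

    The result should match means(mtcars, cyl, "mpg") above: columns cyl and mean_mpg, one row per cylinder count.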
    

    You can generalize the function still further. For example, this version takes an arbitrary number of grouping columns:

    means <- function(data, value, ...) {
    
      group = quos(...)
      value_name = paste0("mean_", value)
      value = sym(value)
    
      data %>% group_by(!!!group) %>% 
        summarise(!!value_name := mean(!!value, na.rm=TRUE))
    }
    
    VarName = "Things"
    
    means(dat, VarName, students, Grade)
    
      students Grade mean_Things
      <fct>    <dbl>       <dbl>
    1 a         2.00        90.0
    2 b         2.00       100  
    3 c         2.00        80.0
    4 d         3.00        75.0
    5 e         3.00        80.0
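
    The grouping columns can also be forwarded as ... without quos() and !!!. With more than one grouping column, summarise() in dplyr 1.0 or later returns a grouped result and prints a message about it unless .groups is set. A minimal sketch under those assumptions (means_dots is a hypothetical name, and mtcars stands in for dat):

    ```r
    library(dplyr)

    # Hypothetical helper; `value` is a string, `...` are unquoted grouping columns.
    means_dots <- function(data, value, ...) {
      data %>%
        # group_by() accepts the dots directly
        group_by(...) %>%
        summarise("mean_{value}" := mean(.data[[value]], na.rm = TRUE),
                  .groups = "drop")  # return an ungrouped tibble
    }

    res <- means_dots(mtcars, "mpg", cyl, am)
    ```

    Dropping the groups is a design choice; use .groups = "keep" or "drop_last" if later steps rely on the grouping.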
    
  • 2021-01-24 00:33

    Convert the string in VarName to a symbol with as.name or as.symbol, then unquote it with !!:

    dat %>% 
        group_by(Grade) %>% 
        summarize(means = mean(!!as.name(VarName), na.rm=TRUE))
        # or summarize(means = mean(!!as.symbol(VarName), na.rm=TRUE))
    
    # A tibble: 2 x 2
    #  Grade means
    #  <dbl> <dbl>
    #1  2.00  90.0
    #2  3.00  77.5
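
    If a recent dplyr (1.0 or later) is available, across() with all_of() is another way to select the column from a string. This is a sketch, not the approach above; mtcars stands in for dat, and the "mean_" naming is an assumption borrowed from the other answer:

    ```r
    library(dplyr)

    VarName <- "mpg"

    res <- mtcars %>%
      group_by(cyl) %>%
      # all_of() selects the column named by the string;
      # .names controls the name of the output column
      summarise(across(all_of(VarName), ~ mean(.x, na.rm = TRUE),
                       .names = "mean_{.col}"))
    ```

    Because all_of() takes a character vector, the same call should work unchanged when VarName holds several column names.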
    