How to feed a list of unquoted column names into `lapply` (so that I can use it with a `dplyr` function)

陌清茗 2021-02-08 14:33

I am trying to write a function in tidyverse/dplyr that I want to eventually use with lapply (or map). (I had been working on it to answer

3 Answers
  •  别跟我提以往
    2021-02-08 14:57

    as.name converts a string to a name (a symbol), which can then be passed to report:

    lapply(cat.list, function(x) do.call("report", list(as.name(x))))
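
    To try the call above in isolation, here is a minimal sketch. The sample_data, cat.list and report defined below are hypothetical stand-ins (the question's real objects are not shown here), and it assumes report captures its unquoted argument with the `{{ }}` operator from a reasonably recent dplyr/rlang:

    ```r
    library(dplyr)

    # Hypothetical data; the question's actual sample_data is not shown.
    sample_data <- data.frame(
      REGION  = c("East", "East", "West", "West"),
      PRODUCT = c("A", "B", "A", "A"),
      YEAR    = c(2019, 2019, 2019, 2020),
      AMOUNT  = c(10, 20, 30, 40)
    )
    cat.list <- c("REGION", "PRODUCT")

    # Hypothetical report(): expects an unquoted column name.
    report <- function(report_cat) {
      sample_data %>%
        group_by({{ report_cat }}, YEAR) %>%
        summarize(num = n(), total = sum(AMOUNT), .groups = "drop")
    }

    # as.name() turns each string into a symbol; do.call() then builds
    # the call report(REGION), report(PRODUCT), ...
    results <- lapply(cat.list, function(x) do.call("report", list(as.name(x))))
    ```

    Each element of results is then the same summary that report(REGION) or report(PRODUCT) would give when typed by hand.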
    

    character argument

    An alternative is to rewrite report so that it accepts a character string argument:

    report_ch <- function(colname) {
      report_cat <- rlang::sym(colname)  # as.name(colname) would also work here
      sample_data %>%
        group_by(!!report_cat, YEAR) %>%
        summarize(num = n(), total = sum(AMOUNT)) %>%
        rename(REPORT_VALUE = !!report_cat) %>%
        mutate(REPORT_CATEGORY = colname)
    }
    
    lapply(cat.list, report_ch)
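
    Because every result has the same columns (REPORT_VALUE, YEAR, num, total, REPORT_CATEGORY), the list can be stacked into one data frame. The sketch below assumes hypothetical sample_data and cat.list, and it assumes the category columns share a type; bind_rows will complain if, say, one is character and another numeric:

    ```r
    library(dplyr)

    # Hypothetical data; the question's actual sample_data is not shown.
    sample_data <- data.frame(
      REGION  = c("East", "East", "West", "West"),
      PRODUCT = c("A", "B", "A", "A"),
      YEAR    = c(2019, 2019, 2019, 2020),
      AMOUNT  = c(10, 20, 30, 40)
    )
    cat.list <- c("REGION", "PRODUCT")

    # Same function as above, with .groups = "drop" so the result is ungrouped.
    report_ch <- function(colname) {
      report_cat <- rlang::sym(colname)
      sample_data %>%
        group_by(!!report_cat, YEAR) %>%
        summarize(num = n(), total = sum(AMOUNT), .groups = "drop") %>%
        rename(REPORT_VALUE = !!report_cat) %>%
        mutate(REPORT_CATEGORY = colname)
    }

    # One long table; REPORT_CATEGORY records which column each row came from.
    combined <- bind_rows(lapply(cat.list, report_ch))
    ```

    purrr::map_dfr(cat.list, report_ch) should give the same result in one step, if purrr is preferred over lapply.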
    

    wrapr

    An alternate approach is to rewrite report using the wrapr package, which is an alternative to rlang/tidyeval:

    library(dplyr)
    library(wrapr)
    
    report_wrapr <- function(colname) 
      let(c(COLNAME = colname),
          sample_data %>%
                      group_by(COLNAME, YEAR) %>%
                      summarize(num = n(), total = sum(AMOUNT)) %>%
                      rename(REPORT_VALUE = COLNAME) %>%
                      mutate(REPORT_CATEGORY = colname)
       )
    
    lapply(cat.list, report_wrapr)
    

    Of course, this whole problem would go away if you used a different framework, e.g.

    plyr

    library(plyr)
    
    report_plyr <- function(colname)
      ddply(sample_data, c(REPORT_VALUE = colname, "YEAR"), function(x)
         data.frame(num = nrow(x), total = sum(x$AMOUNT), REPORT_CATEGORY = colname))
    
    lapply(cat.list, report_plyr)
    

    sqldf

    library(sqldf)
    
    report_sql <- function(colname, envir = parent.frame(), ...)
      fn$sqldf("select [$colname] REPORT_VALUE,
                       YEAR,
                       count(*) num,
                       sum(AMOUNT) total,
                       '$colname' REPORT_CATEGORY
                from sample_data
                group by [$colname], YEAR", envir = envir, ...)
    
    lapply(cat.list, report_sql)              
    

    base - by

    report_base_by <- function(colname)
          do.call("rbind", 
            by(sample_data, sample_data[c(colname, "YEAR")], function(x)
                data.frame(REPORT_VALUE = x[1, colname], 
                           YEAR = x$YEAR[1], 
                           num = nrow(x), 
                           total = sum(x$AMOUNT), 
                           REPORT_CATEGORY = colname)
             )
          )
    
    lapply(cat.list, report_base_by)
    

    data.table

    The data.table package provides another alternative, but that has already been covered by another answer.

    Update: Added additional alternatives.
