Different functions over a list of columns and generate new column names automatically with data.table

遥遥无期 2021-01-22 00:45

I have a section in my Shiny app that generates a list.

The names of the list are the column names of the data frame we will calculate on; the list items contain the ca

2 answers
  •  清酒与你
    2021-01-22 01:16

    If I understand correctly, the question is not primarily about Shiny but about how to apply different aggregation functions to specific columns of a data.table.

    The column names and the functions to be applied to them are given as a list, mylist, which is created by the Shiny app.

    Among the various approaches, my preferred option is to compute on the language, i.e., to build a complete expression from the contents of mylist and then evaluate it:

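    library(magrittr)
    library(data.table)
    mylist %>%
      names() %>%
      # for each column, build one "col.fct = fct(col)" string per function
      lapply(
        function(.col) lapply(
          mylist[[.col]],
          function(.fct) sprintf("%s.%s = %s(%s)", .col, .fct, .fct, .col))) %>%
      unlist() %>%
      # join the pieces into the argument list of .()
      paste(collapse = ", ") %>%
      # embed them in a complete data.table call, grouped by cyl
      sprintf("as.data.table(mtcars)[, .(%s), by = cyl]", .) %>%
      # turn the string into an expression and evaluate it
      parse(text = .) %>%
      eval()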
    

    which yields the expected result

       cyl disp.sum disp.mean    hp.sd drat.sum drat.mean wt.max
    1:   6   1283.2  183.3143 24.26049    25.10  3.585714  3.460
    2:   4   1156.5  105.1364 20.93453    44.78  4.070909  3.190
    3:   8   4943.4  353.1000 50.97689    45.21  3.229286  5.424
    

    The character string which is parsed is created by

    mylist %>%
      names() %>% 
      lapply(
        function(.col) lapply(
          mylist[[.col]], 
          function(.fct) sprintf("%s.%s = %s(%s)", .col, .fct, .fct, .col))) %>% 
      unlist() %>% 
      paste(collapse = ", ") %>% 
      sprintf("as.data.table(mtcars)[, .(%s), by = cyl]", .)
    

    and looks as if coded manually:

    [1] "as.data.table(mtcars)[, .(disp.sum = sum(disp), disp.mean = mean(disp), hp.sd = sd(hp), drat.sum = sum(drat), drat.mean = mean(drat), wt.max = max(wt)), by = cyl]"
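
    As a variation on the same idea, the expression can also be assembled from call objects instead of a character string, which avoids parse(). This is only a sketch, not part of the approach above: the names cols, fcts, calls, j_expr, dt and result are made up for illustration, and mylist is repeated so that the snippet runs on its own.

```r
library(data.table)

# Same specification as in the answer, repeated so the snippet is self-contained
mylist <- list(
  disp = c("sum", "mean"),
  hp = "sd",
  drat = c("sum", "mean"),
  wt = "max")

# Expand the list into parallel vectors of column names and function names
cols <- rep(names(mylist), lengths(mylist))
fcts <- unlist(mylist, use.names = FALSE)

# One unevaluated call per pair, e.g. sum(disp)
calls <- Map(function(.fct, .col) call(.fct, as.name(.col)), fcts, cols)
# Name each call like the generated columns, e.g. disp.sum
names(calls) <- paste(cols, fcts, sep = ".")

# Wrap all calls in list(...); list() is used here instead of the .() alias
j_expr <- as.call(c(list(as.name("list")), calls))

dt <- as.data.table(mtcars)
# Evaluate the assembled expression in j, grouped by cyl
result <- dt[, eval(j_expr), by = cyl]
print(result)
```

    Printing j_expr shows the assembled list(...) call, which should correspond to the .(...) part of the string above.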
    

    Data

    For demonstration, mylist is provided "hard-coded":

    mylist <- list(
      disp = c("sum", "mean"),
      hp = "sd",
      drat = c("sum", "mean"),
      wt = "max")
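
    Because the contents of mylist end up in code that is evaluated, it may be worth checking them first when they come from user input. The helper below is hypothetical and not part of the original answer: check_spec and its allowed argument are assumed names, and the set of permitted functions is only an example.

```r
# Hypothetical helper: verify a specification like mylist before building code from it
check_spec <- function(spec, data,
                       allowed = c("sum", "mean", "sd", "max", "min", "median")) {
  # Columns named in the specification that do not exist in the data
  bad_cols <- setdiff(names(spec), names(data))
  # Requested functions that are not on the allow list
  bad_fcts <- setdiff(unlist(spec, use.names = FALSE), allowed)
  if (length(bad_cols) > 0L)
    stop("unknown columns: ", paste(bad_cols, collapse = ", "))
  if (length(bad_fcts) > 0L)
    stop("functions not allowed: ", paste(bad_fcts, collapse = ", "))
  invisible(TRUE)
}

mylist <- list(
  disp = c("sum", "mean"),
  hp = "sd",
  drat = c("sum", "mean"),
  wt = "max")

# Passes silently for the demonstration data; stops with an error otherwise
check_spec(mylist, mtcars)
```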
    
