Succinct way to summarize different columns with different functions

既然无缘 2021-01-13 15:21

My question builds on a similar one by imposing an additional constraint that the name of each variable should appear only once.

Consider a data frame



        
4 answers
  •  醉梦人生
    2021-01-13 16:03

    Use .[[i]] to refer to the values of the ith column and !!names(.)[i] := to reuse its name on the left-hand side, so that no column name has to be typed at all.

    library(tibble)
    library(dplyr)
    library(rlang)
    
    df %>% summarize(!!names(.)[1] := mean(.[[1]]), !!names(.)[2] := sum(.[[2]])) 
    

    giving:

    # A tibble: 1 x 2
      potentially_long_name_i_dont_want_to_type_twice another_annoyingly_long_name
                                                <dbl>                        <int>
    1                                             5.5                          255
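
The data frame from the question is not reproduced on this page, so the snippet above cannot be run as shown. The following is a minimal self-contained sketch of the same positional technique. The column values (1:10 and 21:30) are an assumption chosen to be consistent with the printed result, not data taken from the question.

```r
library(tibble)
library(dplyr)
library(rlang)

# Assumed data: the question's data frame is not shown here, so these
# values are illustrative only.
df <- tibble(
  potentially_long_name_i_dont_want_to_type_twice = 1:10,
  another_annoyingly_long_name = 21:30
)

# `.` is the piped data frame: .[[i]] extracts the ith column by position,
# and !!names(.)[i] := unquotes the ith column name as the output name.
res <- df %>%
  summarize(
    !!names(.)[1] := mean(.[[1]]),
    !!names(.)[2] := sum(.[[2]])
  )

print(res)
```

Because the columns are addressed by position, the code breaks silently if the column order changes, which is the price paid for never typing a name.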
    

    Update

    If df were grouped (it is not in the question, so this is not needed there), wrap summarize in do() like this:

    library(dplyr)
    library(rlang)
    library(tibble)
    
    df2 <- tibble(a = 1:10, b = 11:20, g = rep(1:2, each = 5))
    
    df2 %>%
      group_by(g) %>%
      do(summarize(., !!names(.)[1] := mean(.[[1]]), !!names(.)[2] := sum(.[[2]]))) %>%
      ungroup()
    

    giving:

    # A tibble: 2 x 3
          g     a     b
      <int> <dbl> <int>
    1     1     3    65
    2     2     8    90
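
As a possible alternative to the positional approach, and not the method this answer uses, across() can apply a different function to each column while each name is still written only once. This sketch assumes dplyr 1.0.0 or later, where across() and the .groups argument are available, and it reuses the df2 example from above.

```r
library(dplyr)
library(tibble)

# Same example data as in the grouped case above.
df2 <- tibble(a = 1:10, b = 11:20, g = rep(1:2, each = 5))

# Each across() call names one column and the function to apply to it;
# the output column keeps the input name, so no name is repeated.
res2 <- df2 %>%
  group_by(g) %>%
  summarize(across(a, mean), across(b, sum), .groups = "drop")

print(res2)
```

This variant respects grouping without do(), at the cost of naming the columns explicitly rather than by position.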
    
