Succinct way to summarize different columns with different functions

既然无缘 2021-01-13 15:21

My question builds on a similar one by imposing an additional constraint that the name of each variable should appear only once.

Consider a data frame

4 Answers

醉梦人生 (OP)

2021-01-13 16:03

Use .[[i]] to refer to the ith column by position and !!names(.)[i] := to reuse its existing name on the left-hand side, so no column name has to be typed out.

library(tibble)
library(dplyr)
library(rlang)

df %>% summarize(!!names(.)[1] := mean(.[[1]]), !!names(.)[2] := sum(.[[2]]))

giving:

# A tibble: 1 x 2
  potentially_long_name_i_dont_want_to_type_twice another_annoyingly_long_name
                                            <dbl>                        <int>
1                                             5.5                          255
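The question's data frame is not shown above, so here is a minimal self-contained sketch of the same call. The values 1:10 and 21:30 are an assumption, chosen only because they reproduce the printed mean of 5.5 and sum of 255.

```r
library(dplyr)
library(tibble)

# Assumed input: hypothetical values picked to match the printed result
# (mean 5.5, sum 255); the question's actual data frame is not shown here.
df <- tibble(
  potentially_long_name_i_dont_want_to_type_twice = 1:10,
  another_annoyingly_long_name = 21:30
)

# .[[i]] picks the i-th column by position; !!names(.)[i] := reuses its name.
res <- df %>%
  summarize(!!names(.)[1] := mean(.[[1]]),
            !!names(.)[2] := sum(.[[2]]))

# The summary keeps the original column names without retyping them.
identical(names(res), names(df))
```

Because the columns are addressed by position, reordering the columns of df changes which function is applied to which column.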

Update

If df were grouped (it is not in the question, so this is not needed), wrap summarize in do() like this:

library(dplyr)
library(rlang)
library(tibble)

df2 <- tibble(a = 1:10, b = 11:20, g = rep(1:2, each = 5))

df2 %>%
  group_by(g) %>%
  do(summarize(., !!names(.)[1] := mean(.[[1]]), !!names(.)[2] := sum(.[[2]]))) %>%
  ungroup()

giving:

# A tibble: 2 x 3
      g     a     b
  <int> <dbl> <int>
1     1     3    65
2     2     8    90
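As a possible alternative to the do() wrapper, which is marked as superseded in dplyr 1.0.0 and later, the grouped case can also be sketched with across(). This is a different technique from the positional one above: each column is named once, but it is named explicitly. The sketch assumes dplyr 1.0.0 or newer.

```r
library(dplyr)
library(tibble)

df2 <- tibble(a = 1:10, b = 11:20, g = rep(1:2, each = 5))

# across() keeps each column's own name in the output, so `a` and `b`
# each appear exactly once; .groups = "drop" returns an ungrouped tibble.
res2 <- df2 %>%
  group_by(g) %>%
  summarize(across(a, mean), across(b, sum), .groups = "drop")

res2
```

For df2 this should give the same per-group values as the do() version: a is 3 and 8, b is 65 and 90.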
