dplyr summarise when function return is vector-valued?

前端未结


The dplyr::summarize() function can apply arbitrary functions over the data, but it seems that the function must return a scalar value. I'm curious if there is a reas…


2 Answers

迷失自我

2021-02-12 09:53

You could try do(), which lets each group return a data frame rather than a single value:

library(dplyr)

df %>%
  group_by(group) %>%
  do(setNames(data.frame(t(f(.$x, .$y))), letters[1:2]))
#  group         a           b
#1     A 0.8983217 -0.04108092
#2     B 0.8945354  0.44905220
#3     C 1.2244023 -1.00715248

For comparison, summarise() with the functions f1 and f2 gives the same output:

df %>% 
  group_by(group) %>%
  summarise(a = f1(x,y), b = f2(x,y))
#  group         a           b
#1     A 0.8983217 -0.04108092
#2     B 0.8945354  0.44905220
#3     C 1.2244023 -1.00715248
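
In dplyr 1.0.0 and later, do() is superseded, and summarise() accepts an unnamed data frame result, which it unpacks into several columns. The following is only a minimal sketch of that approach, not the original poster's code: the question's df and f are not shown here, so the data and the two-element function below are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical data standing in for the question's df (assumption)
set.seed(1)
df <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  x = rnorm(30),
  y = rnorm(30)
)

# Hypothetical vector-valued function: returns a length-2 numeric vector
f <- function(x, y) c(mean(x), mean(y))

# An unnamed one-row data frame inside summarise() becomes one column
# per element of the vector (requires dplyr >= 1.0.0)
res <- df %>%
  group_by(group) %>%
  summarise(data.frame(t(setNames(f(x, y), c("a", "b")))))

print(res)
```

The result has one row per group and the columns group, a and b, matching the shape of the do() output above. Naming the vector before transposing it is what gives the columns their names.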

Update

If you are using data.table, you can get a similar result with:

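library(data.table)
# as.list() turns the vector returned by f into one column per element;
# setnames() then renames columns 2 and 3, and the trailing [] prints the result
setnames(setDT(df)[, as.list(f(x,y)) , group], 2:3, c('a', 'b'))[]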


执念已碎

2021-02-12 10:03

This is why I still love plyr::ddply():

library(plyr)
f <- function(z) setNames(coef(lm(x ~ y, z)), c("a", "b"))
ddply(df, ~ group, f)
#   group           a          b
# 1     A   0.5213133 0.04624656
# 2     B   0.3020656 0.01450137
# 3     C   0.2189537 0.22998823
