dplyr summarise when function return is vector-valued?

误落风尘 2021-02-12 09:27

The dplyr::summarize() function can apply arbitrary functions over the data, but it seems that the function must return a scalar value. I'm curious if there is a reas

2 answers
  • 2021-02-12 09:53

    You could try do():

    library(dplyr)
    df %>%
      group_by(group) %>%
      do(setNames(data.frame(t(f(.$x, .$y))), letters[1:2]))
    #  group         a           b
    #1     A 0.8983217 -0.04108092
    #2     B 0.8945354  0.44905220
    #3     C 1.2244023 -1.00715248
    

    The output based on f1 and f2 is:

    df %>% 
      group_by(group) %>%
      summarise(a = f1(x,y), b = f2(x,y))
    #  group         a           b
    #1     A 0.8983217 -0.04108092
    #2     B 0.8945354  0.44905220
    #3     C 1.2244023 -1.00715248
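
    Since df, f, f1 and f2 are not shown here, the following is only a minimal sketch under assumed definitions: the data frame and all three functions are hypothetical stand-ins, chosen so that f returns a length-2 vector and f1, f2 are its scalar components. It illustrates that the do() route and the summarise() route should agree column by column.

```r
library(dplyr)

# Hypothetical data; the original df is not shown in the thread.
set.seed(1)
df <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  x = rnorm(30),
  y = rnorm(30)
)

# Hypothetical functions: f is vector-valued, f1 and f2 are its scalar parts.
f  <- function(x, y) c(mean(x), cor(x, y))
f1 <- function(x, y) f(x, y)[1]
f2 <- function(x, y) f(x, y)[2]

# Vector-valued route: build a one-row data frame per group inside do().
res_do <- df %>%
  group_by(group) %>%
  do(setNames(data.frame(t(f(.$x, .$y))), letters[1:2])) %>%
  ungroup()

# Scalar route: one summarise() column per component.
res_sum <- df %>%
  group_by(group) %>%
  summarise(a = f1(x, y), b = f2(x, y))

# Compare the two routes column by column.
all.equal(res_do$a, res_sum$a)
all.equal(res_do$b, res_sum$b)
```

    Note that do() is marked as superseded in dplyr 1.0.0 and later; it still works, but the scalar route calls f twice per group, which is the cost the do() approach avoids.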
    

    Update

    If you are using data.table, a similar result can be obtained with:

    library(data.table)
    # as.list() turns the vector returned by f into one column per element
    setnames(setDT(df)[, as.list(f(x, y)), group], 2:3, c('a', 'b'))[]
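
    The same idea as a self-contained sketch, again with a hypothetical df and f because the originals are not shown. It uses as.data.table() instead of setDT() so that df itself is not converted by reference, which is a design choice of this sketch rather than part of the answer.

```r
library(data.table)

# Hypothetical data and function; the originals are not shown in the thread.
set.seed(1)
df <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  x = rnorm(30),
  y = rnorm(30)
)
f <- function(x, y) c(mean(x), cor(x, y))

# Work on a copy so the original data.frame is left unchanged.
dt <- as.data.table(df)

# as.list() spreads the vector returned by f across columns (V1, V2 by default).
res <- dt[, as.list(f(x, y)), by = group]

# Rename the auto-generated columns by reference.
setnames(res, c("V1", "V2"), c("a", "b"))
res
```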
    
  • 2021-02-12 10:03

    This is why I still love plyr::ddply():

    library(plyr)
    f <- function(z) setNames(coef(lm(x ~ y, z)), c("a", "b"))
    ddply(df, ~ group, f)
    #   group           a          b
    # 1     A   0.5213133 0.04624656
    # 2     B   0.3020656 0.01450137
    # 3     C   0.2189537 0.22998823
    