Apply several summary functions on several variables by group in one call

一个人的身影 2020-11-22 00:03

I have the following data frame:

x <- read.table(text = "  id1 id2 val1 val2
1   a   x    1    9
2   a   x    2    4
3   a   y    3    5
4   a   y    4            


        
7 Answers
  •  说谎 (original poster)
    2020-11-22 00:30

    You can do it all in one step and get proper labeling:

    > aggregate(. ~ id1+id2, data = x, FUN = function(x) c(mn = mean(x), n = length(x) ) )
    #   id1 id2 val1.mn val1.n val2.mn val2.n
    # 1   a   x     1.5    2.0     6.5    2.0
    # 2   b   x     2.0    2.0     8.0    2.0
    # 3   a   y     3.5    2.0     7.0    2.0
    # 4   b   y     3.0    2.0     6.0    2.0
    

    Although it prints as six columns, this actually creates a data frame with two id columns and two matrix columns:

    str( aggregate(. ~ id1+id2, data = x, FUN = function(x) c(mn = mean(x), n = length(x) ) ) )
    'data.frame':   4 obs. of  4 variables:
     $ id1 : Factor w/ 2 levels "a","b": 1 2 1 2
     $ id2 : Factor w/ 2 levels "x","y": 1 1 2 2
     $ val1: num [1:4, 1:2] 1.5 2 3.5 3 2 2 2 2
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : NULL
      .. ..$ : chr  "mn" "n"
     $ val2: num [1:4, 1:2] 6.5 8 7 6 2 2 2 2
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : NULL
      .. ..$ : chr  "mn" "n"
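
Because the summaries sit inside matrix columns, they are reached by indexing the matrix rather than by a name such as val1.mn. A minimal sketch follows. The question's table is cut off in this thread, so the eight rows below are assumed values, chosen only because they reproduce the group means printed above.

```r
# Assumed sample data: the original table is truncated, so these rows are
# illustrative values that reproduce the group means shown in the answer.
x <- data.frame(
  id1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  id2  = c("x", "x", "y", "y", "x", "y", "x", "y"),
  val1 = c(1, 2, 3, 4, 1, 4, 3, 2),
  val2 = c(9, 4, 5, 9, 7, 4, 9, 8)
)

res <- aggregate(. ~ id1 + id2, data = x,
                 FUN = function(v) c(mn = mean(v), n = length(v)))

ncol(res)            # counts the matrix columns once each
is.matrix(res$val1)  # val1 holds both summaries as a matrix
res$val1[, "mn"]     # group means of val1
res$val2[, "n"]      # group sizes for val2
```

Something like res$val1.mn would return NULL here, since no column of that name exists until the result is flattened.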
    

    As pointed out by @lord.garbage below, this can be converted to a data frame with "simple" columns by using do.call(data.frame, ...):

    str( do.call(data.frame, aggregate(. ~ id1+id2, data = x, FUN = function(x) c(mn = mean(x), n = length(x) ) ) ) )
    'data.frame':   4 obs. of  6 variables:
     $ id1    : Factor w/ 2 levels "a","b": 1 2 1 2
     $ id2    : Factor w/ 2 levels "x","y": 1 1 2 2
     $ val1.mn: num  1.5 2 3.5 3
     $ val1.n : num  2 2 2 2
     $ val2.mn: num  6.5 8 7 6
     $ val2.n : num  2 2 2 2
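
The flattening works because data.frame() splits any matrix argument into one column per matrix column and builds the names from the argument name plus the matrix column name. A short sketch of using the flattened result, again with assumed sample rows (the question's data is truncated in this thread):

```r
# Assumed sample data, chosen to reproduce the group means in the answer.
x <- data.frame(
  id1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  id2  = c("x", "x", "y", "y", "x", "y", "x", "y"),
  val1 = c(1, 2, 3, 4, 1, 4, 3, 2),
  val2 = c(9, 4, 5, 9, 7, 4, 9, 8)
)

res  <- aggregate(. ~ id1 + id2, data = x,
                  FUN = function(v) c(mn = mean(v), n = length(v)))

# Pass the list of columns to data.frame(); matrix columns are expanded.
flat <- do.call(data.frame, res)

names(flat)                  # one plain column per summary
big  <- flat[flat$val1.mn > 2, ]  # ordinary column access now works
big
```

After this step the summaries behave like any other numeric column, which is usually what downstream code such as merge() or plotting functions expects.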
    

    To name the summarised variables explicitly instead of using the dot, wrap them in cbind() on the LHS of the formula:

    aggregate(cbind(val1, val2) ~ id1 + id2, data = x, FUN = function(x) c(mn = mean(x), n = length(x) ) )
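
The explicit cbind() form matters mostly when the data frame has columns that should not be summarised. The sketch below adds a hypothetical extra column, val3, to assumed sample rows (the question's data is truncated in this thread) and compares the two formulas.

```r
# Assumed sample data; val3 is a hypothetical extra column for illustration.
x <- data.frame(
  id1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  id2  = c("x", "x", "y", "y", "x", "y", "x", "y"),
  val1 = c(1, 2, 3, 4, 1, 4, 3, 2),
  val2 = c(9, 4, 5, 9, 7, 4, 9, 8),
  val3 = c(10, 20, 30, 40, 10, 40, 30, 20)
)

stats <- function(v) c(mn = mean(v), n = length(v))

# The dot summarises every column not used for grouping, including val3.
by_dot   <- aggregate(. ~ id1 + id2, data = x, FUN = stats)

# cbind() restricts the summary to the listed variables.
by_cbind <- aggregate(cbind(val1, val2) ~ id1 + id2, data = x, FUN = stats)

names(by_dot)
names(by_cbind)
all.equal(by_dot$val1, by_cbind$val1)  # shared variables should agree
```

For the variables both calls share, the two forms are expected to give the same summaries; only the set of summarised columns differs.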
    
