compute means of a group by factor

太阳男子 2021-01-02 06:49

Is there a way that this can be improved, or done more simply?

means.by<-function(data,INDEX){
  b<-by(data,INDEX,function(d)apply(d,2,mean))
  return(         


        
4 answers
  • 2021-01-02 07:24

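    You want tapply or ave, depending on the output you need: ave returns a vector as long as the input, with each element replaced by its group mean, while tapply returns one mean per group: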

    > Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20))
    > ave(Data$x, Data$grp)
     [1] -0.3258590 -0.5009832 -0.5009832 -0.2136670 -0.3258590 -0.5009832
     [7] -0.3258590 -0.2136670 -0.3258590 -0.2136670 -0.3258590 -0.3258590
    [13] -0.3258590 -0.5009832 -0.2136670 -0.5009832 -0.3258590 -0.2136670
    [19] -0.5009832 -0.2136670
    > tapply(Data$x, Data$grp, mean)
             a          b          c 
    -0.5009832 -0.2136670 -0.3258590 
    
    # Example with more than one column (colMeans is used because
    # mean() on a data frame returns NA in R >= 3.0.0):
    > Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20),y=runif(20))
    > do.call(rbind,lapply(split(Data[,-1], Data[,1]), colMeans))
                 x         y
    a -0.675195494 0.4772696
    b  0.270891403 0.5091359
    c  0.002756666 0.4053922
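
    Wrapped into a helper with the same signature as the question's means.by, the split/colMeans idea could look like the sketch below. It assumes every column of data is numeric; the example data are made up for illustration.

    ```r
    # Sketch of a base-R helper: one row of column means per level of INDEX.
    # Assumes all columns of `data` are numeric.
    means.by <- function(data, INDEX) {
      do.call(rbind, lapply(split(data, INDEX), colMeans))
    }

    # Made-up example data: a grouping column and two numeric columns
    set.seed(1)
    Data <- data.frame(grp = rep(letters[1:3], each = 4),
                       x = rnorm(12),
                       y = runif(12))

    # Drop the grouping column before averaging
    m <- means.by(Data[, -1], Data$grp)
    print(m)
    ```

    The result is a numeric matrix with one row per group and one column per variable.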
    
  • 2021-01-02 07:25

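    Using only base R functions: regress each variable on the factor without an intercept, and the fitted coefficients are the group means.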

    > d=data.frame(type=as.factor(rep(c("A","B","C"),each=3)),
    + x=rnorm(9),y=rgamma(9,2,1))
    > d
      type           x         y
    1    A -1.18077326 3.1428680
    2    A -0.91930418 4.4606603
    3    A  0.88345422 1.0979301
    4    B  0.06964133 1.1429911
    5    B -1.15380345 2.7609049
    6    B  1.13637202 0.6668986
    7    C -1.12052765 1.7352306
    8    C -1.34803630 2.3099202
    9    C -2.23135374 0.7244689
    >
    > cbind(lm(x~-1+type,data=d)$coef,lm(y~-1+type,data=d)$coef)
                [,1]     [,2]
    typeA -0.4055411 2.900486
    typeB  0.0174033 1.523598
    typeC -1.5666392 1.589873
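
    Why this works: with the intercept removed, each coefficient is the least-squares fit for one factor level alone, which is that level's mean. A quick check of this against tapply, on made-up data (the seed and variable names are only for illustration):

    ```r
    # Made-up data: one factor with three levels and one numeric column
    set.seed(42)
    d <- data.frame(type = factor(rep(c("A", "B", "C"), each = 3)),
                    x = rnorm(9))

    # Coefficients of the no-intercept model, one per factor level
    coefs <- coef(lm(x ~ -1 + type, data = d))

    # Group means computed directly
    means <- tapply(d$x, d$type, mean)

    # Compare the two, ignoring names and allowing for floating-point error
    agree <- isTRUE(all.equal(unname(coefs), as.vector(means)))
    print(agree)
    ```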
    
  • 2021-01-02 07:33

    Does the aggregate function do what you want?

    If not, look at the plyr package; it gives several options for taking data apart, doing computations on the pieces, and putting the results back together.

    You may also be able to do this using the reshape package.
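
    A minimal sketch of the aggregate route, assuming a data frame with a grouping column grp and numeric columns x and y (these names and the data are invented for the example):

    ```r
    # Made-up example data: a grouping column and two numeric columns
    set.seed(1)
    Data <- data.frame(grp = rep(letters[1:3], each = 4),
                       x = rnorm(12),
                       y = runif(12))

    # Mean of every other column, computed separately for each value of grp
    res <- aggregate(. ~ grp, data = Data, FUN = mean)
    print(res)
    ```

    Unlike tapply, aggregate returns a data frame, with the grouping column kept alongside the per-group means.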

  • 2021-01-02 07:34

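    With plyr (here x is your data frame, id the grouping column, and var the column to average):

    library(plyr)
    # Split x by id, compute the mean of var in each piece, and recombine
    df <- ddply(x, .(id), function(x) data.frame(
      mean=mean(x$var)
    ))
    print(df)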
    

    Update:

    data<-data.frame(I=as.factor(rep(letters[1:10],each=3)),x=rnorm(30),y=rbinom(30,5,.5))
    ddply(data,.(I), function(x) data.frame(x=mean(x$x), y=mean(x$y)))
    

    See, plyr is smart :)

    Update 2:

    In response to your comment, I believe cast and melt from the reshape package are much simpler for your purpose.

    library(reshape)
    cast(melt(data),I ~ variable, mean)
    