Is there a way that this can be improved, or done more simply?
means.by<-function(data,INDEX){
b<-by(data,INDEX,function(d)apply(d,2,mean))
return(
You want tapply or ave, depending on how you want your output:
> Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20))
> ave(Data$x, Data$grp)
 [1] -0.3258590 -0.5009832 -0.5009832 -0.2136670 -0.3258590 -0.5009832
 [7] -0.3258590 -0.2136670 -0.3258590 -0.2136670 -0.3258590 -0.3258590
[13] -0.3258590 -0.5009832 -0.2136670 -0.5009832 -0.3258590 -0.2136670
[19] -0.5009832 -0.2136670
> tapply(Data$x, Data$grp, mean)
         a          b          c
-0.5009832 -0.2136670 -0.3258590
# Example with more than one column:
> Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20),y=runif(20))
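> # colMeans is used here because mean() does not average the columns of a data frame in current R
> do.call(rbind,lapply(split(Data[,-1], Data[,1]), colMeans))
             x         y
a -0.675195494 0.4772696
b  0.270891403 0.5091359
c  0.002756666 0.4053922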
You can do this with base R alone, using lm with no intercept:
> d=data.frame(type=as.factor(rep(c("A","B","C"),each=3)),
+              x=rnorm(9),y=rgamma(9,2,1))
> d
  type           x         y
1    A -1.18077326 3.1428680
2    A -0.91930418 4.4606603
3    A  0.88345422 1.0979301
4    B  0.06964133 1.1429911
5    B -1.15380345 2.7609049
6    B  1.13637202 0.6668986
7    C -1.12052765 1.7352306
8    C -1.34803630 2.3099202
9    C -2.23135374 0.7244689
> cbind(lm(x~-1+type,data=d)$coef,lm(y~-1+type,data=d)$coef)
            [,1]     [,2]
typeA -0.4055411 2.900486
typeB  0.0174033 1.523598
typeC -1.5666392 1.589873
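This works because a model with no intercept and a single factor predictor estimates one coefficient per level, and the least-squares estimate for each level is that group's mean. A small check of that claim, as a sketch only: the data and the seed below are made up for illustration.

```r
set.seed(42)  # arbitrary seed (an assumption) so the check is reproducible
d <- data.frame(type = factor(rep(c("A", "B", "C"), each = 3)),
                x = rnorm(9), y = rgamma(9, 2, 1))

# Coefficients of the no-intercept model: one per level of type
lm_means <- coef(lm(x ~ -1 + type, data = d))

# Plain group means for comparison
grp_means <- tapply(d$x, d$type, mean)

# Compare the values only, since the names differ ("typeA" vs "A")
same <- isTRUE(all.equal(unname(lm_means), as.vector(grp_means)))
print(same)
```

For a plain table of means, tapply or aggregate is probably the more direct route; the lm form is mainly handy if you are going to fit the model anyway.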
Does the aggregate function do what you want?
If not, look at the plyr package, it gives several options for taking things apart, doing computations on the pieces, then putting it back together again.
You may also be able to do this using the reshape package.
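For the aggregate suggestion, a minimal sketch might look like the following. The data frame and its column names (grp, x, y) are assumptions chosen to mirror the examples in the other answers, not something taken from the question.

```r
set.seed(1)  # arbitrary seed (an assumption) for reproducibility
Data <- data.frame(grp = sample(letters[1:3], 20, TRUE),
                   x = rnorm(20), y = runif(20))

# Formula interface: mean of x and y within each level of grp
means <- aggregate(cbind(x, y) ~ grp, data = Data, FUN = mean)
print(means)

# by= interface: average every column except the grouping column
means2 <- aggregate(Data[, -1], by = list(grp = Data$grp), FUN = mean)
print(means2)
```

Both calls return a data frame with one row per group, which is usually easier to work with downstream than the list-like object returned by by().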
With plyr:
library(plyr)
# here x stands for your data frame, id for the grouping column, var for the column to average
df <- ddply(x, .(id), function(x) data.frame(
  mean=mean(x$var)
))
print(df)
Update:
data<-data.frame(I=as.factor(rep(letters[1:10],each=3)),x=rnorm(30),y=rbinom(30,5,.5))
ddply(data,.(I), function(x) data.frame(x=mean(x$x), y=mean(x$y)))
See, plyr is smart :)
Update 2:
In response to your comment, I believe cast and melt from the reshape package are much simpler for your purpose.
library(reshape)
cast(melt(data),I ~ variable, mean)