I can achieve this task, but I feel like there must be a \"best\" (slickest, most compact, clearest-code, fastest?) way of doing it and have not figured it out so far ...
(I voted for Joshua's.) Here's an Hmisc::summary.formula solution. The advantage of this for me is that it is well integrated with the Hmisc::latex output "channel".
summary(y ~ interaction(f3,f2,f1), data=d, method="response",
fun=function(y) c(mean.y=mean(y) ,var.y=var(y) ))
#-----output----------
y N=108
+-----------------------+-------+---+---------+-----------+
| | |N |mean.y |var.y |
+-----------------------+-------+---+---------+-----------+
|interaction(f3, f2, f1)|I.a.A | 4|0.6502307|0.095379578|
| |II.a.A | 4|0.4876630|0.110796695|
snipped output to show the latex -> PDF -> png output:
I'm slightly addicted to speed comparisons even though they're largely irrelevant for me in this situation ...
joran_ddply <- function(d) ddply(d,.(f1,f2,f3),
summarise,y.mean = mean(y),y.var = var(y))
joshulrich_aggregate <- function(d) {
aggregate(d$y, d[,c("f1","f2","f3")],
FUN=function(x) c(mean=mean(x),var=var(x)))
}
formula_aggregate <- function(d) {
aggregate(y~f1*f2*f3,data=d,
FUN=function(x) c(mean=mean(x),var=var(x)))
}
library(data.table)
d2 <- data.table(d)
ramnath_datatable <- function(d) {
d[,list(avg_y = mean(y), var_y = var(y)), 'f1, f2, f3']
}
library(Hmisc)
dwin_hmisc <- function(d) {summary(y ~ interaction(f3,f2,f1),
data=d, method="response",
fun=function(y) c(mean.y=mean(y) ,var.y=var(y) ))
}
library(rbenchmark)
benchmark(joran_ddply(d),
joshulrich_aggregate(d),
ramnath_datatable(d2),
formula_aggregate(d),
dwin_hmisc(d))
aggregate
is fastest (even faster than data.table
, which is a surprise to me, although things might be different with a bigger table to aggregate), even using the formula interface ...)
test replications elapsed relative user.self sys.self
5 dwin_hmisc(d) 100 1.235 2.125645 1.168 0.044
4 formula_aggregate(d) 100 0.703 1.209983 0.656 0.036
1 joran_ddply(d) 100 3.345 5.757315 3.152 0.144
2 joshulrich_aggregate(d) 100 0.581 1.000000 0.596 0.000
3 ramnath_datatable(d2) 100 0.750 1.290878 0.708 0.000
(Now I just need Dirk to step up and post an Rcpp
solution that is 1000 times faster than anything else ...)