I can achieve this task, but I feel like there must be a \"best\" (slickest, most compact, clearest-code, fastest?) way of doing it and have not figured it out so far ...
I find the doBy package has some very convenient functions for things like this. For example, the function ?summaryBy is quite handy. Consider:
> summaryBy(y~f1+f2+f3, data=d, FUN=c(mean, var))
f1 f2 f3 y.mean y.var
1 A a I 0.6502307 0.095379578
2 A a II 0.4876630 0.110796695
3 A a III 0.3102926 0.202805677
4 A b I 0.3914084 0.058693103
5 A b II 0.5257355 0.218631264
6 A b III 0.3356860 0.079433136
7 A c I 0.3367841 0.079487973
8 A c II 0.6273320 0.041373836
9 A c III 0.4532720 0.022779672
10 B a I 0.6688221 0.044184575
11 B a II 0.5514724 0.020359289
12 B a III 0.6389354 0.104056229
13 B b I 0.5052346 0.138379070
14 B b II 0.3933283 0.050261804
15 B b III 0.5953874 0.161943989
16 B c I 0.3490460 0.079286849
17 B c II 0.5534569 0.207381592
18 B c III 0.4652424 0.187463143
19 C a I 0.3340988 0.004994589
20 C a II 0.3970315 0.126967554
21 C a III 0.3580250 0.066769484
22 C b I 0.7676858 0.124945402
23 C b II 0.3613772 0.182689385
24 C b III 0.4175562 0.095933470
25 C c I 0.3592491 0.039832864
26 C c II 0.7882591 0.084271963
27 C c III 0.3936949 0.085758343
So the function call is simple, easy to use, and I would say, elegant.
Now, if your primary concern is speed, it seems that it would be reasonable--at least with smaller sized tasks (note that I couldn't get the ramnath_datatable
function to work for whatever reason):
test replications elapsed relative user.self
4 dwin_hmisc(d) 100 0.50 2.778 0.50
3 formula_aggregate(d) 100 0.23 1.278 0.24
5 gung_summaryBy(d) 100 0.34 1.889 0.35
1 joran_ddply(d) 100 1.34 7.444 1.32
2 joshulrich_aggregate(d) 100 0.18 1.000 0.19