Quick/elegant way to construct a mean/variance summary table

甜味超标 2020-12-13 19:45

I can achieve this task, but I feel like there must be a "best" (slickest, most compact, clearest-code, fastest?) way of doing it and have not figured it out so far ...

8 answers
  •  囚心锁ツ
    2020-12-13 20:22

    I find the doBy package has some very convenient functions for things like this. For example, the function ?summaryBy is quite handy. Consider:

    > library(doBy)  # provides summaryBy()
    > summaryBy(y~f1+f2+f3, data=d, FUN=c(mean, var))
       f1 f2  f3    y.mean       y.var
    1   A  a   I 0.6502307 0.095379578
    2   A  a  II 0.4876630 0.110796695
    3   A  a III 0.3102926 0.202805677
    4   A  b   I 0.3914084 0.058693103
    5   A  b  II 0.5257355 0.218631264
    6   A  b III 0.3356860 0.079433136
    7   A  c   I 0.3367841 0.079487973
    8   A  c  II 0.6273320 0.041373836
    9   A  c III 0.4532720 0.022779672
    10  B  a   I 0.6688221 0.044184575
    11  B  a  II 0.5514724 0.020359289
    12  B  a III 0.6389354 0.104056229
    13  B  b   I 0.5052346 0.138379070
    14  B  b  II 0.3933283 0.050261804
    15  B  b III 0.5953874 0.161943989
    16  B  c   I 0.3490460 0.079286849
    17  B  c  II 0.5534569 0.207381592
    18  B  c III 0.4652424 0.187463143
    19  C  a   I 0.3340988 0.004994589
    20  C  a  II 0.3970315 0.126967554
    21  C  a III 0.3580250 0.066769484
    22  C  b   I 0.7676858 0.124945402
    23  C  b  II 0.3613772 0.182689385
    24  C  b III 0.4175562 0.095933470
    25  C  c   I 0.3592491 0.039832864
    26  C  c  II 0.7882591 0.084271963
    27  C  c III 0.3936949 0.085758343
    

    So the function call is simple, easy to use, and I would say, elegant.
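
    For comparison, here is a minimal base-R sketch of the same kind of table, for readers who do not have doBy installed. The data frame `d` below is a hypothetical stand-in (three factors with three levels each and a random `y`), not the asker's data, so the numbers will not match the output above.

```r
# Hypothetical stand-in for the data frame `d` (assumption: three
# 3-level factors f1, f2, f3 and a numeric response y).
set.seed(1)
d <- expand.grid(f1 = c("A", "B", "C"),
                 f2 = c("a", "b", "c"),
                 f3 = c("I", "II", "III"),
                 rep = 1:5)
d$y <- runif(nrow(d))

# One row per f1/f2/f3 cell, with the mean and variance of y.
res <- aggregate(y ~ f1 + f2 + f3, data = d,
                 FUN = function(x) c(mean = mean(x), var = var(x)))

# aggregate() returns the two statistics as a single matrix column;
# flatten it into ordinary columns named y.mean and y.var.
res <- do.call(data.frame, res)
head(res)
```

    The flattening step is what gives the same `y.mean` / `y.var` column names that summaryBy produces directly.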

    Now, if your primary concern is speed, summaryBy seems reasonable, at least for smaller tasks (note that I couldn't get the ramnath_datatable function to work, for whatever reason):

                         test replications elapsed relative user.self 
    4           dwin_hmisc(d)          100    0.50    2.778      0.50 
    3    formula_aggregate(d)          100    0.23    1.278      0.24 
    5       gung_summaryBy(d)          100    0.34    1.889      0.35 
    1          joran_ddply(d)          100    1.34    7.444      1.32 
    2 joshulrich_aggregate(d)          100    0.18    1.000      0.19 
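
    The column layout above (test, replications, elapsed, relative, user.self) looks like output from the rbenchmark package, though that is an assumption. A rough sketch of how such a comparison could be set up follows; `agg_formula` and `agg_by` are hypothetical wrappers written for illustration, not the functions timed above, and `d` is again made-up data.

```r
# Hypothetical data, same shape as assumed for the summary example.
set.seed(1)
d <- expand.grid(f1 = c("A", "B", "C"),
                 f2 = c("a", "b", "c"),
                 f3 = c("I", "II", "III"),
                 rep = 1:5)
d$y <- runif(nrow(d))

# Shared summary function: mean and variance of one group.
mean_var <- function(x) c(mean = mean(x), var = var(x))

# Two illustrative wrappers, each taking the data frame as its only argument.
agg_formula <- function(d) aggregate(y ~ f1 + f2 + f3, data = d, FUN = mean_var)
agg_by      <- function(d) aggregate(d["y"], by = d[c("f1", "f2", "f3")],
                                     FUN = mean_var)

# Time each wrapper 100 times, if rbenchmark is available.
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  timings <- rbenchmark::benchmark(
    agg_formula(d), agg_by(d),
    replications = 100,
    columns = c("test", "replications", "elapsed", "relative", "user.self"),
    order = "relative")
  print(timings)
}
```

    Timings at this scale vary between machines and runs, so the relative column is more informative than the absolute elapsed times.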
    
