For each group, summarise means for all variables in a data frame (ddply? split?)

难免孤独 2020-12-13 16:00

A week ago I would have done this manually: subset the data frame by group into new data frames, compute the mean of each variable in each one, then rbind the results. Very clunky...

6 answers
  •  囚心锁ツ
    2020-12-13 16:40

    EDIT: I wrote the following and then realized that Thierry had already written up almost EXACTLY the same answer. I somehow overlooked his answer. So if you like this answer, vote his up instead. I'm going ahead and posting since I spent the time typing it up.


    This sort of stuff consumes way more of my time than I wish it did! Here's a solution using the reshape package by Hadley Wickham. This example does not do exactly what you asked because the results are all in one big table, not a table for each group.

    The trouble you were having with the numeric values showing up as factors was because you were using cbind and everything was getting slammed into a matrix of type character. The cool thing is you don't need cbind with data.frame.
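
    To see that coercion for yourself, here is a minimal sketch. The vectors `x` and `g` are made up for illustration and are not from your data, and `stringsAsFactors = TRUE` is spelled out because newer versions of R no longer default to it.

    ```r
    # Hypothetical inputs, not from the original question
    x <- c(1.5, 2.5, 3.5)
    g <- c("a", "b", "a")

    # cbind() on plain vectors builds a matrix, and a matrix holds a single type,
    # so the numbers are coerced to character
    m <- cbind(x, g)
    typeof(m)

    # Turning that character matrix into a data frame can then yield factors
    f <- as.data.frame(m, stringsAsFactors = TRUE)
    class(f$x)

    # data.frame() on the vectors themselves keeps each column's own type
    d <- data.frame(x = x, g = g)
    class(d$x)
    ```

    The test data below is built the second way, straight from data.frame.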

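    # Build the data frame directly; each column keeps its own type
    test_data <- data.frame(
      var0 = rnorm(100),
      var1 = rnorm(100, 1),
      var2 = rnorm(100, 2),
      var3 = rnorm(100, 3),
      var4 = rnorm(100, 4),
      group = sample(letters[1:10], 100, replace = TRUE),
      year = sample(c(2007, 2009), 100, replace = TRUE))

    library(reshape)
    # Long format: one row per group, year, variable and value
    molten_data <- melt(test_data, id = c("group", "year"))
    # Mean of each variable per group, with one column per year
    cast(molten_data, group + variable ~ year, mean)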
    

    This gives output like the following (the exact values change from run to run because the data are random):

        group variable        2007         2009
    1      a     var0 -0.92040686 -0.154746420
    2      a     var1  1.06603832  0.559765035
    3      a     var2  2.34476321  2.206521587
    4      a     var3  3.01652065  3.256580166
    5      a     var4  3.75256699  3.907777127
    6      b     var0 -0.53207427 -0.149144766
    7      b     var1  0.75677714  0.879387608
    8      b     var2  2.41739521  1.224854891
    9      b     var3  2.63877431  2.436837719
    10     b     var4  3.69640598  4.439047363
    ...
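
    If you do want a separate table for each group, a minimal base R sketch could look like the following. It does not use reshape, the data is trimmed to two variables, and the seed and the names `by_group` and `means_per_group` are assumptions I made up for illustration.

    ```r
    set.seed(1)  # assumed seed so the sketch is repeatable
    test_data <- data.frame(
      var0 = rnorm(100),
      var1 = rnorm(100, 1),
      group = sample(letters[1:10], 100, replace = TRUE),
      year = sample(c(2007, 2009), 100, replace = TRUE))

    # One data frame per group
    by_group <- split(test_data, test_data$group)

    # For each group, average the variables within each year
    means_per_group <- lapply(by_group, function(d) {
      aggregate(d[c("var0", "var1")], by = list(year = d$year), FUN = mean)
    })

    means_per_group[["a"]]
    ```

    The result is a named list with one small table per group, so `means_per_group[["b"]]` pulls out the table for group b.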
    

    I wrote a blog post recently about doing something similar with plyr. I should do a part 2 about how to do the same thing using the reshape package. Both plyr and reshape were written by Hadley Wickham and are crazy useful tools.
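
    For reference, a minimal plyr sketch of the same per-group, per-year means might look like this. It is not the code from the blog post, just the usual ddply idiom, and the seed and the two-variable data are assumptions for illustration.

    ```r
    library(plyr)

    set.seed(42)  # assumed seed so the sketch is repeatable
    test_data <- data.frame(
      var0 = rnorm(100),
      var1 = rnorm(100, 1),
      group = sample(letters[1:10], 100, replace = TRUE),
      year = sample(c(2007, 2009), 100, replace = TRUE))

    # Split by group and year, then average every numeric column in each piece
    group_means <- ddply(test_data, .(group, year), numcolwise(mean))
    head(group_means)
    ```

    Unlike the cast call above, this keeps one column per variable and one row per combination of group and year.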
