How do I compute weighted average using summarise_each?

伪装坚强ぢ 2020-12-30 15:46

How can I compute the weighted average of all the fields in a dataset using summarise_each in dplyr? For example, let's say we want to group the mtcars dat

1 Answer
  • 2020-12-30 16:03

    To help see what's going on here, let's make a little function that returns the lengths of its arguments:

    lenxy <- function(x,y)
        paste0(length(x),'-',length(y))
    

    and then apply it in summarise_each, as in:

    mtcars %>% group_by(cyl) %>% summarise_each(funs(lenxy(., qsec)))
    
    #>   cyl   mpg  disp    hp  drat    wt  qsec   vs   am gear carb
    #> 1   4 11-11 11-11 11-11 11-11 11-11 11-11 11-1 11-1 11-1 11-1
    #> 2   6   7-7   7-7   7-7   7-7   7-7   7-7  7-1  7-1  7-1  7-1
    #> 3   8 14-14 14-14 14-14 14-14 14-14 14-14 14-1 14-1 14-1 14-1
    

    Looking at this table, you can see that the lengths of the first and second arguments are the same up to and including qsec; afterward, the second argument to lenxy has length 1. This happens because dplyr operates on the data in place, replacing each field with its summary as it goes, rather than creating a new data.frame. Once qsec has been summarised, every later column sees the length-one summary instead of the original values.
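
    The same effect can be reproduced with a plain summarise() call. The following is a minimal sketch, assuming a dplyr version in which later summary expressions see the columns created by earlier ones; the column name n_after is hypothetical and used only for illustration:

    ```r
    library(dplyr)

    # qsec is summarised first, so the second expression sees the
    # length-one group summary rather than the original column.
    res <- mtcars %>%
        group_by(cyl) %>%
        summarise(qsec = mean(qsec),
                  n_after = length(qsec))  # hypothetical name, for illustration
    res
    ```

    If the replacement happens as described, n_after is 1 in every group, mirroring the 11-1, 7-1 and 14-1 entries in the table above.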

    The solution is easy: exclude the weighting variable from the summary:

    mtcars %>% 
        group_by(cyl) %>% 
        summarise_each(funs(weighted.mean(., gear)),
                       -gear)
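
    summarise_each() and funs() have since been deprecated in dplyr. As a sketch of the same idea, assuming dplyr 1.0.0 or later, across() can take their place; gear is again excluded so that it is still the full-length column when it is used as the weights (res_w is a hypothetical name):

    ```r
    library(dplyr)

    # Weighted mean of every non-grouping column except gear,
    # using gear as the weights within each cyl group.
    res_w <- mtcars %>%
        group_by(cyl) %>%
        summarise(across(-gear, ~ weighted.mean(.x, gear)))
    res_w
    ```

    The result has one row per value of cyl and no gear column, since gear serves only as the weighting variable.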
    