Weighted mean with by function

情歌与酒 2021-01-21 21:45

I'm trying to get a weighted mean for a couple of categories. I want to use by(df$A, df$B, function(x) weighted.mean(x, df$C)). This doesn't work, of course. Is there a way to do this using

3 Answers
  • 2021-01-21 22:04

    You need to pass the weights along with the values to be averaged in by():

    by(df[c("A","C")], df$B, function(x) weighted.mean(x$A, x$C))
    # df$B: gb
    # [1] 4
    # ------------------------------------------------------------ 
    # df$B: hi
    # [1] 25.44444
    # ------------------------------------------------------------ 
    # df$B: yo
    # [1] 3
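
    As a minimal sketch of why the original call fails and this one works: inside by(df$A, df$B, ...), x holds only one group's values, while df$C still has one weight per row of the whole data frame, so weighted.mean() is given vectors of different lengths. The data frame below is hypothetical (made up for illustration, not the asker's data), and the names attempt, failed and res are just placeholders.

    ```r
    # Hypothetical data, for illustration only (not the data from the question)
    df <- data.frame(
      A = c(2, 4, 6, 10, 3),
      B = c("gb", "gb", "hi", "hi", "yo"),
      C = c(1, 1, 1, 3, 2)
    )

    # Original attempt: x is one group's A values, but df$C holds all 5 weights,
    # so the lengths differ and weighted.mean() stops with an error.
    attempt <- try(
      by(df$A, df$B, function(x) weighted.mean(x, df$C)),
      silent = TRUE
    )
    failed <- inherits(attempt, "try-error")

    # Passing both columns means each group's weights travel with its values.
    res <- by(df[c("A", "C")], df$B, function(x) weighted.mean(x$A, x$C))
    as.numeric(res)  # 3 9 3 (groups gb, hi, yo)
    ```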
    
  • 2021-01-21 22:06

    Here's a simple and efficient solution using data.table

    library(data.table)
    setDT(df)[, .(WM = weighted.mean(A, C)), B]
    #     B       WM
    # 1: hi 25.44444
    # 2: gb  4.00000
    # 3: yo  3.00000
    

    Or use a split-and-apply combination from base R:

    sapply(split(df, df$B), function(x) weighted.mean(x$A, x$C))
    #      gb       hi       yo 
    # 4.00000 25.44444  3.00000 
    

    Or with dplyr:

    library(dplyr)
    df %>%
      group_by(B) %>%
      summarise(WM = weighted.mean(A, C))
    # Source: local data frame [3 x 2]
    # 
    # B       WM
    # 1 gb  4.00000
    # 2 hi 25.44444
    # 3 yo  3.00000
    
  • 2021-01-21 22:20

    Or simply recreate the calculation used by weighted.mean():

    by(df, df$B, function(d) with(d, sum(A * C) / sum(C)))
    # df$B: gb
    # [1] 4
    # ------------------------------------------------------------ 
    # df$B: hi
    # [1] 25.44444
    # ------------------------------------------------------------ 
    # df$B: yo
    # [1] 3
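
    A quick way to convince yourself that the hand-written formula matches weighted.mean() is to compute both per group and compare. This is only a sketch on hypothetical data (the values and the names manual, builtin and agree are made up for illustration), and it assumes there are no missing values; with NAs the two would need the same handling to agree.

    ```r
    # Hypothetical data, for illustration only (not the data from the question)
    df <- data.frame(
      A = c(2, 4, 6, 10, 3),
      B = c("gb", "gb", "hi", "hi", "yo"),
      C = c(1, 1, 1, 3, 2)
    )

    # Per-group weighted mean written out by hand: sum(value * weight) / sum(weight)
    manual <- sapply(split(df, df$B), function(d) sum(d$A * d$C) / sum(d$C))

    # The same quantity via the built-in function
    builtin <- sapply(split(df, df$B), function(d) weighted.mean(d$A, d$C))

    # all.equal() compares with a numeric tolerance rather than exact equality
    agree <- isTRUE(all.equal(manual, builtin))
    ```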
    