I'm trying to get a weighted mean for a couple of categories. I want to use
by(df$A, df$B, function(x) weighted.mean(x, df$C))
This doesn't work, of course. Is there a way to do this?
You need to pass the weights along with the values to be averaged in by():
by(df[c("A","C")], df$B, function(x) weighted.mean(x$A, x$C))
# df$B: gb
# [1] 4
# ------------------------------------------------------------
# df$B: hi
# [1] 25.44444
# ------------------------------------------------------------
# df$B: yo
# [1] 3
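To see why passing both columns matters, here is a minimal sketch on made-up data (the data frame `toy` and its values are hypothetical, since the question's actual df isn't shown). In the original attempt the values are split by group but df$C is still the full-length column, so the weights no longer line up with the values. Handing by() a data frame with both columns keeps them together within each group:

```r
# Hypothetical toy data; the question's actual df is not shown.
toy <- data.frame(
  A = c(1, 3, 10, 20),
  B = c("g1", "g1", "g2", "g2"),
  C = c(1, 3, 1, 1)
)

# Each call to the function receives one group's rows, so x$A and x$C
# always have matching lengths.
res <- by(toy[c("A", "C")], toy$B, function(x) weighted.mean(x$A, x$C))

# By hand:
# g1: (1*1 + 3*3) / (1 + 3)   = 2.5
# g2: (10*1 + 20*1) / (1 + 1) = 15
res
```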
Here's a simple and efficient solution using data.table:
library(data.table)
setDT(df)[, .(WM = weighted.mean(A, C)), B]
# B WM
# 1: hi 25.44444
# 2: gb 4.00000
# 3: yo 3.00000
Or using a split() and sapply() combination from base R:
sapply(split(df, df$B), function(x) weighted.mean(x$A, x$C))
# gb hi yo
# 4.00000 25.44444 3.00000
Or with dplyr:
library(dplyr)
df %>%
group_by(B) %>%
summarise(WM = weighted.mean(A, C))
# Source: local data frame [3 x 2]
#
# B WM
# 1 gb 4.00000
# 2 hi 25.44444
# 3 yo 3.00000
Or simply recreate the calculation used by weighted.mean():
by(df, df$B, function(x) with(x, sum(A * C) / sum(C)))
# df$B: gb
# [1] 4
# ------------------------------------------------------------
# df$B: hi
# [1] 25.44444
# ------------------------------------------------------------
# df$B: yo
# [1] 3
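As a quick sanity check that the hand-rolled formula matches weighted.mean(), here is a small sketch on made-up vectors (`a` and `w` are hypothetical and not taken from the question's data). It assumes there are no missing values; with NAs the two can differ, because weighted.mean() has an na.rm argument and the manual version does not handle them.

```r
# Hypothetical values and weights, for illustration only.
a <- c(2, 4, 9)
w <- c(1, 1, 2)

# Manual weighted mean: sum of value * weight over sum of weights.
manual <- sum(a * w) / sum(w)   # (2 + 4 + 18) / 4 = 6

# Built-in equivalent.
builtin <- weighted.mean(a, w)

all.equal(manual, builtin)
```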