I am looking for a way to compute a weighted sum of several variables by group with data.table. I hope the example is clear enough.
require(data.table)
dt
Copying @Roland's excellent answer:
print(dt[, lapply(.SD, function(x, w) sum(x*w), w=w), by=gr][, w := NULL])
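For a self-contained run of the call above, here is a minimal sketch. The table is a hypothetical stand-in, not the asker's original data: the column names gr, w and V1 to V4 and the constant weight of 2 are assumptions, chosen so that the totals agree with the output quoted further down in this thread.

```r
library(data.table)

# Hypothetical example data (an assumption, not the original table):
# two groups of five rows, a constant weight, four value columns.
dt <- data.table(
  gr = rep(1:2, each = 5),
  w  = 2,
  V1 = 0:9, V2 = 10:19, V3 = 20:29, V4 = 30:39
)

# .SD holds every non-grouping column, including w, so w is passed to the
# function explicitly and the meaningless sum(w * w) column is dropped afterwards.
res <- dt[, lapply(.SD, function(x, w) sum(x * w), w = w), by = gr][, w := NULL]
print(res)
```

Assigning the result to res before printing sidesteps the fact that `:=` returns its value invisibly.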
Following @Roland's comment, it is indeed faster to apply the operation to all columns and then simply remove the unwanted ones (as long as the operation itself is not time-consuming, which is the case here).
dt[, {lapply(.SD, function(x) sum(x*w))}, by=gr][, w := NULL][]
For some reason, w does not seem to be found when I don't use {}. No idea why, though.
(Subsetting .SD within each group can be costly if there are many groups.)
You can do this without using .SDcols, by dropping the w column from .SD as you pass it to lapply, as follows:
dt[, lapply(.SD[, -1, with=FALSE], function(x) sum(x*w)), by=gr]
# gr V1 V2 V3 V4
# 1: 1 20 120 220 320
# 2: 2 70 170 270 370
.SDcols builds .SD without the w column, so it is not possible to multiply by w: it no longer exists within the scope of the .SD environment.
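If you would rather keep .SDcols, one possible workaround is to multiply the value columns by w on a copy first and then take plain sums by group, so that j no longer needs w at all. This is a different technique from the answers above, sketched under assumptions: the data, the cols vector and the names wdt and res2 are all hypothetical.

```r
library(data.table)

# Hypothetical example data (an assumption, not the original table).
dt <- data.table(
  gr = rep(1:2, each = 5),
  w  = 2,
  V1 = 0:9, V2 = 10:19, V3 = 20:29, V4 = 30:39
)

cols <- c("V1", "V2", "V3", "V4")  # value columns to be weighted

# Work on a copy so the original table is left untouched.
wdt <- copy(dt)
for (col in cols) {
  # Replace each value column by its weighted version, by reference.
  set(wdt, j = col, value = wdt[[col]] * wdt[["w"]])
}

# w is no longer referenced inside j, so .SDcols can restrict .SD to cols.
res2 <- wdt[, lapply(.SD, sum), by = gr, .SDcols = cols]
print(res2)
```

Whether this beats subsetting .SD depends on the data: it trades one full copy of the table for simpler per-group work.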