I have a data.table named out, like this (in reality it is much larger):
out <- code weights group
1: 2 0.387 1
2:
Using CJ (cross join), you can add the missing combinations:
library(data.table)
setkey(out, code, group)
out[CJ(code, group, unique = TRUE)
][, lapply(.SD, sum), by = .(code, group)
][is.na(weights), weights := 0]
gives:
   code group weights
1:    1     1   0.399
2:    1     2   0.212
3:    1     3   0.474
4:    2     1   1.997
5:    2     2   0.373
6:    2     3   0.569
7:    3     1   0.000
8:    3     2   1.323
9:    3     3   0.316
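To see what the CJ step contributes, here is a minimal sketch on a made-up toy table (the toy data and the names toy, grid, sums and res are illustrative assumptions, not the asker's data). It is a slight variant of the chain above: it sums first and then joins onto the full grid, which should give the same result.

```r
library(data.table)

# Hypothetical toy data: several (code, group) pairs never occur, e.g. code 3 / group 1
toy <- data.table(
  code    = c(1, 1, 2, 3),
  group   = c(1, 1, 1, 2),
  weights = c(0.1, 0.2, 0.5, 0.7)
)

# CJ() builds every combination of the supplied values, sorted by code then group
grid <- CJ(code = unique(toy$code), group = unique(toy$group))

# Sum the weights for each pair that is actually present
sums <- toy[, .(weights = sum(weights)), by = .(code, group)]

# Join the sums onto the full grid; pairs absent from toy come back as NA
res <- sums[grid, on = .(code, group)]

# Replace the NA weights of the missing pairs with 0
res[is.na(weights), weights := 0]
res
```

Aggregating before the join keeps the joined table at one row per combination, which may matter when the real table is large.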
Or with xtabs, as @alexis_laz showed in the comments:
xtabs(weights ~ group + code, out)
which gives:
     code
group     1     2     3
    1 0.399 1.997 0.000
    2 0.212 0.373 1.323
    3 0.474 0.569 0.316
If you want to get this output in a long-form dataframe, you can wrap the xtabs code in the melt function of the reshape2 (or data.table) package:
library(reshape2)
res <- melt(xtabs(weights ~ group + code, out))
which gives:
> class(res)
[1] "data.frame"
> res
  group code value
1     1    1 0.399
2     2    1 0.212
3     3    1 0.474
4     1    2 1.997
5     2    2 0.373
6     3    2 0.569
7     1    3 0.000
8     2    3 1.323
9     3    3 0.316
You could also do this with a combination of dplyr and tidyr:
library(dplyr)
library(tidyr)
out %>%
complete(code, group, fill = list(weights=0)) %>%
group_by(code, group) %>%
summarise(weights = sum(weights))
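No output is shown for this variant, so here is a small sketch on made-up data (the tibble toy and the name res_tidy are assumptions for illustration only). It also names the summary column and drops the grouping afterwards, which assumes dplyr 1.0.0 or later for the .groups argument.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: several (code, group) pairs never occur
toy <- tibble(
  code    = c(1, 1, 2, 3),
  group   = c(1, 1, 1, 2),
  weights = c(0.1, 0.2, 0.5, 0.7)
)

res_tidy <- toy %>%
  # Add a row for every missing code/group pair, with weights set to 0
  complete(code, group, fill = list(weights = 0)) %>%
  group_by(code, group) %>%
  # Sum per pair and return an ungrouped tibble
  summarise(weights = sum(weights), .groups = "drop")

res_tidy
```

The result should have one row per code/group combination, with 0 for the pairs that were not in the input.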
I had a similar problem, and CJ did not work for some reason. A relatively simple solution I ended up using is to call dcast first and then melt (similar to the xtabs solution above). This also conveniently lets you specify the fill value for the missing combinations.
sum.dt <- dcast(out, code ~ group, value.var = 'weights',
fun.aggregate = sum, fill = 0)
sum.dt <- melt(sum.dt, id.vars = 'code', variable.name = 'group')
This gives:
> sum.dt
   code group value
1:    1     1 0.399
2:    2     1 1.997
3:    3     1 0.000
4:    1     2 0.212
5:    2     2 0.373
6:    3     2 1.322
7:    1     3 0.474
8:    2     3 0.569
9:    3     3 0.316
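One thing to watch with the dcast/melt round trip: data.table's melt stores the former column names as a factor by default, so group is no longer numeric afterwards. A minimal sketch on made-up data (toy, wide and long are hypothetical names, not from the answer above) that converts it back:

```r
library(data.table)

# Hypothetical toy data: several (code, group) pairs never occur
toy <- data.table(
  code    = c(1, 1, 2, 3),
  group   = c(1, 1, 1, 2),
  weights = c(0.1, 0.2, 0.5, 0.7)
)

# Wide table: one column per group, missing combinations filled with 0
wide <- dcast(toy, code ~ group, value.var = "weights",
              fun.aggregate = sum, fill = 0)

# Back to long form; value.name keeps the original column name
long <- melt(wide, id.vars = "code", variable.name = "group",
             value.name = "weights")

# group is a factor here; go through character to recover the numeric labels
long[, group := as.numeric(as.character(group))]
long
```

Converting through as.character matters: calling as.numeric directly on a factor returns the internal level codes, which only coincide with the labels by accident.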