I asked the same question a few days ago (click here), but didn't mention that a solution using data.table would be appreciated.
The "aggregate-soluti
Going to wide format is currently a little awkward in data.table, but I think this works:
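library(data.table)
dt = data.table(x=c("p1","p1","p2"),y=c("a","b","a"),z=c(14,14,16))
setkey(dt, x, y)
# join on every (x, y) combination and count the matches, then spread per x
# (data.table 1.9.4 and later need by = .EACHI in the first call to get per-row counts)
dt[CJ(unique(x), unique(y)), list(.N, z)][,
   setNames(as.list(c(N, z[!is.na(z)][1])), c(y, 'z')), by = x]
#     x a b  z
# 1: p1 1 1 14
# 2: p2 1 0 16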
The CJ part joins by all combinations of unique x and y, and then in that join there is a hidden by-without-by that's used to compute counts via .N. Once you have those, it's just a matter of putting them horizontally for each x together with any non-NA z (I chose the first), and that's accomplished using as.list. Finally, setNames sets the column names correctly.
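To see what each step contributes, here is a rough sketch that splits the chain in two. It assumes data.table 1.9.4 or later, where the per-row grouping of the join has to be requested explicitly with by = .EACHI instead of happening implicitly. The names counts and wide are made up for illustration, and CJ is given explicit column names only to keep the intermediate table easy to read.

```r
library(data.table)

dt <- data.table(x = c("p1", "p1", "p2"),
                 y = c("a", "b", "a"),
                 z = c(14, 14, 16))
setkey(dt, x, y)

# Step 1: join against every (x, y) combination.
# by = .EACHI evaluates j once per row of the CJ table, so .N is the
# number of matching rows in dt (0 for the pair (p2, b), which has none).
counts <- dt[CJ(x = unique(dt$x), y = unique(dt$y)),
             list(.N, z), by = .EACHI]
print(counts)

# Step 2: for each x, lay the counts out horizontally, append the first
# non-NA z, and name the columns after the y values plus 'z'.
wide <- counts[, setNames(as.list(c(N, z[!is.na(z)][1])), c(y, "z")),
               by = x]
print(wide)
```

Printing counts shows one row per (x, y) pair with its N and z, which is the intermediate table the second step reshapes.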