With a 2-column data.table, I'd like to summarize the pairwise relationships in column 1 by summing the number of shared elements in column 2. In other words,
If you can split your Y's into groups that don't have a large intersection of X's, you could do the computation by those groups first, resulting in a smaller intermediate table:
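d[, grp := Y <= 3] # this particular split works best for the OP's data
# self-join on Y within each group and count the pairs, then sum the counts across groups
d[, .SD[.SD, on = "Y", allow.cartesian = TRUE][, .N, by = .(X, i.X)], by = grp][,
  .(N = sum(N)), by = .(X, i.X)]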
The intermediate table above has only 16 rows, as opposed to 26. Unfortunately, I can't think of an easy way to create such a grouping automatically.
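As a sanity check, here is a minimal sketch comparing the grouped computation with a plain self-join on Y. The toy data below is hypothetical (it is not the OP's table), and the `Y <= 3` split is just an arbitrary choice for illustration:

```r
library(data.table)

# Hypothetical toy data (not the OP's): X is the item, Y an element it contains
d <- data.table(
  X = c("a", "a", "b", "b", "c", "c", "c"),
  Y = c(1L, 2L, 1L, 4L, 2L, 4L, 5L)
)

# Direct approach: one self-join on Y, then count rows per (X, i.X) pair
direct <- d[d, on = "Y", allow.cartesian = TRUE][, .N, by = .(X, i.X)]

# Grouped approach: split the Y values, count pairs within each group,
# then add the per-group counts together
d[, grp := Y <= 3]
grouped <- d[, .SD[.SD, on = "Y", allow.cartesian = TRUE][, .N, by = .(X, i.X)],
             by = grp][, .(N = sum(N)), by = .(X, i.X)]

# Sort both results the same way so they can be compared row by row
setkey(direct, X, i.X)
setkey(grouped, X, i.X)
identical(direct$N, grouped$N)
```

Since every Y value falls into exactly one group, each shared element is counted once in either approach, so the two results should agree for any split. The split only changes how large the intermediate table gets.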