Question
I have a data.table with many individuals (with ids) in many groups. Within each group, I would like to find every combination of ids (every pair of individuals). I know how to do this with a split-apply-combine approach, but I am hoping that a data.table approach would be faster.
Sample data:
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
Split-Apply-Combine Method:
datS <- split(dat, f=dat$groups)
datSc <- lapply(datS, function(x){ as.data.table(t(combn(x$ids, 2)))})
rbindlist(datSc)
head(rbindlist(datSc))
V1 V2
1: 2 5
2: 2 10
3: 2 19
4: 5 10
5: 5 19
6: 10 19
My best data.table attempt produces a single column, not two columns with all the possible combinations:
dat[, combn(x=ids, m=2), by=groups]
Thanks in advance.
Answer 1:
You need to convert the result of t(combn()), which is a matrix, to a data.table or data.frame, so this should work:
library(data.table)
set.seed(10)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
dt <- dat[, as.data.table(t(combn(ids, 2))), .(groups)]
head(dt)
groups V1 V2
1: C 1 3
2: C 1 5
3: C 1 7
4: C 1 10
5: C 1 13
6: C 1 14
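One caveat with the approach above: combn() treats a single positive integer n as seq_len(n), so a group that contains only one id would produce pairs that do not exist in the data. A minimal sketch of a guarded variant follows; the helper name pairs_dt and the column names id1 and id2 are assumptions made for illustration, not part of the original answer.

```r
library(data.table)

set.seed(10)
dat <- data.table(ids = 1:20, groups = sample(c("A", "B", "C"), 20, replace = TRUE))

# Hypothetical helper: all pairs of x as a two-column data.table.
# Groups with fewer than two ids return zero rows, because combn()
# would otherwise expand a single positive integer n to seq_len(n).
pairs_dt <- function(x) {
  if (length(x) < 2L) return(data.table(id1 = x[0L], id2 = x[0L]))
  m <- combn(x, 2)
  data.table(id1 = m[1L, ], id2 = m[2L, ])
}

# One row per within-group pair, with named columns instead of V1/V2
pairs <- dat[, pairs_dt(ids), by = groups]
head(pairs)
```

Groups that yield zero rows simply drop out of the result, so the output contains one row per valid pair.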
Answer 2:
library(data.table)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
ind<-unique(dat$groups)
lapply(seq_along(ind), function(i) combn(dat$ids[dat$groups == ind[i]], 2))
You can then change the list to any other type of format you might need.
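As one possible conversion, the list of 2 x n matrices can be stacked into a single data.table with a group column. This is a sketch under assumptions: the names res and out are illustrative, and the sample groups are assigned deterministically with rep() rather than sample(), so that every group has at least two ids (combn() would misbehave on a single id).

```r
library(data.table)

# Deterministic sample data (an assumption for this sketch): 7, 7, and 6 ids per group
dat <- data.table(ids = 1:20, groups = rep(c("A", "B", "C"), length.out = 20))
ind <- unique(dat$groups)

# One 2 x n matrix of pairs per group, named by group label
res <- lapply(ind, function(g) combn(dat$ids[dat$groups == g], 2))
names(res) <- ind

# Transpose each matrix to n x 2, convert it, and stack the pieces;
# idcol takes the list names and stores them in a "groups" column
out <- rbindlist(lapply(res, function(m) as.data.table(t(m))), idcol = "groups")
head(out)
```

The result has the same shape as the output in Answer 1: a groups column followed by V1 and V2.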
Source: https://stackoverflow.com/questions/37333996/generate-all-id-pairs-by-group-with-data-table-in-r