Question
I have 4 lists
a <- list(1,2,3,4)
b <- list(5,6,7,8)
c <- list(7,9,0)
d <- list(12,14)
I would like to know which of the lists have elements in common. In this example, lists b and c have the element 7 in common.
A brute-force approach would be to take every pair of lists and compute the intersection. Is there a more efficient way to do this in R?
Another approach would be to combine all the lists into a single list and find the duplicates. A mapping function could then indicate which original lists those duplicates came from, but I am not sure how to do that. I came across the post "Find indices of duplicated rows" and was wondering whether it could be modified to identify the actual lists that contain duplicates.
I have to repeat this process for many groups of lists. Any suggestions or ideas are greatly appreciated. Thanks in advance!
Answer 1:
What about using this double sapply?
l <- list(a,b,c,d)
# Count the shared elements for every pair of sublists
sapply(seq_along(l), function(x)
  sapply(seq_along(l), function(y) length(intersect(unlist(l[[x]]), unlist(l[[y]])))))
     [,1] [,2] [,3] [,4]
[1,]    4    0    0    0
[2,]    0    4    1    0
[3,]    0    1    3    0
[4,]    0    0    0    2
Interpretation: element [1,2] of the matrix, for example, shows how many elements the first element of the list l (here the sublist a) has in common with the second list element (the sublist b).
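To turn such a matrix into an explicit list of overlapping pairs, one option is to name the sublists and read off the nonzero entries above the diagonal. This is only a sketch: the names given to the list elements and the variables m, hits, and common_pairs are illustrative and not part of the code above.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)
l <- list(a = a, b = b, c = c, d = d)  # names are an assumption, added for readable output

# Pairwise counts of shared elements; dimnames come from the list names
m <- sapply(l, function(x)
  sapply(l, function(y) length(intersect(unlist(x), unlist(y)))))

# Keep each unordered pair once by looking only above the diagonal
hits <- which(m > 0 & upper.tri(m), arr.ind = TRUE)
common_pairs <- data.frame(first  = rownames(m)[hits[, "row"]],
                           second = colnames(m)[hits[, "col"]],
                           shared = m[hits])
common_pairs
```

For the example lists this yields a single row pairing b with c, with one shared element.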
Alternatively, to see just the indices of the sublists that share a value with some other sublist:
which(sapply(seq_len(length(l)), function(x) length(intersect(l[[x]], unlist(l[-x])))) >= 1)
[1] 2 3
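The pooling idea from the question (combine everything, find the duplicated values, then map them back to their lists) can be sketched with base R's stack() and duplicated(). This is a minimal sketch under assumptions: the sublists are given names, each holds atomic values of a single type, and long, shared_values, shared, and by_value are illustrative variable names.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)
l <- list(a = a, b = b, c = c, d = d)  # names are an assumption

# One row per (value, list name) pair; unique() drops repeats inside one list
long <- unique(stack(lapply(l, unlist)))

# Values that occur in more than one list
shared_values <- unique(long$values[duplicated(long$values)])

# Map each shared value back to the lists that contain it
shared <- long[long$values %in% shared_values, ]
by_value <- split(as.character(shared$ind), shared$values)
by_value
```

For the example lists, by_value has a single entry, for the value 7, containing "b" and "c". Because the pooled values are scanned rather than every pair of lists being intersected, this may scale better when there are many lists, though that has not been measured here.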
Source: https://stackoverflow.com/questions/30406560/multiple-intersection-of-lists