I have a variable number of ArrayLists that I need to find the intersection of. A realistic cap on the number of sets of strings is probably around 35, but it could be more.
The accepted answer is just fine; as an update: since Java 8 there is a slightly more efficient way to find the intersection of two Sets.
Set<String> intersection = set1.stream()
        .filter(set2::contains)
        .collect(Collectors.toSet());
It is slightly more efficient because the original approach had to add all elements of set1 to the result and then remove them again if they weren't in set2. This approach only adds to the result set what needs to be in there.
Strictly speaking, you could do this before Java 8 as well, but without Streams the code would have been quite a bit more laborious.
If the two sets differ considerably in size, you would prefer streaming over the smaller one, since that means fewer contains() lookups.
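A minimal sketch of the "stream over the smaller set" idea. The class name SetIntersection and the method intersect are made up for illustration; only the stream/filter/collect calls come from the snippet above.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
final class SetIntersection {
    // Streams over whichever set is smaller and probes the larger one,
    // so the number of contains() calls is bounded by the smaller size.
    static <T> Set<T> intersect(Set<T> a, Set<T> b) {
        Set<T> smaller = a.size() <= b.size() ? a : b;
        Set<T> larger = (smaller == a) ? b : a;
        return smaller.stream()
                .filter(larger::contains)
                .collect(Collectors.toSet());
    }
}
```

This only pays off when contains() on the larger set is cheap, which is the case for a HashSet.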
Set.retainAll() is how you find the intersection of two sets. If you use HashSet, then converting your ArrayLists to Sets and using retainAll() in a loop over all of them is actually O(n) in the total number of elements, given HashSet's expected constant-time lookups.
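Roughly, the retainAll() loop could look like the sketch below. The class and method names (ListIntersection, intersectAll) are placeholders of mine, and I am assuming the input is a list of string lists as in the question.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of any library.
final class ListIntersection {
    // Copies the first list into a HashSet, then keeps only the elements
    // that also occur in every other list.
    static Set<String> intersectAll(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new HashSet<>();
        }
        Set<String> result = new HashSet<>(lists.get(0));
        // Stop early once nothing is left to intersect.
        for (int i = 1; i < lists.size() && !result.isEmpty(); i++) {
            // Wrapping the list in a HashSet keeps each lookup O(1) on average.
            result.retainAll(new HashSet<>(lists.get(i)));
        }
        return result;
    }
}
```

Passing the raw ArrayList to retainAll() would also work, but each lookup would then be a linear scan of that list.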
One more idea - if your arrays/sets are different sizes, it makes sense to begin with the smallest.
Sort them (n lg n each) and then do binary searches (lg n per lookup).
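One way to read the sort-and-binary-search suggestion, as a sketch only: sort a copy of every list, then look up each element of the first list in all the others with Collections.binarySearch. The names SortedIntersection and intersectSorted are invented here, and the code assumes the lists contain no null elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of any library.
final class SortedIntersection {
    // Returns the common elements in sorted order, without duplicates.
    static List<String> intersectSorted(List<List<String>> lists) {
        List<String> result = new ArrayList<>();
        if (lists.isEmpty()) {
            return result;
        }
        // Sort copies so the caller's lists are left untouched.
        List<List<String>> sorted = new ArrayList<>();
        for (List<String> list : lists) {
            List<String> copy = new ArrayList<>(list);
            Collections.sort(copy);
            sorted.add(copy);
        }
        String previous = null;
        for (String candidate : sorted.get(0)) {
            if (candidate.equals(previous)) {
                continue; // duplicates are adjacent after sorting
            }
            previous = candidate;
            boolean inAll = true;
            for (int i = 1; i < sorted.size() && inAll; i++) {
                // binarySearch returns a non-negative index when the key is found.
                inAll = Collections.binarySearch(sorted.get(i), candidate) >= 0;
            }
            if (inAll) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```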
There is also the static method Sets.intersection(set1, set2) in Google Guava that returns an unmodifiable view of the intersection of two sets.
You can use a single HashSet. Its add() method returns false when the object is already in the set. Adding the objects from the lists and counting the false return values per object will give you the union in the set plus data for a histogram (the objects whose count + 1 equals the number of lists are your intersection). If you put the counts into a TreeSet, you can detect an empty intersection early.
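A sketch of the counting idea, with a few assumptions of my own: the per-object counts are kept in a HashMap, each list is de-duplicated first so that a repeat inside one list is not counted as a hit from another list, and the TreeSet early-exit is left out. CountingIntersection and intersect are made-up names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of any library.
final class CountingIntersection {
    static Set<String> intersect(List<List<String>> lists) {
        Set<String> union = new HashSet<>();
        // How many times add() returned false for each object.
        Map<String, Integer> rejected = new HashMap<>();
        for (List<String> list : lists) {
            // Assumption: de-duplicate per list so each list counts at most once.
            for (String s : new HashSet<>(list)) {
                if (!union.add(s)) {
                    rejected.merge(s, 1, Integer::sum);
                }
            }
        }
        // An object seen in every list was accepted once and rejected
        // (lists.size() - 1) times.
        Set<String> intersection = new HashSet<>();
        for (String s : union) {
            if (rejected.getOrDefault(s, 0) + 1 == lists.size()) {
                intersection.add(s);
            }
        }
        return intersection;
    }
}
```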