Question
How can we compute the intersection of 5 to 7 sets? Suppose each set contains some number of elements. Please help me create an algorithm for this, and tell me what the complexity of the process would be.
Answer 1:
A straightforward method:
    I = S_1;
    For each set s in S_2 ... S_N:
        For each element ei in I:
            if ei not in s
                remove ei from I
            endif
        endfor
    endfor
With this, the complexity is O(m^2 * N) if each set has m elements and there are N sets. If the sets are sorted, you can get O(m log(m) * N) with binary search, or even O(m * N) by advancing two iterators in step through each pair of sorted sets.
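The two-iterator variant for sorted input can be sketched as follows. This is a minimal Python illustration, not part of the original answer: the function names `intersect_sorted` and `intersect_all_sorted` are made up here, and it assumes each set is given as a sorted list with no repeated elements.

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists of distinct items using two advancing indices."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])   # common element: keep it and advance both sides
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1             # a[i] cannot appear later in b, skip it
        else:
            j += 1             # b[j] cannot appear later in a, skip it
    return out


def intersect_all_sorted(sorted_lists):
    """Fold the pairwise intersection over S_1 ... S_N (hypothetical helper)."""
    if not sorted_lists:
        return []
    result = list(sorted_lists[0])
    for s in sorted_lists[1:]:
        result = intersect_sorted(result, s)
        if not result:         # intersection already empty, nothing left to check
            break
    return result
```

Each pairwise pass touches every element at most once, so N sets of m elements cost on the order of m * N steps in total.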
Answer 2:
Assuming that the elements of the sets can be hashed, and that you have some Hash-Key facility like dictionaries (or can create your own, which is not hard):
    // Requires: using System.Collections.Generic; using System.Linq;
    static List<T> IntersectAll<T>(List<HashSet<T>> sets)   // your list of sets to intersect
    {
        int size = sets.Sum(s => s.Count);                   // capacity hint for the hash
        var Tally = new Dictionary<T, int>(size);
        var Output = new List<T>();

        // Add all elements to the Tally hash
        foreach (var set in sets)
        {
            foreach (var e in set)
            {
                if (Tally.ContainsKey(e))
                    Tally[e]++;
                else
                    Tally.Add(e, 1);
            }
        }

        // Now, find the Tally entries that match the number of sets
        foreach (var kvp in Tally)
        {
            if (kvp.Value == sets.Count)
                Output.Add(kvp.Key);   // add the key to the output list
        }
        return Output;
    }
This has expected run-time complexity O(n), where n is the total number of elements across all sets, assuming hash lookups and insertions take constant time on average.
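For comparison, the same tally idea might look like this in Python. It is only a sketch: the name `intersect_by_tally` is invented for illustration, and it assumes the elements are hashable.

```python
from collections import Counter


def intersect_by_tally(sets):
    """Return the elements that occur in every input set, by counting occurrences."""
    tally = Counter()
    for s in sets:
        # set(s) guards against duplicates if a plain list is passed in,
        # so each input contributes at most 1 to an element's count.
        tally.update(set(s))
    n = len(sets)
    # An element is in the intersection exactly when every set contributed to it.
    return {e for e, count in tally.items() if count == n}
```

For example, `intersect_by_tally([{1, 2, 3}, {2, 3, 4}, {3, 2, 9}])` yields `{2, 3}`. The trade-off is memory: the tally holds every distinct element from all sets, not just the candidates that survive.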
Answer 3:
I'll assume for the moment that the sets are represented as lists and that they start out unsorted.
(Edited to conform my symbols to those of @perreal)
Given a total of m*N items in the N sets, one could concatenate the sets into a single list (m*N operations), sort the list (m*N log(m*N) operations), and then run through the sorted list, retaining any item that has exactly N copies (another m*N operations), giving a total (I think) of m*N (2 + log(m*N)) operations in any case.
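The concatenate, sort, and count steps above could be sketched in Python like this. The function name `intersect_by_sorting` is hypothetical, and the sketch assumes the elements are mutually comparable and that no single list contains the same element twice (otherwise a run of N copies would not imply membership in all N sets).

```python
from itertools import groupby


def intersect_by_sorting(lists):
    """Concatenate all lists, sort, and keep items that appear exactly N times."""
    n = len(lists)
    combined = [x for lst in lists for x in lst]   # concatenation: m*N items
    combined.sort()                                # sort: equal items become adjacent
    # groupby yields one run per distinct value; a run of length N means
    # the value came from every one of the N lists.
    return [key for key, run in groupby(combined) if sum(1 for _ in run) == n]
```

The result comes back in sorted order as a side effect of the sort.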
By comparison, assuming each set has the same number of items m, I think @perreal's solution would take a maximum of m^2*N operations if the sets were all identical. That would be more than my algorithm's m*N (2 + log(m*N)) operations for large values of m*N. However, in the best case, @perreal's solution would require as few as 2m*N operations (if the first and second sets tested had no intersection).
@perreal's solution would also require fewer operations for cases where the intersection was small if the sets were compared in increasing order of size, with S_1 being the smallest set.
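Ordering the sets from smallest to largest before filtering might be sketched as below. This is an illustration only: `intersect_smallest_first` is a made-up name, and unlike the list-based discussion here it relies on Python's hashed `set` type for membership tests, so the operation counts above do not carry over directly.

```python
def intersect_smallest_first(sets):
    """Filter the smallest set against the others, in increasing order of size."""
    if not sets:
        return set()
    ordered = sorted(sets, key=len)    # smallest set plays the role of S_1
    result = set(ordered[0])           # candidates can never exceed the smallest set
    for s in ordered[1:]:
        result = {e for e in result if e in s}
        if not result:                 # empty intersection: stop early
            break
    return result
```

Starting from the smallest set bounds the candidate list by that set's size, and the early exit skips the remaining sets once nothing is left to test.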
If the sets started out as sorted lists, both solutions would be faster because there would be no need for the initial sort for my algorithm and @perreal's algorithm could decide that an element was not in the set without having to search through the entire set.
Source: https://stackoverflow.com/questions/15264544/algorithm-model-for-intersection-of-several-sets