Algorithm model for Intersection of several sets

Posted by Deadly on 2020-01-13 18:06:27

Question


How can I compute the intersection of 5 to 7 sets, each containing a number of elements? Please help me devise an algorithm for this, and explain what its complexity would be.


Answer 1:


A straightforward method:

I = S_1;
For each set s in S_2 ... S_N:
    For each element ei in I:
      if ei not in s
        remove ei from I
      endif
    endfor
endfor
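
The pseudocode above could be sketched in Python roughly as follows. This is only an illustration: the function name `intersect_naive` is made up here, and the sets are assumed to be plain unsorted lists so that each membership test is a linear scan. It builds a new list on each pass instead of removing items from `I` while iterating over it, which is unsafe in many languages.

```python
def intersect_naive(sets):
    """Intersect a list of unsorted lists by repeated linear membership tests."""
    if not sets:
        return []
    result = list(sets[0])              # I = S_1
    for s in sets[1:]:                  # each of S_2 ... S_N
        # Keep only the elements of the running result that also occur in s.
        # `e in s` on a list is a linear scan, hence the quadratic cost.
        result = [e for e in result if e in s]
    return result


print(intersect_naive([[1, 2, 3, 4], [2, 4, 6], [4, 2, 9]]))
```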

With this approach the complexity is O(m^2 * N) if each set has m elements, there are N sets, and membership is tested by linear search. If the sets are sorted, you can get O(m log(m) * N) with binary search, or even O(m * N) by advancing two iterators through each pair of sorted sets.
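
For the sorted case, the two-iterator idea might look like the sketch below. It assumes every input is a sorted list without duplicates; the names `intersect_sorted` and `intersect_all_sorted` are hypothetical helpers, not part of the original answer.

```python
def intersect_sorted(a, b):
    """Intersect two sorted, duplicate-free lists with two advancing indices."""
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])    # common element: keep it, advance both sides
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1              # a[i] cannot appear later in b
        else:
            j += 1              # b[j] cannot appear later in a
    return out


def intersect_all_sorted(sorted_sets):
    """Fold the pairwise merge over all N sorted lists."""
    if not sorted_sets:
        return []
    result = sorted_sets[0]
    for s in sorted_sets[1:]:
        result = intersect_sorted(result, s)
    return result
```

Each pairwise merge touches every element at most once, so the running result (never longer than m) is merged against N - 1 lists of length m, which is where the O(m * N) bound comes from.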




Answer 2:


Assuming that the elements of the sets can be hashed, and that you have some Hash-Key facility like dictionaries (or can create your own, which is not hard):

using System.Collections.Generic;
using System.Linq;

List<HashSet<T>> sets;    // your list of sets to intersect (T is the element type)
List<T> Output = new List<T>();

int size = sets.Sum(s => s.Count);  // total element count, used as the initial hash capacity
Dictionary<T, int> Tally = new Dictionary<T, int>(size);

// Add all elements to the Tally hash
foreach (var set in sets)
{
    foreach (var e in set)
    {
        if (Tally.ContainsKey(e))
            Tally[e]++;
        else
            Tally.Add(e, 1);
    }
}

// Now, find the Tally entries whose count matches the number of sets
foreach (var kvp in Tally)
{
    if (kvp.Value == sets.Count)
        Output.Add(kvp.Key);   // add the key to the output list/set
}

This has expected run-time complexity O(n), where n is the total number of elements across all sets, assuming each hash lookup and insert takes O(1) time on average.
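
For comparison, the same tally idea can be sketched in Python with `collections.Counter`. The function name `intersect_by_tally` is hypothetical. The approach only works if each element is counted at most once per set, so the sketch converts every input to a `set` first in case a list with duplicates is passed in.

```python
from collections import Counter


def intersect_by_tally(sets):
    """Return the elements that occur in every input collection."""
    tally = Counter()
    for s in sets:
        # set(s) ensures an element contributes at most 1 per input.
        tally.update(set(s))
    # An element is in the intersection if it was seen in all inputs.
    return {e for e, count in tally.items() if count == len(sets)}


print(intersect_by_tally([[1, 2, 2, 3], [2, 3, 4], [3, 2, 7]]))
```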




Answer 3:


I'll assume for the moment that the sets are represented as lists and that they start out unsorted.

(Edited to conform my symbols to those of @perreal)

Given a total of m*N items in the N sets, one could concatenate the sets into a single list (m*N operations), sort the list ((m*N) log(m*N) operations), and then run through the sorted list, retaining any item that has exactly N copies (another m*N operations). That gives a total (I think) of m*N (2 + log(m*N)) operations in every case.
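
A minimal Python sketch of this concatenate, sort, and count approach is shown here. It assumes the elements are mutually comparable and that no set contains duplicates (otherwise an item could reach N copies without being in every set); the name `intersect_by_sorting` is made up for illustration.

```python
from itertools import chain, groupby


def intersect_by_sorting(sets):
    """Concatenate all sets, sort, and keep items that occur exactly N times."""
    n = len(sets)
    merged = sorted(chain.from_iterable(sets))   # concatenate, then sort
    # groupby clusters equal neighbours in the sorted list, so the size of
    # each group is the number of sets the item appeared in.
    return [key for key, group in groupby(merged) if len(list(group)) == n]


print(intersect_by_sorting([[3, 1, 2], [2, 3, 5], [9, 3, 2]]))
```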

By comparison, assuming each set has the same number of items m, I think @perreal's solution would need at most m^2*N operations, which happens when the sets are all identical. That is more than my algorithm's m*N (2 + log(m*N)) operations whenever m > 2 + log(m*N), which holds for large m. However, in the best case, @perreal's solution would require as few as 2m*N operations (if the first and second sets tested had no intersection).

@perreal's solution would also require fewer operations for cases where the intersection was small if the sets were compared in increasing order of size, with S_1 being the smallest set.
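
The ordering idea can be illustrated with the sketch below. Note that it swaps in Python hash sets rather than the list-based search discussed above, so the operation counts differ; the point is only to show processing sets from smallest to largest and stopping as soon as the running intersection is empty. The name `intersect_smallest_first` is hypothetical.

```python
def intersect_smallest_first(sets):
    """Intersect collections in increasing order of size, stopping early if empty."""
    if not sets:
        return set()
    # Convert to hash sets and visit the smallest first, so the running
    # result can never be larger than the smallest input.
    ordered = sorted((set(s) for s in sets), key=len)
    result = ordered[0]
    for s in ordered[1:]:
        result = result & s
        if not result:
            break               # nothing left to intersect
    return result


print(intersect_smallest_first([[1, 2, 3, 4, 5, 6], [2, 3], [3, 2, 9, 10]]))
```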

If the sets started out as sorted lists, both solutions would be faster because there would be no need for the initial sort for my algorithm and @perreal's algorithm could decide that an element was not in the set without having to search through the entire set.
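
When the lists are already sorted, one way to test membership without scanning the whole list is binary search. The sketch below uses Python's standard `bisect` module; `contains_sorted` and `intersect_binary_search` are hypothetical names, and the inputs are assumed to be sorted lists.

```python
from bisect import bisect_left


def contains_sorted(sorted_list, x):
    """Binary-search membership test on a sorted list, O(log m)."""
    i = bisect_left(sorted_list, x)
    return i < len(sorted_list) and sorted_list[i] == x


def intersect_binary_search(sorted_sets):
    """Filter the first list by binary-searching each remaining sorted list."""
    if not sorted_sets:
        return []
    result = list(sorted_sets[0])
    for s in sorted_sets[1:]:
        result = [e for e in result if contains_sorted(s, e)]
    return result


print(intersect_binary_search([[1, 3, 5, 7], [3, 4, 5, 6, 7], [0, 5, 7, 11]]))
```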



Source: https://stackoverflow.com/questions/15264544/algorithm-model-for-intersection-of-several-sets
