Intersection complexity

小蘑菇 2020-12-09 10:33

In Python you can get the intersection of two sets like this:

>>> s1 = {1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> s2 = {0, 3, 5, 6, 10}
>>> s1 & s2

3 Answers
  • 2020-12-09 10:52

    Intersecting two sets of sizes m and n can be done in O(max{m,n} * log(min{m,n})) time as follows. Assume m << n.

    1. Represent the two sets as lists/arrays (something sortable).
    2. Sort the **smaller** list/array (cost: m*logm).
    3. Repeat until all elements in the bigger list have been checked:
        3.1 Sort the next **m** items of the bigger list (cost: m*logm).
        3.2 In a single pass, compare the smaller list with the m items you just sorted and keep the ones that appear in both (cost: m).
    4. Return the collected elements as the new set.
    

    The loop in step 3 runs n/m times and each iteration takes O(m*logm), so the total time complexity is O(n*logm) for m << n.
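The steps above can be sketched in Python roughly as follows. This is only an illustration of the sort-and-merge idea, not how Python's built-in `set` works; the function name `chunked_sort_intersection` is made up for this example, and it assumes the elements can be ordered with `<`.

```python
def chunked_sort_intersection(a, b):
    """Intersect two collections using sorting and comparisons only.

    Hypothetical helper illustrating the steps above; assumes the
    elements are mutually comparable with < and ==.
    """
    # Step 1: represent both inputs as lists, smaller one first.
    small, big = (list(a), list(b)) if len(a) <= len(b) else (list(b), list(a))
    result = set()
    m = len(small)
    if m == 0:
        return result

    # Step 2: sort the smaller list once, O(m log m).
    small.sort()

    # Step 3: walk the bigger list in chunks of m items (about n/m chunks).
    for start in range(0, len(big), m):
        # Step 3.1: sort the next m items, O(m log m).
        chunk = sorted(big[start:start + m])

        # Step 3.2: one merge-style pass over both sorted lists, O(m).
        i = j = 0
        while i < m and j < len(chunk):
            if small[i] == chunk[j]:
                result.add(small[i])
                i += 1
                j += 1
            elif small[i] < chunk[j]:
                i += 1
            else:
                j += 1

    # Step 4: return the collected elements.
    return result


s1 = {1, 2, 3, 4, 5, 6, 7, 8, 9}
s2 = {0, 3, 5, 6, 10}
common = chunked_sort_intersection(s1, s2)
```

With the two sets from the question, `common` holds the same elements as `s1 & s2`.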

    I think that's the best bound achievable with a sorting-based approach.

  • 2020-12-09 10:53

    The intersection algorithm always runs in O(min(len(s1), len(s2))) time.

    In pure Python, it looks like this:

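        def intersection(self, other):
            # Iterate over the smaller set and probe the larger one.
            if len(self) <= len(other):
                little, big = self, other
            else:
                little, big = other, self
            result = set()
            for elem in little:
                # Each membership test is a hash-table lookup.
                if elem in big:
                    result.add(elem)
            return result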
    

    [Answer to the question in the additional edit] The data structure behind sets is a hash table.
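One way to see that the work follows the smaller set is to count membership tests. The sketch below is an illustration only: `CountingSet` is a hypothetical helper written for this example, and `intersection` restates the pure-Python loop above in compact form.

```python
class CountingSet(set):
    """Hypothetical helper: a set that counts `in` tests made against it."""

    def __init__(self, *args):
        super().__init__(*args)
        self.lookups = 0

    def __contains__(self, item):
        self.lookups += 1
        return super().__contains__(item)


def intersection(a, b):
    # Same idea as the loop above: iterate the smaller, probe the larger.
    little, big = (a, b) if len(a) <= len(b) else (b, a)
    return {elem for elem in little if elem in big}


big = CountingSet(range(100_000))
little = CountingSet({5, 50, 500, -1})

result = intersection(big, little)
print(result)          # the three shared elements: 5, 50 and 500
print(big.lookups)     # one lookup per element of the smaller set
print(little.lookups)  # the smaller set is only iterated, never probed
```

The larger set has 100,000 elements, yet only four lookups are made against it, one for each element of the smaller set.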

  • 2020-12-09 11:00

    The answer appears to be a search engine query away. You can also use this direct link to the Time Complexity page at python.org. Quick summary:

    Average:     O(min(len(s), len(t)))
    Worst case:  O(len(s) * len(t))
    

    EDIT: As Raymond points out below, the "worst case" scenario isn't likely to occur. I included it originally to be thorough, and I'm leaving it to provide context for the discussion below, but I think Raymond's right.
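For context on where the worst case comes from: it needs pathological hash collisions. The contrived sketch below uses a hypothetical `BadHash` class whose instances all hash to the same value, so every lookup has to compare against many stored elements. The comparison counter is an illustration device, and the exact count depends on the interpreter's set implementation.

```python
class BadHash:
    """Hypothetical element type: every instance has the same hash."""

    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0  # deliberately terrible: all elements collide

    def __eq__(self, other):
        BadHash.eq_calls += 1
        return isinstance(other, BadHash) and self.value == other.value


s = {BadHash(i) for i in range(100)}
t = {BadHash(i) for i in range(100, 200)}  # disjoint from s

BadHash.eq_calls = 0          # ignore comparisons made while building the sets
common = s & t
collision_comparisons = BadHash.eq_calls

print(len(common))            # the sets are disjoint
print(collision_comparisons)  # far more than min(len(s), len(t)) == 100
```

With ordinary, well-distributed hashes (integers or strings, for example) this does not happen, which is why the average case is the figure that matters in practice.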
