Determining if an array has a k-majority element

天涯浪人 2021-01-02 08:47

Suppose that, given an n-element multiset A (not sorted), we want an O(n) time algorithm for determining whether A contains a majority element, i.e., an element that occurs

3 Answers
  • 2021-01-02 09:19

    I don't know if you've seen this one, but it may help to give you ideas:

    Suppose you know there is a majority element in an array L.

    One way to find the element is as follows:

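    def find_majority_element(L):
        # Boyer-Moore majority vote; only valid if L really has a majority element.
        count = 0
        candidate = None
        for x in L:
            if count == 0:
                # No surviving candidate: adopt the current element.
                candidate = x
            if x == candidate:
                count += 1
            else:
                count -= 1
        return candidate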
    

    O(n) time, O(1) space
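
When a majority is not guaranteed, the value returned by the vote is only a candidate, and a second pass can confirm it. A minimal sketch, assuming "majority" means more than half of the elements; the function names are made up for illustration:

```python
def find_majority_candidate(items):
    """Boyer-Moore vote: returns the majority element if one exists."""
    count = 0
    candidate = None
    for x in items:
        if count == 0:
            candidate = x
        if x == candidate:
            count += 1
        else:
            count -= 1
    return candidate


def majority_or_none(items):
    """Return the element occurring more than len(items)/2 times, else None."""
    candidate = find_majority_candidate(items)
    # Second pass: count how often the candidate really occurs.
    occurrences = sum(1 for x in items if x == candidate)
    return candidate if 2 * occurrences > len(items) else None


print(majority_or_none([3, 1, 3, 2, 3]))  # 3
print(majority_or_none([1, 2, 3]))        # None
```

The check adds one more O(n) pass and still uses O(1) extra space.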

  • 2021-01-02 09:22

    The method that you described just needs to be used recursively.

    Recall that select moves the elements that are less than or equal to the median to the left of the median.

    Let A be of size n.

    Find the median of A. Then find the median of each of the two sub-multisets of length n/2 produced by partitioning around that median, then the median of each of the four sub-multisets of length n/4, and so on. Continue recursively until the leaves have length n/k. The recursion tree then has height O(lg k), and each level costs O(n) operations.

    If some value is repeated at least n/k times, its copies are contiguous in sorted order, so that run must either cross one of the partition boundaries or fill an entire leaf; this leaves only O(k) candidates to check. The last step is also done in O(n), which gives the requested running time of O(n lg k).
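
One way this recursion might look in code is sketched below. It is a variant of the description, not a literal transcription: each part is split by value around its median (strictly smaller, strictly larger), so all copies of a value stay together and the median's count can be read off directly. `select` here is a randomized quickselect (expected linear time) standing in for a worst-case linear select, and all names are made up for illustration.

```python
import random


def select(items, rank):
    """Randomized quickselect: element of 0-based `rank` in sorted order."""
    while True:
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        highs = [x for x in items if x > pivot]
        n_equal = len(items) - len(lows) - len(highs)
        if rank < len(lows):
            items = lows
        elif rank < len(lows) + n_equal:
            return pivot
        else:
            rank -= len(lows) + n_equal
            items = highs


def frequent_by_medians(A, k):
    """Values occurring at least len(A)/k times, via recursive median splits."""
    n = len(A)
    found = []

    def recurse(part):
        # A part with fewer than n/k elements cannot hold a frequent value.
        if not part or len(part) * k < n:
            return
        median = select(part, len(part) // 2)
        lows = [x for x in part if x < median]
        highs = [x for x in part if x > median]
        # All copies of `median` are in this part, so this is its full count.
        if (len(part) - len(lows) - len(highs)) * k >= n:
            found.append(median)
        # Each side holds at most half of the part, so the depth is O(log k).
        recurse(lows)
        recurse(highs)

    recurse(list(A))
    return sorted(found)


print(frequent_by_medians([4, 2, 7, 4, 6], 3))  # [4]
```

Parts shrink by at least half at every level and recursion stops below n/k elements, which is where the O(log k) depth comes from.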

  • 2021-01-02 09:26

    O(kn) algorithm

    I wonder if perhaps the O(kn) algorithm might be more along the lines of:

    1. Find k regularly spaced elements in sorted order (using a linear-time selection algorithm similar to the one used for the median)
    2. Count how many matches you get for each of these

    The idea is that if an element occurs at least n/k times, its copies form a run of that length in sorted order, so it must be one of these.
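
A rough sketch of these two steps, assuming the target is an element occurring at least n/k times. The helper `select` is a randomized quickselect standing in for the worst-case linear selection algorithm, and both function names are hypothetical:

```python
import random


def select(items, rank):
    """Randomized quickselect: element of 0-based `rank` in sorted order."""
    while True:
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        highs = [x for x in items if x > pivot]
        n_equal = len(items) - len(lows) - len(highs)
        if rank < len(lows):
            items = lows
        elif rank < len(lows) + n_equal:
            return pivot
        else:
            rank -= len(lows) + n_equal
            items = highs


def frequent_by_sampling(A, k):
    """Values occurring at least len(A)/k times, via regularly spaced ranks."""
    n = len(A)
    if n == 0:
        return []
    # Any run of at least n/k equal values in sorted order covers one of these ranks.
    step = max(1, n // k)
    candidates = {select(A, rank) for rank in range(0, n, step)}
    # Exact count for each candidate: O(n) per candidate, O(k) candidates.
    return sorted(c for c in candidates if A.count(c) * k >= n)


print(frequent_by_sampling([4, 2, 7, 4, 6], 3))  # [4]
```

Both the selection step and the counting step do O(k) linear passes, which is where the O(kn) bound comes from.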

    O(nlogk) algorithm

    Perhaps you could use the scheme proposed in your question together with a tree structure to hold the k elements. This would then mean that the search for a match would only be log(k) instead of k, for an overall O(nlogk)?

    Note that you should use the tree for both the first pass (where you are finding k candidates that we need to consider) and for the second pass of computing the exact counts for each element.

    Also note that you would probably want to use a lazy evaluation scheme for decrementing the counters (i.e. mark whole subtrees that need to be decremented and propagate the decrements only when that path is next used).
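
The question's scheme is cut off above, so the following is only a guess at it: a generalization of the majority vote that keeps at most k candidate counters and decrements all of them when a new value finds no free slot. The sketch uses a plain dict instead of the tree suggested here, decrements eagerly instead of lazily, and the function name is hypothetical.

```python
from collections import Counter


def frequent_by_counters(A, k):
    """Values occurring at least len(A)/k times, using at most k counters."""
    counters = {}
    # First pass: find candidates.
    for x in A:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k:
            counters[x] = 1
        else:
            # No free slot: decrement every counter and drop the ones at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    # Second pass: exact counts for the surviving candidates only.
    exact = Counter(x for x in A if x in counters)
    n = len(A)
    return sorted(value for value, count in exact.items() if count * k >= n)


print(frequent_by_counters([4, 2, 7, 4, 6], 3))  # [4]
```

Every decrement round discards k + 1 elements (one from each counter plus the current element), so there are at most n/(k + 1) rounds, and a value with at least n/k occurrences always survives the first pass.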

    O(n) algorithm

    If you encounter this in real life, I would consider using a hash-based dictionary to store the histogram, as this should give a fast solution.

    e.g. in Python you could solve this in (on average) O(n) time using

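    from collections import Counter

    A = [4, 2, 7, 4, 6]
    k = 3

    # Histogram of A; most_common(1) gives the (element, count) pair with the highest count.
    element, count = Counter(A).most_common(1)[0]

    # Compare count * k with len(A) so that integer division cannot round the threshold down.
    if count * k >= len(A):
        print(element)
    else:
        print("there is no majority")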
    