How can I find a number which occurs an odd number of times in a SORTED array in O(n) time?

梦如初夏 2021-01-30 10:31

I have a question and I tried to think over it again and again... but got nothing so posting the question here. Maybe I could get some view-point of others, to try and make it w

15 answers
  • 2021-01-30 11:35

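    You can create a counting array that records how many times each number occurs, and then scan that array for the entry whose count is odd. This does not rely on the input being sorted, but it needs the values to be small non-negative integers. Example: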

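    int[] a = {2, 3, 4, 2, 3, 1, 4, 5, 6, 5, 6, 7, 1};
    // b[v] counts how often the value v occurs in a; values must lie in [0, 1000).
    // Java zero-initializes int arrays, so no explicit clearing loop is needed.
    int[] b = new int[1000];
    for (int i = 0; i < a.length; i++) {
        b[a[i]]++;
    }
    // Report the first value whose count is odd.
    for (int i = 0; i < b.length; i++) {
        if (b[i] % 2 != 0) {
            System.out.println(i);  // prints 7
            break;
        }
    }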
    
  • 2021-01-30 11:37

    Theorem: Every deterministic algorithm for this problem probes Ω(log² n) memory locations in the worst case.

    Proof (completely rewritten in a more formal style):

    Let k > 0 be an odd integer and let n = k². We describe an adversary that forces (log₂(k + 1))² = Ω(log² n) probes.

    We call the maximal subsequences of identical elements groups. The adversary's possible inputs consist of k length-k segments x_1 x_2 … x_k. For each segment x_j, there exists an integer b_j ∈ [0, k] such that x_j consists of b_j copies of j - 1 followed by k - b_j copies of j. Each group overlaps at most two segments, and each segment overlaps at most two groups.

    Group boundaries
    |   |     |   |   |
     0 0 1 1 1 2 2 3 3
    |     |     |     |
    Segment boundaries
    

    Wherever there is an increase of two, we assume a double boundary by convention.

    Group boundaries
    |     ||       |   |
     0 0 0  2 2 2 2 3 3
    

    Claim: The location of the jth group boundary (1 ≤ j ≤ k) is uniquely determined by the segment x_j.

    Proof: It's just after the ((j - 1) k + b_j)th memory location, and x_j uniquely determines b_j. //
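The construction and the boundary formula above can be checked on the small diagrams with a short Java sketch. The class name `AdversaryInput` and the methods `buildInput` and `boundaryAfter` are hypothetical names chosen for illustration; the array `b` holds the values b_1 … b_k, and the sketch does not check that the resulting input contains exactly one odd group.

```java
import java.util.Arrays;

final class AdversaryInput {
    // Builds the length k*k input whose jth segment (1-based) consists of
    // b[j-1] copies of j - 1 followed by k - b[j-1] copies of j.
    static int[] buildInput(int k, int[] b) {
        int[] input = new int[k * k];
        for (int j = 1; j <= k; j++) {
            int start = (j - 1) * k;  // 0-based index of the first cell of segment j
            for (int t = 0; t < k; t++) {
                input[start + t] = (t < b[j - 1]) ? j - 1 : j;
            }
        }
        return input;
    }

    // Number of memory locations that precede the jth group boundary:
    // (j - 1) * k + b_j, as stated in the claim.
    static int boundaryAfter(int k, int[] b, int j) {
        return (j - 1) * k + b[j - 1];
    }

    public static void main(String[] args) {
        int[] b = {2, 2, 1};
        System.out.println(Arrays.toString(buildInput(3, b)));  // [0, 0, 1, 1, 1, 2, 2, 3, 3]
        for (int j = 1; j <= 3; j++) {
            System.out.println(boundaryAfter(3, b, j));  // 2, then 5, then 7
        }
    }
}
```

With b = {3, 0, 1} the same code yields 0 0 0 2 2 2 2 3 3, and the first two boundaries both fall after location 3, which is the double boundary from the second diagram.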

    We say that the algorithm has observed the jth group boundary in case the results of its probes of x_j uniquely determine x_j. By convention, the beginning and the end of the input are always observed. It is possible for the algorithm to uniquely determine the location of a group boundary without observing it.

    Group boundaries
    |   X   |   |     |
     0 0 ? 1 2 2 3 3 3
    |     |     |     |
    Segment boundaries
    

    Given only 0 0 ?, the algorithm cannot tell for sure whether ? is a 0 or a 1. In context, however, ? must be a 1, as otherwise there would be three odd groups, and the group boundary at X can be inferred. These inferences could be problematic for the adversary, but it turns out that they can be made only after the group boundary in question is "irrelevant".

    Claim: At any given point during the algorithm's execution, consider the set of group boundaries that it has observed. Exactly one consecutive pair is at odd distance, and the odd group lies between them.

    Proof: Every other consecutive pair bounds only even groups. //

    Define the odd-length subsequence bounded by the special consecutive pair to be the relevant subsequence.

    Claim: No group boundary in the interior of the relevant subsequence is uniquely determined. If there is at least one such boundary, then the identity of the odd group is not uniquely determined.

    Proof: Without loss of generality, assume that each memory location not in the relevant subsequence has been probed and that each segment contained in the relevant subsequence has exactly one location that has not been probed. Suppose that the jth group boundary (call it B) lies in the interior of the relevant subsequence. By hypothesis, the probes to x_j determine B's location up to two consecutive possibilities. We call the one at odd distance from the left observed boundary odd-left and the other odd-right. For both possibilities, we work left to right and fix the location of every remaining interior group boundary so that the group to its left is even. (We can do this because they each have two consecutive possibilities as well.) If B is at odd-left, then the group to its left is the unique odd group. If B is at odd-right, then the last group in the relevant subsequence is the unique odd group. Both are valid inputs, so the algorithm has uniquely determined neither the location of B nor the odd group. //

    Example:

    Observed group boundaries; relevant subsequence marked by […]
    [             ]   |
     0 0 Y 1 1 Z 2 3 3
    |     |     |     |
    Segment boundaries
    
    Possibility #1: Y=0, Z=2
    Possibility #2: Y=1, Z=2
    Possibility #3: Y=1, Z=1
    

    As a consequence of this claim, the algorithm, regardless of how it works, must narrow the relevant subsequence to one group. By definition, it therefore must observe some group boundaries. The adversary now has the simple task of keeping open as many possibilities as it can.

    At any given point during the algorithm's execution, the adversary is internally committed to one possibility for each memory location outside of the relevant subsequence. At the beginning, the relevant subsequence is the entire input, so there are no initial commitments. Whenever the algorithm probes an uncommitted location of x_j, the adversary must commit to one of two values: j - 1, or j. If it can avoid letting the jth boundary be observed, it chooses a value that leaves at least half of the remaining possibilities (with respect to observation). Otherwise, it chooses so as to keep at least half of the groups in the relevant interval and commits values for the others.

    In this way, the adversary forces the algorithm to observe at least log₂(k + 1) group boundaries, and in observing the jth group boundary, the algorithm is forced to make at least log₂(k + 1) probes.


    Extensions:

    This result extends straightforwardly to randomized algorithms by randomizing the input, replacing "at best halved" (from the algorithm's point of view) with "at best halved in expectation", and applying standard concentration inequalities.

    It also extends to the case where no group can be larger than s copies; in this case the lower bound is Ω(log n log s).

  • 2021-01-30 11:37

    Look at the middle element of the array. With a couple of appropriate binary searches, you can find its first and last appearances in the array. E.g., if the middle element is 'a', you need to find i and j as shown below:

    [* * * * a a a a * * *]
             ^     ^ 
             |     |
             |     |
             i     j
    

    Is j - i an even number? Then 'a' occurs j - i + 1 times, which is odd, and you are done! Otherwise (and this is the key here), the question to ask is: is i an even or an odd number? Do you see what this piece of knowledge implies? Then the rest is easy.
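One way to finish the idea, as a minimal Java sketch rather than the answerer's own code: if the middle group is even, the parity of the number of elements to its left tells you which side holds the odd group. The class name `OddGroupSearch` and its method names are hypothetical, and the sketch assumes a sorted array in which exactly one value occurs an odd number of times. Each round costs two binary searches, so this takes O(log² n) probes in the worst case.

```java
final class OddGroupSearch {
    // First index in [lo, hi] holding value; assumes a[hi] == value and a is sorted.
    static int firstIndexOf(int[] a, int lo, int hi, int value) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] < value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Last index in [lo, hi] holding value; assumes a[lo] == value and a is sorted.
    static int lastIndexOf(int[] a, int lo, int hi, int value) {
        while (lo < hi) {
            int mid = lo + (hi - lo + 1) / 2;
            if (a[mid] > value) hi = mid - 1; else lo = mid;
        }
        return lo;
    }

    // Returns the value that occurs an odd number of times.
    static int findOddOccurring(int[] a) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int i = firstIndexOf(a, lo, mid, a[mid]);
            int j = lastIndexOf(a, mid, hi, a[mid]);
            if ((j - i) % 2 == 0) {
                return a[mid];  // the group has j - i + 1 elements, an odd count
            }
            // The middle group is even. An odd number of elements to its left
            // means the odd group is on the left, otherwise it is on the right.
            if ((i - lo) % 2 == 1) hi = i - 1; else lo = j + 1;
        }
        throw new IllegalArgumentException("no value occurs an odd number of times");
    }

    public static void main(String[] args) {
        System.out.println(findOddOccurring(new int[]{1, 1, 2, 2, 2, 3, 3}));  // prints 2
        System.out.println(findOddOccurring(new int[]{5, 7, 7, 9, 9}));        // prints 5
    }
}
```

The sketch compares i against the current left end `lo` instead of testing the parity of i directly; `lo` always stays even here, so the two tests agree.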
