How can I find a number which occurs an odd number of times in a SORTED array in O(n) time?

梦如初夏 2021-01-30 10:31

I have a question and I tried to think over it again and again... but got nothing so posting the question here. Maybe I could get some view-point of others, to try and make it w

15 answers
  • 2021-01-30 11:11

    Start at the middle of the array and walk backward until you reach a value that differs from the one at the center. Check whether the element just after that boundary (the first occurrence of the middle value) sits at an odd or even index. If it's odd, the number occurring an odd number of times is to the left, so repeat your search between the beginning and the boundary you found. If it's even, that number must be at or after this position, so repeat the search in the right half.

    As stated, this has both a logarithmic and a linear component. If you want to keep the whole thing logarithmic, instead of just walking backward through the array to a different value, you want to use a binary search instead. Unless you expect many repetitions of the same numbers, the binary search may not be worthwhile though.
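
    The walk-back idea can be sketched roughly as follows. This is a minimal Python sketch, not the answerer's code: the function name `find_odd_walk_back` is hypothetical, it assumes the array is sorted and that exactly one value occurs an odd number of times, and it also walks forward to measure the run (a step the answer does not spell out) so that the loop has a clear stopping condition.

```python
def find_odd_walk_back(arr):
    """Return the value with an odd count in sorted arr (assumes exactly one)."""
    lo, hi = 0, len(arr)          # the odd run lies inside arr[lo:hi]
    while True:
        mid = (lo + hi) // 2
        # Walk backward to the first index of the run containing mid.
        first = mid
        while first > lo and arr[first - 1] == arr[mid]:
            first -= 1
        # Walk forward to one past the last index of that run.
        end = mid + 1
        while end < hi and arr[end] == arr[mid]:
            end += 1
        if (end - first) % 2 == 1:
            return arr[mid]       # this run itself has odd length
        if first % 2 == 1:
            hi = first            # odd start: the odd run is to the left
        else:
            lo = end              # even start: the odd run is to the right
```

    The two inner walks are the linear component; swapping them for binary searches gives the fully logarithmic variant.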

  • AHhh. There is an answer.

    Do a binary search and, at each probe, move backward until you find the first entry with that same value. If its index is even, that entry is at or before the oddball, so move to the right.
    If its index is odd, it is after the oddball, so move to the left.

    In pseudocode (this is the general idea, not tested...):

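        // Returns the value that occurs an odd number of times.
        // Assumes ary is sorted, non-empty, and has exactly one such value.
        private static int FindOddBall(int[] ary)
        {
            int l = 0,
                r = ary.Length;          // the oddball's run lies inside [l, r)
            while (true)
            {
                int n = l + (r - l) / 2;
                // first index of the run containing n, and one past its last index
                int first = FindBreakIndex(ary, l, n, ary[n], false);
                int end = FindBreakIndex(ary, n, r, ary[n], true);
                if ((end - first) % 2 != 0)
                    return ary[n];       // this run has odd length: found it
                if (first % 2 == 0)      // even index: we are to the left of the oddball
                    l = end;
                else                     // odd index: we are to the right of the oddball
                    r = first;
            }
        }

        // Binary search in [lo, hi): first index whose element is >= t
        // (or > t when strictlyGreater is true); returns hi if there is none.
        private static int FindBreakIndex(int[] ary, int lo, int hi, int t, bool strictlyGreater)
        {
            while (lo < hi)
            {
                int mid = lo + (hi - lo) / 2;
                bool before = strictlyGreater ? ary[mid] <= t : ary[mid] < t;
                if (before)
                    lo = mid + 1;
                else
                    hi = mid;
            }
            return lo;
        }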
    
  • 2021-01-30 11:13

    We don't have any information about the distribution of run lengths inside the array, or about the length of the array as a whole, right?

    So the array length might be 1, 11, 101, 1001 or so (at least 1, with no upper bound), and the array must contain at least 1 distinct element ('number') and at most (length-1)/2 + 1 of them: for total sizes of 1, 11 and 101 that is 1, 1 to 6 and 1 to 51 distinct elements, and so on.

    Shall we assume every possible size to be equally probable? This would lead to an average sublist length of size/4, wouldn't it?

    An array of size 5 could be divided into 1, 2 or 3 sublists.

    What seems to be obvious is not that obvious, if we go into details.

    An array of size 5 can be 'divided' into one sublist in just one way, with arguable right to call it 'dividing'. It's just a list of 5 elements (aaaaa). To avoid confusion let's assume the elements inside the list to be ordered characters, not numbers (a,b,c, ...).

    Divided into two sublist, they might be (1, 4), (2, 3), (3, 2), (4, 1). (abbbb, aabbb, aaabb, aaaab).

    Now let's look back at the claim made before: shall the single 'division' (5) be assumed to have the same probability as each of those 4 divisions into 2 sublists? Or shall we mix them together and assume every partition to be equally probable (1/5)?

    Or can we calculate the solution without knowing the probability of the length of the sublists?

  • 2021-01-30 11:18

    The clue is you're looking for log(n). That's less than n.

    Stepping through the entire array, one at a time? That's n. That's not going to work.

    We know the first two indexes in the array (0 and 1) should be the same number. Same with 50 and 51, if the odd number in the array is after them.

    So find the middle element in the array, compare it to the element right after it. If the change in numbers happens on the wrong index, we know the odd number in the array is before it; otherwise, it's after. With one set of comparisons, we figure out which half of the array the target is in.

    Keep going from there.
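
    One way to read the steps above, as a minimal Python sketch. It rests on a stronger assumption than the question states: every value occurs exactly twice except one, which occurs exactly once. Only under that assumption is a single comparison with the neighbour enough; longer runs need a search for the run boundary. The function name `find_single_in_pairs` is hypothetical.

```python
def find_single_in_pairs(arr):
    """Assumes sorted arr where every value occurs exactly twice except one,
    which occurs exactly once (a stronger assumption than the question's)."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if mid % 2 == 1:
            mid -= 1              # align mid to an even index
        if arr[mid] == arr[mid + 1]:
            lo = mid + 2          # pair starts on an even index: single is later
        else:
            hi = mid              # pairing is already broken: single is at or before mid
    return arr[lo]
```

    Each iteration discards about half of the remaining range, so this runs in O(log n) comparisons.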

  • 2021-01-30 11:19

    A sorted array suggests a binary search. We have to redefine equality and comparison. Equality simply means the group has an odd number of elements. We can do comparison by observing the index of the first or last element of the group. The first element will be at an even index (0-based) before the odd group, and at an odd index after the odd group. We can find the first and last elements of a group using binary search. The total cost is O((log N)²).

    PROOF OF O((log N)²)

      T(2) = 1 //to make the summation nice
      T(N) = log(N) + T(N/2) //log(N) is finding the first/last elements
    

    For some N=2^k,

    T(2^k) = (log 2^k) + T(2^(k-1))
           = (log 2^k) + (log 2^(k-1)) + T(2^(k-2))
           = (log 2^k) + (log 2^(k-1)) + (log 2^(k-2)) + ... + (log 2^2) + 1
           = k + (k-1) + (k-2) + ... + 1
           = k(k+1)/2
           = (k² + k)/2
           = (log(N)² + log(N))/ 2
           = O(log(N)²)
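
    The recurrence corresponds to an outer binary search that runs two inner binary searches per step. A minimal Python sketch using the standard `bisect` module, assuming a sorted, non-empty list in which exactly one value occurs an odd number of times; the function name `find_odd_bisect` is hypothetical.

```python
from bisect import bisect_left, bisect_right

def find_odd_bisect(arr):
    """Return the value with an odd count in sorted arr (assumes exactly one)."""
    lo, hi = 0, len(arr)                  # the odd group lies inside arr[lo:hi]
    while True:
        mid = (lo + hi) // 2
        # Inner binary searches: first index of the group and one past its last.
        first = bisect_left(arr, arr[mid], lo, hi)
        end = bisect_right(arr, arr[mid], lo, hi)
        if (end - first) % 2 == 1:
            return arr[mid]               # "equality": the group has odd size
        if first % 2 == 1:
            hi = first                    # odd first index: we are after the odd group
        else:
            lo = end                      # even first index: we are before the odd group
```

    The window at least halves on every pass, and each pass costs O(log N) for the two `bisect` calls, which matches T(N) = log(N) + T(N/2).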
    
  • 2021-01-30 11:19

    I have an algorithm which works in log(N/C)*log(K), where K is the length of the longest same-value range, and C is the length of the range being searched for.

    The main difference between this algorithm and most of those posted before is that it takes advantage of the case where all same-value ranges are short. It finds boundaries not by binary-searching the entire array, but by first quickly finding a rough estimate by jumping back by 1, 2, 4, 8, ... steps (log(K) iterations), and then binary-searching the resulting range (log(K) again).

    The algorithm is as follows (written in C#):

    // Finds the start of the range of equal numbers containing the index "index", 
    // which is assumed to be inside the array
    // 
    // Complexity is O(log(K)) with K being the length of range
    static int findRangeStart (int[] arr, int index)
    {
        int candidate = index;
        int value = arr[index];
        int step = 1;
    
        // find the boundary for binary search:
        while(candidate>=0 && arr[candidate] == value)
        {
            candidate -= step;
            step *= 2;
        }
    
        // binary search:
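        // a is an index known NOT to hold value; -1 is a sentinel for "before the array",
        // so a range that starts at index 0 is still found
        int a = Math.Max(-1,candidate);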
        int b = candidate+step/2;
        while(a+1!=b)
        {
            int c = (a+b)/2;
            if(arr[c] == value)
                b = c;
            else
                a = c;
        }
        return b;
    }
    
    // Finds the index after the only "odd" range of equal numbers in the array.
    // The result should be in the range (start; end]
    // The "end" is considered to always be the end of some equal number range.
    static int search(int[] arr, int start, int end)
    {
        if(arr[start] == arr[end-1])
            return end;
    
        int middle = (start+end)/2;
    
        int rangeStart = findRangeStart(arr,middle);
    
        if((rangeStart & 1) == 0)
            return search(arr, middle, end);
        return search(arr, start, rangeStart);
    }
    
    // Finds the index after the only "odd" range of equal numbers in the array
    static int search(int[] arr)
    {
        return search(arr, 0, arr.Length);
    }
    