Algorithm to find two repeated numbers in an array, without sorting

南方客 2020-11-28 06:58

There is an array of size n (numbers are between 0 and n - 3) and only 2 numbers are repeated. Elements are placed randomly in the array.

E.g. in {2, 3, 6, 1, 5, 4

24 answers
  • 2020-11-28 07:23

    Why do the maths at all (especially solving quadratic equations)? Those are costly operations. A better way is to construct a bitmap of n - 2 bits, one per possible value 0..n-3, i.e. ((n - 2) + 7) / 8 bytes. Allocate it with calloc so that every bit is initialized to 0. Then traverse the list and set the bit for each number as it is encountered; if the bit is already 1 for that number, it is a repeated number. The same bitmap can be used to check whether any number is missing from the array. This solution is O(n) in time.
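
    The bitmap idea above can be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the function name `repeated_with_bitmap` is made up here, and a zero-filled `bytearray` stands in for the calloc'ed buffer.

```python
def repeated_with_bitmap(arr):
    """Return the repeated values of arr, in the order their repeats are seen.

    Assumes every value lies in 0..len(arr)-3, as in the question.
    """
    n = len(arr)
    # One bit per possible value 0..n-3, rounded up to whole bytes.
    # bytearray(k) is zero-filled, which plays the role of calloc here.
    bitmap = bytearray((n - 2 + 7) // 8)
    repeated = []
    for x in arr:
        byte, mask = x >> 3, 1 << (x & 7)
        if bitmap[byte] & mask:
            repeated.append(x)      # bit already set: x was seen before
        else:
            bitmap[byte] |= mask    # first sighting: mark x as seen
    return repeated
```

    For the array from the question, `repeated_with_bitmap([2, 3, 6, 1, 5, 4, 0, 3, 5])` returns `[3, 5]`.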

  • 2020-11-28 07:24

    You can use a simple nested for loop (O(n²) comparisons):

    int[] numArray = new int[] { 1, 2, 3, 4, 5, 7, 8, 3, 7 };

    for (int i = 0; i < numArray.Length; i++)
    {
        for (int j = i + 1; j < numArray.Length; j++)
        {
            if (numArray[i] == numArray[j])
            {
                // numArray[i] is a repeated value: DO SOMETHING
            }
        }
    }

    *Or you can filter the array and use a recursive function if you want the count of occurrences of each value:*

    int[] array = { 1, 2, 3, 4, 5, 4, 4, 1, 8, 9, 23, 4, 6, 8, 9, 1, 4 };

    // Requires `using System;` and `using System.Linq;`.
    void GetDuplicates(int[] array)
    {
        if (array.Length == 0) return;   // base case: nothing left to count

        int a = 1;                       // occurrences of array[0], counting itself
        for (int j = 1; j < array.Length; j++)
        {
            if (array[0] == array[j])
            {
                a += 1;
            }
        }
        Console.WriteLine(" {0} occurred {1} time/s", array[0], a);

        // Drop every copy of array[0], then recurse on what remains.
        int[] myNewArray = (from n in array where n != array[0] select n).ToArray();
        GetDuplicates(myNewArray);
    }
    
  • 2020-11-28 07:28
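    /* Compares neighbours only, so it finds a repeat only when equal
       values are adjacent (e.g. the array is already sorted). */
    for (i = 0; i + 1 < n; i++) {
      if (!(arr[i] ^ arr[i + 1]))   /* XOR is 0 exactly when both are equal */
            printf("Found Repeated number %5d\n", arr[i]);
    }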
    
  • 2020-11-28 07:33

    I know the question is very old, but I came across it and think I have an interesting answer. We know this is a brainteaser, and a trivial solution (e.g. HashMap, sort, etc.), no matter how good, would be boring.

    As the numbers are integers, they have a constant bit size (e.g. 32). Let us assume for now that we are working with 4-bit integers. We are looking for A and B, the duplicate numbers.

    We need 4 buckets, one per bit. Each bucket holds the sum of the numbers whose corresponding bit is 1. For example, bucket 1 gets 2, 3, 6, 7, ...:

    Bucket 0 : Sum ( x where: x & (2 power 0) != 0 )
    ...
    Bucket i : Sum ( x where: x & (2 power i) != 0 )
    

    We know what the sum of each bucket would be if there were no duplicates. I treat this as prior knowledge.

    Once the buckets above are generated, some of them will have sums larger than expected. Setting the bit of every such bucket constructs the number A OR B.

    We can calculate (A XOR B) as follows:

    A XOR B = (Array[0] XOR Array[1] XOR ... XOR Array[n-1]) XOR (0 XOR 1 XOR ... XOR (n-3))
    

    Now going back to buckets, we know exactly which buckets have both our numbers and which ones have only one (from the XOR bit).

    For the buckets that have only one number, we can extract the number: num = (sum - expected sum of bucket). We only need to find one of the duplicates this way, because the other is then num XOR (A XOR B). So if A XOR B has at least one bit set, we have the answer.

    But what if A XOR B is zero? This is only possible if both duplicates are the same number, in which case that number is A OR B.
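
    A rough Python sketch of the bucket idea follows. It is one possible reading of the description above, not a reference implementation: the name `two_repeated_buckets` is made up, and instead of storing expected sums separately it subtracts the expected contents (0..n-3, once each) from every bucket, so each bucket ends up holding only its excess.

```python
def two_repeated_buckets(arr):
    """Find the two repeated values using one running sum per bit plus an XOR.

    Assumes every value lies in 0..len(arr)-3, as in the question.
    """
    n = len(arr)
    bits = max(1, (n - 1).bit_length())  # enough bits for every value 0..n-3
    excess = [0] * bits                  # excess[i]: bucket i sum minus expected sum
    xor_all = 0                          # ends up as A XOR B

    for x in arr:                        # actual contents of the array
        xor_all ^= x
        for i in range(bits):
            if (x >> i) & 1:
                excess[i] += x
    for v in range(n - 2):               # expected contents: 0..n-3 once each
        xor_all ^= v
        for i in range(bits):
            if (v >> i) & 1:
                excess[i] -= v

    if xor_all == 0:
        # A == B: rebuild A OR B from the buckets that hold more than expected.
        same = sum(1 << i for i in range(bits) if excess[i] > 0)
        return same, same

    # Pick a bit where A and B differ; that bucket's excess is one of them.
    i = (xor_all & -xor_all).bit_length() - 1
    a = excess[i]
    b = a ^ xor_all
    return min(a, b), max(a, b)
```

    With the question's array, `two_repeated_buckets([2, 3, 6, 1, 5, 4, 0, 3, 5])` gives `(3, 5)`: A XOR B is 6, bucket 1 has an excess of 3, and 3 XOR 6 is 5.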

  • 2020-11-28 07:33

    Here's an implementation in Python of @eugensk00's answer (one of its revisions) that doesn't use modular arithmetic. It is a single-pass algorithm, O(log(n)) in space. If fixed-width (e.g. 32-bit) integers are used, it requires only two fixed-width accumulators (e.g. for 32-bit input: one 64-bit number and one 128-bit number). It can handle arbitrarily large integer sequences: it reads one integer at a time, so the whole sequence doesn't need to be in memory.

    def two_repeated(iterable):
        s1, s2 = 0, 0
        for i, j in enumerate(iterable):
            s1 += j - i     # number_of_digits(s1) ~ 2 * number_of_digits(i)
            s2 += j*j - i*i # number_of_digits(s2) ~ 4 * number_of_digits(i) 
        s1 += (i - 1) + i
        s2 += (i - 1)**2 + i**2
    
        p = (s1 - int((2*s2 - s1**2)**.5)) // 2 
        # `Decimal().sqrt()` could replace `int()**.5` for really large integers
        # or any function to compute integer square root
        return p, s1 - p
    

    Example:

    >>> two_repeated([2, 3, 6, 1, 5, 4, 0, 3, 5])
    (3, 5)
    

    A more verbose version of the above code follows with explanation:

    def two_repeated_seq(arr):
        """Return the only two duplicates from `arr`.
    
        >>> two_repeated_seq([2, 3, 6, 1, 5, 4, 0, 3, 5])
        (3, 5)
        """
        n = len(arr)
        assert all(0 <= i < n - 2 for i in arr) # all in range [0, n-2)
        assert len(set(arr)) == (n - 2) # number of unique items
    
        s1 = (n-2) + (n-1)       # s1 and s2 have ~ 2*(k+1) and 4*(k+1) digits  
        s2 = (n-2)**2 + (n-1)**2 # where k is a number of digits in `max(arr)`
        for i, j in enumerate(arr):
            s1 += j - i     
            s2 += j*j - i*i
    
        """
        s1 = (n-2) + (n-1) + sum(arr) - sum(range(n))
           = sum(arr) - sum(range(n-2))
           = sum(range(n-2)) + p + q - sum(range(n-2))
           = p + q
        """
        assert s1 == (sum(arr) - sum(range(n-2)))
    
        """
        s2 = (n-2)**2 + (n-1)**2 + sum(i*i for i in arr) - sum(i*i for i in range(n))
           = sum(i*i for i in arr) - sum(i*i for i in range(n-2))
           = p*p + q*q
        """
        assert s2 == (sum(i*i for i in arr) - sum(i*i for i in range(n-2)))
    
        """
        s1 = p+q
        -> s1**2 = (p+q)**2
        -> s1**2 = p*p + 2*p*q + q*q
        -> s1**2 - (p*p + q*q) = 2*p*q
        s2 = p*p + q*q
        -> p*q = (s1**2 - s2)/2
    
        Let C = p*q = (s1**2 - s2)/2 and B = p+q = s1; then from Vieta's theorem it follows
        that p and q are roots of x**2 - B*x + C = 0
        -> p = (B + sqrtD) / 2
        -> q = (B - sqrtD) / 2
        where sqrtD = sqrt(B**2 - 4*C)
    
        -> p = (s1 + sqrt(2*s2 - s1**2))/2
        """
        sqrtD = (2*s2 - s1**2)**.5
        assert int(sqrtD)**2 == (2*s2 - s1**2) # perfect square
        sqrtD = int(sqrtD)
        assert (s1 - sqrtD) % 2 == 0 # even
        p = (s1 - sqrtD) // 2
        q = s1 - p
        assert q == ((s1 + sqrtD) // 2)
        assert sqrtD == (q - p)
        return p, q
    

    NOTE: calculating integer square root of a number (~ N**4) makes the above algorithm non-linear.
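
    If floating-point rounding in `int(x**.5)` is a concern for large inputs, one option on Python 3.8+ is the standard library's `math.isqrt`, which returns the exact integer square root of an arbitrarily large non-negative int. The variant below is a sketch along those lines; the name `two_repeated_isqrt` is made up here, and the note above about the cost of the square root still applies.

```python
import math


def two_repeated_isqrt(iterable):
    """Single-pass variant of two_repeated() that uses an exact integer sqrt."""
    s1, s2 = 0, 0
    i = -1
    for i, j in enumerate(iterable):
        s1 += j - i          # running sum(arr) - sum(indices)
        s2 += j*j - i*i      # running sum of squares, same idea
    # Indices run to n-1 but values only to n-3: add back the last two indices.
    s1 += (i - 1) + i
    s2 += (i - 1)**2 + i**2

    # Now s1 = p + q and s2 = p*p + q*q, so 2*s2 - s1**2 = (q - p)**2 exactly.
    root = math.isqrt(2*s2 - s1*s1)
    p = (s1 - root) // 2
    return p, s1 - p
```

    As before, `two_repeated_isqrt([2, 3, 6, 1, 5, 4, 0, 3, 5])` returns `(3, 5)`.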

  • 2020-11-28 07:33

    Here is an algorithm that uses order statistics and runs in O(n).

    You can solve this by repeatedly calling SELECT with the median as parameter.

    You also rely on the fact that, after a call to SELECT, the elements that are less than or equal to the median are moved to the left of the median.

    • Call SELECT on A with the median as the parameter.
    • If the median value is floor(n/2), then the repeated values are to the right of the median, so you continue with the right half of the array.
    • Otherwise, a repeated value is to the left of the median, so you continue with the left half of the array.
    • You continue this way recursively.

    For example:

    • When A={2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, then the median should be the value 4.
    • After the first call to SELECT
    • A={3, 2, 0, 1, <3>, 4, 5, 6, 5} The median value is smaller than 4 so we continue with the left half.
    • A={3, 2, 0, 1, 3}
    • After the second call to SELECT
    • A={1, 0, <2>, 3, 3} the median should be 2, and it is, so we continue with the right half.
    • A={3, 3}, found.

    This algorithm runs in O(n+n/2+n/4+...)=O(n).
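
    The halving idea can be sketched in Python as below. Note that this is a simplified variant, not the SELECT-based procedure itself: instead of calling a linear-time SELECT, it partitions around the midpoint of the value range, which works here because every value 0..n-3 is known to be present. The function names are made up, and recovering the second value from the sum is an addition of this sketch, since the walk-through above stops once one repeated value is located.

```python
def find_one_repeated(arr):
    """Return one repeated value; assumes values cover 0..len(arr)-3."""
    items = list(arr)
    lo, hi = 0, len(arr) - 2          # candidate values lie in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = [x for x in items if x < mid]
        right = [x for x in items if x >= mid]
        # [lo, mid) has mid - lo distinct values; more elements means a repeat.
        if len(left) > mid - lo:
            items, hi = left, mid
        else:
            items, lo = right, mid
    return lo


def two_repeated_by_halving(arr):
    """Locate one repeat by halving, then get the other from the sum."""
    n = len(arr)
    first = find_one_repeated(arr)
    # sum(arr) - sum(0..n-3) equals the sum of the two repeated values.
    second = sum(arr) - sum(range(n - 2)) - first
    return min(first, second), max(first, second)
```

    Each round keeps at most about half of the remaining elements, so the total work is still O(n + n/2 + n/4 + ...) = O(n). For the example array, `two_repeated_by_halving([2, 3, 6, 1, 5, 4, 0, 3, 5])` returns `(3, 5)`.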
