Find the number that appears exactly N/2 times

旧巷少年郎 2021-01-29 23:17

Here is one of my interview questions. Given an array of N elements in which one element appears exactly N/2 times and the remaining N/2 elements are unique, find the repeated element.

20 answers
  • 2021-01-29 23:42

    I think you simply need to walk through the array while keeping a backlog of two elements. Since N/2 elements are equal and the rest are guaranteed to be distinct, there must be some index i in the array where

    a[i] == a[i-1] OR a[i] == a[i-2]

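    Iterate once through the array and you have a cost of roughly 2*N comparisons, which is well within O(N). One caveat: for N = 4 the arrangement x, a, b, x has no such index, so that case also needs a check of the first element against the last.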

    This answer is somewhat similar to the answer by Ganesh M and Dougie, but I think a little simpler.

  • 2021-01-29 23:42

    Peter is exactly right. Here is a more formal way of restating his proof:

    Let S be a collection containing N elements. It is the union of two sets: p, which contains a symbol α repeated N/2 times, and q, which contains N/2 unique symbols ω1..ωN/2. S = p ∪ q.

    Assume there is an algorithm that can detect your duplicated number in log2 N comparisons in the worst case for all N > 2. "In the worst case" means that there does not exist any subset r ⊂ S such that |r| = log2 N and α ∉ r.

    However, because S = p ∪ q, there are |q| elements ≠ α in S. |q| = N/2, so for every N with N/2 ≥ log2 N there must exist at least one set r ⊂ S such that |r| = log2 N and α ∉ r. This is the case for any N ≥ 4. This contradicts the assumption above, so there cannot be any such algorithm.

    QED.

  • 2021-01-29 23:42

    The answer is straightforward and can be achieved in (n/2 + 1) comparisons in the worst case.

    1. Compare the first (n-2) numbers pairwise: compare the numbers at indices 0 and 1, then 2 and 3, and so on, for a total of n/2 - 1 comparisons. If any of these comparisons finds identical numbers, we have the repeated number. Otherwise:

    2. Take either of the last two remaining numbers (say, the second-to-last one) and compare it with the two numbers in the pair just before them (indices n-4 and n-3). If a match occurs, the second-to-last number is the repeated one; otherwise the last one is. This takes 2 comparisons.

    Total comparisons = n/2 - 1 + 2 = n/2 + 1 (worst case). I don't think there is any O(log n) method to achieve this.
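
    The two steps above can be sketched in Ruby roughly as follows. This is only an illustration: the method name find_by_pairs is made up here, and the code assumes n is even, n >= 4, and the input really does contain one value exactly n/2 times.

```ruby
# Hypothetical helper illustrating the pairwise strategy described above.
# Assumes a.size is even and >= 4, and that exactly one value fills half the array.
def find_by_pairs(a)
  n = a.size
  raise ArgumentError, "need an even size >= 4" unless n >= 4 && n.even?

  # Step 1: compare (0,1), (2,3), ..., (n-4, n-3): n/2 - 1 comparisons.
  (0...n - 2).step(2) do |i|
    return a[i] if a[i] == a[i + 1]
  end

  # Step 2: compare the second-to-last element with the pair just before it.
  # The || short-circuits, so this costs at most 2 comparisons.
  candidate = a[n - 2]
  return candidate if candidate == a[n - 4] || candidate == a[n - 3]

  # No match anywhere, so the last element must be the repeated one.
  a[n - 1]
end

p find_by_pairs([1, 7, 2, 7, 3, 7])
```

    Step 2 works because, when step 1 finds nothing, each earlier pair holds at most one copy of the repeated value, so at least one of the last two elements must be a copy.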

  • 2021-01-29 23:43

    To do it in less than O(n) you would have to avoid reading all the numbers.
    If you know there is a value that satisfies the relationship, then you could just sample a small subset and show that only one number appears enough times to meet the relationship. You would have to assume the values are reasonably uniformly distributed.

    Edit: you would have to read n/2 elements to prove that such a number exists, but if you knew one existed and only wanted to find it, you could read sqrt(n) samples.
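
    A small Ruby sketch of the sampling idea, assuming the repeated value is known to exist. The name find_by_sampling and the sample size are illustrative choices, not part of the answer above. Because every other value is unique, any value seen twice among distinct positions has to be the repeated one.

```ruby
# Hypothetical helper: inspect k randomly chosen distinct positions and return
# the first value that shows up twice among them, or nil if the sample misses.
def find_by_sampling(a, k, rng = Random.new)
  seen = {}
  a.sample(k, random: rng).each do |value|
    return value if seen[value]
    seen[value] = true
  end
  nil
end

# 50 copies of 7 mixed with 50 unique values, then roughly sqrt(n) samples.
data = ([7] * 50 + (100...150).to_a).shuffle
k = Math.sqrt(data.size).ceil
p find_by_sampling(data, k)   # 7 in most runs, nil when the sample is unlucky
```

    A nil result only means the sample was unlucky; the caller can simply draw again.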

  • 2021-01-29 23:45

    For worst-case deterministic behavior, O(N) is correct (I've already seen more than one proof in the previous answers).

    However, modern algorithmic theory isn't concerned just with worst-case behavior (that's why there are so many other big-somethings besides big-O, even though lazy programmers in a hurry often use big-O even when what they have in mind is closer to big-theta OR big-omega;-), nor just with determinism (witness the Miller-Rabin primality test...;).

    Any random sample of K < N items will show no duplicates with a probability of roughly (K+1)/2**K when K is much smaller than N -- easily and rapidly reduced to essentially as low as you wish no matter what N is (e.g. you could reduce it to less than the probability that a random cosmic ray will accidentally and undetectably flip a bit in your memory;-) -- this observation hardly requires the creativity Rabin and Miller needed to find their probabilistic prime testing approach;-).
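
    To put numbers on how fast that probability falls, here is a rough Ruby sketch that computes it exactly for a sample of K distinct positions. The helper names choose and miss_probability are invented for this illustration, and the code assumes N is even and 1 <= K <= N. A sample shows no duplicate only when it contains zero copies or exactly one copy of the repeated value.

```ruby
# Binomial coefficient C(n, k), computed with exact integer arithmetic.
def choose(n, k)
  return 0 if k < 0 || k > n
  (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
end

# Hypothetical helper: exact probability that k distinct random positions out
# of n hold at most one copy of the repeated value (so no duplicate is seen).
def miss_probability(n, k)
  half = n / 2
  misses = choose(half, k) + half * choose(half, k - 1)
  Rational(misses, choose(n, k))
end

[8, 16, 32, 64].each do |k|
  printf("K = %2d  miss probability ~ %.3g\n", k, miss_probability(1_000_000, k).to_f)
end
```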

    This would make a pretty lousy interview question. Similar less-lousy questions are often posed, often mis-answered, and often mis-remembered by unsuccessful candidates. For example, a typical question might be, given an array of N items, not knowing whether there is a majority item, to determine whether there is one, and which one it is, in O(N) time and O(1) auxiliary space (so you can't just set up a hash table or something to count occurrences of different values). "Moore's Voting Approach" is a good solution (probably the best one) to that worthy interviewing question.
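
    For reference, the voting approach mentioned above (usually called the Boyer-Moore majority vote) can be sketched in Ruby like this. The method name majority_element is an illustrative choice, and "majority" here means strictly more than half of the items.

```ruby
# Sketch of the Boyer-Moore majority vote: O(N) time, O(1) extra space.
# Returns the value occurring more than a.size / 2 times, or nil if none does.
def majority_element(a)
  candidate = nil
  count = 0

  # Pass 1: pair off differing items; only a true majority can survive.
  a.each do |x|
    if count.zero?
      candidate = x
      count = 1
    elsif x == candidate
      count += 1
    else
      count -= 1
    end
  end

  # Pass 2: the survivor is only a candidate, so verify it by counting.
  a.count(candidate) > a.size / 2 ? candidate : nil
end

p majority_element([3, 3, 4, 2, 3, 3, 5])   # => 3
p majority_element([7, 1, 7, 2])            # => nil
```

    The second call shows why this does not directly solve the original question: exactly N/2 occurrences is not a strict majority.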

    Another interesting variation: what if you have 10**12 64-bit numbers (8 Terabytes' worth of data overall, say on a bigtable or clone thereof), and as many machines as you want, each with about 4GB of RAM on a pretty fast LAN, say one that's substantially better than GB ethernet -- how do you shard the problem under those conditions? What if you have to use mapreduce / hadoop? What if you're free to design your own dedicated framework just for this one problem -- could you get better performance than with mapreduce? How much better, at the granularity of back-of-envelope estimation? I know of no published algorithm for THIS variant, so it may be a great test if you want to check general facility of a candidate with highly-distributed approaches to tera-scale computation...

  • 2021-01-29 23:46

    Here is Don Johe's answer in Ruby:

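    #!/usr/bin/ruby1.8

    # Slide a window of three elements over the array and return the first
    # value that occurs twice inside a window.
    def find_repeated_number(a)
      return nil unless a.size >= 3
      (0..a.size - 3).each do |i|
        [
          [0, 1],
          [0, 2],
          [1, 2],
        ].each do |j1, j2|
          return a[i + j1] if a[i + j1] == a[i + j2]
        end
      end
      # Only N = 4 laid out as x, a, b, x escapes every window of three.
      return a.first if a.first == a.last
      nil
    end

    p find_repeated_number([1, 1, 2])   # => 1
    p find_repeated_number([2, 3, 2])   # => 2
    p find_repeated_number([4, 3, 3])   # => 3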
    

    O(n)
