How can I find a number which occurs an odd number of times in a SORTED array in O(n) time?

I have a question and I tried to think over it again and again... but got nothing so posting the question here. Maybe I could get some view-point of others, to try and make it w

15 Answers

情书的邮戳

2021-01-30 11:35

You can create a cumulative (counting) array that records how many times each number occurs, and then scan that array for the element whose count is odd. Example:

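int[] a = {2, 3, 4, 2, 3, 1, 4, 5, 6, 5, 6, 7, 1};
// Counting array; assumes every value lies in [0, 1000).
// Java zero-initializes int arrays, so no explicit clearing loop is needed.
int[] b = new int[1000];
for (int i = 0; i < a.length; i++) {
    b[a[i]]++;
}
for (int i = 0; i < b.length; i++) {
    // An odd count is never zero, so a separate zero check is unnecessary.
    if (b[i] % 2 != 0) {
        System.out.println(i);
        break;
    }
}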

日久生厌

2021-01-30 11:37
Theorem: Every deterministic algorithm for this problem probes Ω(log² n) memory locations in the worst case.

Proof (completely rewritten in a more formal style):

Let k > 0 be an odd integer and let n = k². We describe an adversary that forces (log₂ (k + 1))² = Ω(log² n) probes.

We call the maximal subsequences of identical elements groups. The adversary's possible inputs consist of k length-k segments x₁ x₂ … x_k. For each segment x_j, there exists an integer b_j ∈ [0, k] such that x_j consists of b_j copies of j - 1 followed by k - b_j copies of j. Each group overlaps at most two segments, and each segment overlaps at most two groups.
```
Group boundaries
|   |     |   |   |
 0 0 1 1 1 2 2 3 3
|     |     |     |
Segment boundaries
```
Wherever there is an increase of two, we assume a double boundary by convention.
```
Group boundaries
|     ||       |   |
 0 0 0  2 2 2 2 3 3
```
Claim: The location of the j^th group boundary (1 ≤ j ≤ k) is uniquely determined by the segment x_j.

Proof: It's just after the ((j - 1) k + b_j)^th memory location, and x_j uniquely determines b_j. //
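The segment construction and the boundary formula can be checked on small cases with a sketch like the following. The class and method names (`AdversaryInput`, `build`, `boundary`) are hypothetical, and the 0-based array `b` stands in for b_1 … b_k.

```java
import java.util.Arrays;

class AdversaryInput {
    // Builds x_1 x_2 ... x_k, where segment x_j is b[j-1] copies of j-1
    // followed by k - b[j-1] copies of j (b is a 0-based stand-in for b_1..b_k).
    static int[] build(int k, int[] b) {
        int[] input = new int[k * k];
        for (int j = 1; j <= k; j++) {
            int start = (j - 1) * k;
            for (int t = 0; t < k; t++) {
                input[start + t] = (t < b[j - 1]) ? j - 1 : j;
            }
        }
        return input;
    }

    // Number of memory locations that precede the j-th group boundary.
    static int boundary(int k, int[] b, int j) {
        return (j - 1) * k + b[j - 1];
    }

    public static void main(String[] args) {
        int[] b = {2, 2, 1};
        System.out.println(Arrays.toString(build(3, b)));
        for (int j = 1; j <= 3; j++) {
            System.out.println("boundary " + j + " is just after location " + boundary(3, b, j));
        }
    }
}
```

With k = 3, the choice b = {2, 2, 1} reproduces the first diagram (0 0 1 1 1 2 2 3 3), and b = {3, 0, 1} reproduces the double-boundary diagram (0 0 0 2 2 2 2 3 3).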

We say that the algorithm has observed the j^th group boundary in case the results of its probes of x_j uniquely determine x_j. By convention, the beginning and the end of the input are always observed. It is possible for the algorithm to uniquely determine the location of a group boundary without observing it.
```
Group boundaries
|   X   |   |     |
 0 0 ? 1 2 2 3 3 3
|     |     |     |
Segment boundaries
```
Given only 0 0 ?, the algorithm cannot tell for sure whether ? is a 0 or a 1. In context, however, ? must be a 1, as otherwise there would be three odd groups, and the group boundary at X can be inferred. These inferences could be problematic for the adversary, but it turns out that they can be made only after the group boundary in question is "irrelevant".

Claim: At any given point during the algorithm's execution, consider the set of group boundaries that it has observed. Exactly one consecutive pair is at odd distance, and the odd group lies between them.

Proof: Every other consecutive pair bounds only even groups. //

Define the odd-length subsequence bounded by the special consecutive pair to be the relevant subsequence.

Claim: No group boundary in the interior of the relevant subsequence is uniquely determined. If there is at least one such boundary, then the identity of the odd group is not uniquely determined.

Proof: Without loss of generality, assume that each memory location not in the relevant subsequence has been probed and that each segment contained in the relevant subsequence has exactly one location that has not been probed. Suppose that the j^th group boundary (call it B) lies in the interior of the relevant subsequence. By hypothesis, the probes to x_j determine B's location up to two consecutive possibilities. We call the one at odd distance from the left observed boundary odd-left and the other odd-right. For both possibilities, we work left to right and fix the location of every remaining interior group boundary so that the group to its left is even. (We can do this because they each have two consecutive possibilities as well.) If B is at odd-left, then the group to its left is the unique odd group. If B is at odd-right, then the last group in the relevant subsequence is the unique odd group. Both are valid inputs, so the algorithm has uniquely determined neither the location of B nor the odd group. //

Example:
```
Observed group boundaries; relevant subsequence marked by […]
[             ]   |
 0 0 Y 1 1 Z 2 3 3
|     |     |     |
Segment boundaries

Possibility #1: Y=0, Z=2
Possibility #2: Y=1, Z=2
Possibility #3: Y=1, Z=1
```
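The three possibilities in this example can be confirmed by brute force. The sketch below (class and method names are hypothetical) tries every combination of Y ∈ {0, 1} and Z ∈ {1, 2} and counts the odd-length groups; only combinations with exactly one odd group are valid inputs.

```java
class RelevantSubsequenceExample {
    // Counts the maximal runs of equal values (groups) that have odd length.
    static int oddGroups(int[] a) {
        int count = 0;
        int runStart = 0;
        for (int i = 1; i <= a.length; i++) {
            if (i == a.length || a[i] != a[runStart]) {
                if ((i - runStart) % 2 != 0) {
                    count++;
                }
                runStart = i;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int y = 0; y <= 1; y++) {
            for (int z = 1; z <= 2; z++) {
                int[] a = {0, 0, y, 1, 1, z, 2, 3, 3};
                System.out.println("Y=" + y + ", Z=" + z + ": " + oddGroups(a) + " odd group(s)");
            }
        }
    }
}
```

The combination Y=0, Z=1 yields three odd groups and is therefore ruled out, while each of the three listed possibilities yields exactly one odd group, and a different one each time (0, 1, and 2 respectively).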
As a consequence of this claim, the algorithm, regardless of how it works, must narrow the relevant subsequence to one group. By definition, it therefore must observe some group boundaries. The adversary now has the simple task of keeping open as many possibilities as it can.

At any given point during the algorithm's execution, the adversary is internally committed to one possibility for each memory location outside of the relevant subsequence. At the beginning, the relevant subsequence is the entire input, so there are no initial commitments. Whenever the algorithm probes an uncommitted location of x_j, the adversary must commit to one of two values: j - 1 or j. If it can avoid letting the j^th boundary be observed, it chooses a value that leaves at least half of the remaining possibilities (with respect to observation). Otherwise, it chooses so as to keep at least half of the groups in the relevant subsequence and commits values for the others.

In this way, the adversary forces the algorithm to observe at least log₂ (k + 1) group boundaries, and in observing the j^th group boundary, the algorithm is forced to make at least log₂ (k + 1) probes.

Extensions:

This result extends straightforwardly to randomized algorithms by randomizing the input, replacing "at best halved" (from the algorithm's point of view) with "at best halved in expectation", and applying standard concentration inequalities.

It also extends to the case where no group can be larger than s copies; in this case the lower bound is Ω(log n log s).

猫巷女王i

2021-01-30 11:37
Look at the middle element of the array. With a couple of appropriate binary searches, you can find its first and last appearances in the array. E.g., if the middle element is 'a', you need to find i and j as shown below:
```
[* * * * a a a a * * *]
         ^     ^ 
         |     |
         |     |
         i     j
```
Is j - i an even number? Then 'a' occurs j - i + 1 times, which is odd, and you are done! Otherwise (and this is the key here), the question to ask is: is i an even or an odd number? Do you see what this piece of knowledge implies? Then the rest is easy.
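One way to carry the hint through is sketched below, assuming 0-based indices, a sorted array, and exactly one value that occurs an odd number of times. The names `OddOccurrence`, `find`, `lowerBound`, and `upperBound` are hypothetical. The idea: if i is odd, the i elements before the first 'a' cannot all belong to even groups, so the answer lies to the left; if i is even, it lies to the right of j.

```java
class OddOccurrence {
    // First index in [lo, hi) whose value is >= target (hi if there is none).
    static int lowerBound(int[] a, int lo, int hi, int target) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] < target) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // First index in [lo, hi) whose value is > target (hi if there is none).
    static int upperBound(int[] a, int lo, int hi, int target) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] <= target) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Assumes a is sorted and exactly one value occurs an odd number of times.
    static int find(int[] a) {
        int lo = 0, hi = a.length;                      // search window [lo, hi)
        while (true) {
            int mid = lo + (hi - lo) / 2;
            int value = a[mid];
            int i = lowerBound(a, lo, hi, value);       // first occurrence
            int end = upperBound(a, lo, hi, value);     // one past the last occurrence j
            if ((end - i) % 2 != 0) {
                return value;                           // j - i is even, so the count is odd
            }
            if (i % 2 != 0) {
                hi = i;                                 // odd prefix length: answer is on the left
            } else {
                lo = end;                               // even prefix length: answer is on the right
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(find(new int[]{1, 1, 2, 2, 3, 3, 3}));
    }
}
```

Each round discards at least half of the window and costs two binary searches, so this sketch makes O(log² n) probes in the worst case.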