I have been asked this question in a job interview and I have been wondering about the right answer.
You have an array of numbers from 0 to n-1, one o
This can be done in O(n) time and O(1) space. Without modifying the input array
slow = a[0]
fast = a[a[0]]
slow = 0
while(slow != fast){
slow = a[slow];
fast = a[fast];
}
Here's a Java implementation:
class Solution {
public int findDuplicate(int[] nums) {
if(nums.length <= 1) return -1;
int slow = nums[0], fast = nums[nums[0]]; //slow = head.next, fast = head.next.next
while(slow != fast){ //check for loop
slow = nums[slow];
fast = nums[nums[fast]];
}
if(slow != fast) return -1;
slow = 0; //reset one pointer
while(slow != fast){ //find starting point of loop
slow = nums[slow];
fast = nums[fast];
}
return slow;
}
}
This is an alternative solution in O(n)
time and O(1)
space. It is similar to rici's. I find it a bit easier to understand but, in practice, it will overflow faster.
Let X
be the missing number and R
be the repeated number.
We can assume the numbers are from [1..n]
, i.e. zero does not appear. In fact, while looping through the array, we can test if zero was found and return immediately if not.
Now consider:
sum(A) = n (n + 1) / 2 - X + R
product(A) = n! R / X
where product(A)
is the product of all element in A
skipping the zero. We have two equations in two unknowns from which X
and R
can be derived algebraically.
Edit: by popular demand, here is a worked-out example:
Let's set:
S = sum(A) - n (n + 1) / 2
P = n! / product(A)
Then our equations become:
R - X = S
X = R P
which can be solved to:
R = S / (1 - P)
X = P R = P S / (1 - P)
Example:
A = [0 1 2 2 4]
n = A.length - 1 = 4
S = (1 + 2 + 2 + 4) - 4 * 5 / 2 = -1
P = 4! / (1 * 2 * 2 * 4) = 3 / 2
R = -1 / (1 - 3/2) = -1 / -1/2 = 2
X = 3/2 * 2 = 3
int[] a = {5, 6, 8, 9, 3, 4, 2, 9 };
int[] b = {5, 6, 8, 9, 3, 6, 1, 9 };
for (int i = 0; i < a.Length; i++)
{
if (a[i] != b[i])
{
Console.Write("Original Array manipulated at position {0} + "\t\n"
+ "and the element is {1} replaced by {2} ", i,
a[i],b[i] + "\t\n" );
break;
}
}
Console.Read();
///use break if want to check only one manipulation in original array.
///If want to check more then one manipulation in original array, remove break
I suggest using a BitSet. We know N is small enough for array indexing, so the BitSet will be of reasonable size.
For each element of the array, check the bit corresponding to its value. If it is already set, that is the duplicate. If not, set the bit.
You can do it in O(N) time without any extra space. Here is how the algorithm works :
Iterate through array in the following manner :
For each element encountered, set its corresponding index value to negative. Eg : if you find a[0] = 2. Got to a[2] and negate the value.
By doing this you flag it to be encountered. Since you know you cannot have negative numbers, you also know that you are the one who negated it.
Check if index corresponding to the value is already flagged negative, if yes you get the duplicated element. Eg : if a[0]=2 , go to a[2] and check if it is negative.
Lets say you have following array :
int a[] = {2,1,2,3,4};
After first element your array will be :
int a[] = {2,1,-2,3,4};
After second element your array will be :
int a[] = {2,-1,-2,3,4};
When you reach third element you go to a[2] and see its already negative. You get the duplicate.
@rici is right about the time and space usage: "This can be done in O(n) time and O(1) space."
However, the question can be expanded to broader requirement: it's not necessary that there is only one duplicate number, and numbers might not be consecutive.
OJ puts it this way here: (note 3 apparently can be narrowed)
Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.
Note:
- You must not modify the array (assume the array is read only).
- You must use only constant, O(1) extra space.
- Your runtime complexity should be less than O(n2).
- There is only one duplicate number in the array, but it could be repeated more than once.
The question is very well explained and answered here by Keith Schwarz, using Floyd's cycle-finding algorithm:
The main trick we need to use to solve this problem is to notice that because we have an array of n elements ranging from 0 to n - 2, we can think of the array as defining a function f from the set {0, 1, ..., n - 1} onto itself. This function is defined by f(i) = A[i]. Given this setup, a duplicated value corresponds to a pair of indices i != j such that f(i) = f(j). Our challenge, therefore, is to find this pair (i, j). Once we have it, we can easily find the duplicated value by just picking f(i) = A[i].
But how are we to find this repeated value? It turns out that this is a well-studied problem in computer science called cycle detection. The general form of the problem is as follows. We are given a function f. Define the sequence x_i as
x_0 = k (for some k)
x_1 = f(x_0)
x_2 = f(f(x_0))
...
x_{n+1} = f(x_n)
Assuming that f maps from a domain into itself, this function will have one of three forms. First, if the domain is infinite, then the sequence could be infinitely long and nonrepeating. For example, the function f(n) = n + 1 on the integers has this property - no number is ever duplicated. Second, the sequence could be a closed loop, which means that there is some i so that x_0 = x_i. In this case, the sequence cycles through some fixed set of values indefinitely. Finally, the sequence could be "rho-shaped." In this case, the sequence looks something like this:
x_0 -> x_1 -> ... x_k -> x_{k+1} ... -> x_{k+j}
^ |
| |
+-----------------------+
That is, the sequence begins with a chain of elements that enters a cycle, then cycles around indefinitely. We'll denote the first element of the cycle that is reached in the sequence the "entry" of the cycle.
An python implementation can also be found here:
def findDuplicate(self, nums):
# The "tortoise and hare" step. We start at the end of the array and try
# to find an intersection point in the cycle.
slow = 0
fast = 0
# Keep advancing 'slow' by one step and 'fast' by two steps until they
# meet inside the loop.
while True:
slow = nums[slow]
fast = nums[nums[fast]]
if slow == fast:
break
# Start up another pointer from the end of the array and march it forward
# until it hits the pointer inside the array.
finder = 0
while True:
slow = nums[slow]
finder = nums[finder]
# If the two hit, the intersection index is the duplicate element.
if slow == finder:
return slow