Given an array of 0s and 1s, find the maximum subarray such that the number of 0s and 1s is equal. This needs to be done in O(n) time and O(1) space.
I have an algo whic
A different approach, but still O(n) time and memory. Start with Neil's suggestion: treat 0 as -1.
Notation: A[0, …, N-1] is your array of size N, and f is a function with f(0)=0, f(x)=A[x-1]+f(x-1).
If you plot f, you'll see that what you are looking for are points for which f(m)=f(n), m=n-2k, where k is a positive natural number. More precisely, only for x such that A[x]!=A[x+1] (and the last element in the array) must you check whether f(x) already occurred. Unfortunately, I now see no improvement over having an array B[-N+1…N-1] where such information would be stored.
To complete my thought: B[x]=-1 initially, and B[x]=p when p = min k: f(k)=x. And the algorithm is (double-check it, as I'm very tired):
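fx = 0
B = new array[-N, …, N], every entry -1    # f ranges over -N…N once i runs to N
maxlen = 0
B[0] = 0
for i = 1…N :
    fx = fx + A[i-1]
    if B[fx] == -1 :
        B[fx] = i
    # check every repeated value: skipping positions where A[i-1] == A[i]
    # would miss inputs such as [1, -1, -1]
    else if maxlen < i - B[fx] :
        # A[B[fx], …, i-1] is better than what we found so far
        maxlen = i - B[fx]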
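A minimal Python sketch of the prefix-sum idea above, under a few assumptions of mine: the input is a list of 0s and 1s, a dictionary `first_seen` stands in for the array B, and the function name `longest_balanced_by_prefix` is hypothetical. It checks every repeated prefix value rather than only positions where neighbouring elements differ. It still uses O(n) extra memory, so it does not meet the O(1) space requirement.

```python
def longest_balanced_by_prefix(bits):
    """Return (start, length) of a longest subarray with equal 0s and 1s."""
    first_seen = {0: 0}   # prefix value -> smallest index at which it occurs
    fx = 0                # running prefix sum, with 0 treated as -1
    best_start, best_len = 0, 0
    for i, bit in enumerate(bits, start=1):
        fx += 1 if bit == 1 else -1
        if fx not in first_seen:
            first_seen[fx] = i
        elif i - first_seen[fx] > best_len:
            # bits[first_seen[fx]:i] sums to zero and beats the best so far
            best_start = first_seen[fx]
            best_len = i - best_start
    return best_start, best_len
```

For example, `longest_balanced_by_prefix([1, 0, 0])` reports the subarray starting at index 0 with length 2.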
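Edit: Two bed-thoughts (= figured out while lying in bed :P ):
1) You could binary search the result by the length of the subarray, which would give an O(n log n) time and O(1) memory algorithm. Let's use the function g(x)=x - x mod 2 (because subarrays which sum to 0 are always of even length). Start by checking whether the whole array sums to 0. If yes, we're done; otherwise continue. We now take 0 as the starting point (a subarray of that length trivially has the "summing-to-zero property") and g(N+1) as the ending point (an even length for which we know there's no such subarray, since the whole array does not sum to 0). Caveat: this assumes that a zero-sum subarray of length L implies one of every shorter even length, which does not hold in general. For example, 0,0,0,1,1,0,0,0,0,1,1,0,0,0 has a balanced subarray of length 8 but none of length 6, so the search can return too small a value. Let's do
a = 0
b = g(N+1)
while b - a > 2 :
    c = g((a+b)/2)
    check if there is such subarray of length c in O(n) time
    if yes:
        a = c
    if no:
        b = c
return the result: a (length of maximum subarray)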
Checking for subarray with "summing-to-zero property" of some given length L is simple:
a = 0
b = L
fa = fb = 0
for i=0…L-1:
    fb = fb + A[i]
# fa = f(a), fb = f(b); the window is A[a, …, b-1]
while (fa != fb) and (b<N) :
    fa = fa + A[a]
    fb = fb + A[b]
    a = a + 1
    b = b + 1
if fa != fb:
    not found
else:
    found, starts at a and stops at b-1
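The fixed-length check can be sketched in Python as a sliding window. This is only an illustration under my own assumptions: the input is a list of 0s and 1s, and the function name `find_balanced_window` is hypothetical. It keeps one running sum instead of the two prefix values fa and fb, which is equivalent because only their difference matters.

```python
def find_balanced_window(bits, length):
    """Return the start of a window of `length` items with equal 0s and 1s, or None."""
    n = len(bits)
    if length > n:
        return None
    # Sum of the first window, with 0 treated as -1.
    window = 0
    for k in range(length):
        window += 1 if bits[k] == 1 else -1
    start = 0
    # Slide one step at a time until the window sums to zero or runs out.
    while window != 0 and start + length < n:
        window += 1 if bits[start + length] == 1 else -1   # element entering
        window -= 1 if bits[start] == 1 else -1            # element leaving
        start += 1
    return start if window == 0 else None
```

It uses O(n) time and a constant number of variables per call. On 0,0,0,1,1,0,0,0,0,1,1,0,0,0 it finds a window of length 8 starting at index 3 and no window of length 6.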
2) …can you modify the input array? If yes, and if O(1) memory means exactly that you use no additional space (except for a constant number of elements), then just store your prefix table values in your input array. No more space used (except for some variables) :D
And again, double-check my algorithms, as I'm veeery tired and could've made off-by-one errors.
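For double-checking any candidate algorithm on small inputs, a brute-force reference can help. The following is a sketch of my own, not part of the approaches above; the function name `longest_balanced_bruteforce` is hypothetical. It takes O(n^2) time and O(1) extra space, so it is useful only as a test oracle.

```python
def longest_balanced_bruteforce(bits):
    """Length of the longest subarray with equal 0s and 1s, by trying every start."""
    best = 0
    n = len(bits)
    for start in range(n):
        balance = 0   # (#1s - #0s) in bits[start:end + 1]
        for end in range(start, n):
            balance += 1 if bits[end] == 1 else -1
            if balance == 0:
                best = max(best, end - start + 1)
    return best
```

Comparing its result with a faster algorithm on many short random arrays is a quick way to catch off-by-one errors.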
Like Neil, I find it useful to consider the alphabet {±1} instead of {0, 1}. Assume without loss of generality that there are at least as many +1s as -1s. The following algorithm, which uses O(sqrt(n log n)) bits and runs in time O(n), is due to "A.F."
Note: this solution does not cheat by assuming the input is modifiable and/or has wasted bits. As of this edit, this solution is the only one posted that is both O(n) time and o(n) space.
An easier version, which uses O(n) bits, streams the array of prefix sums and marks the first occurrence of each value. It then scans backward, considering for each height between 0 and sum(arr) the maximal subarray at that height. Some thought reveals that the optimum is among these (remember the assumption). In Python:
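def longest_balanced(arr):
    # arr holds +1/-1 values, with at least as many +1s as -1s (total >= 0)
    total = 0
    min_so_far = 0
    max_so_far = 0
    # is_first[k] stays True when prefix sum k sets a new minimum or maximum,
    # i.e. when it is the first occurrence of its value
    is_first = [True] * (1 + len(arr))
    for i, x in enumerate(arr):
        total += x
        if total < min_so_far:
            min_so_far = total
        elif total > max_so_far:
            max_so_far = total
        else:
            is_first[1 + i] = False

    # advance i to the first prefix whose sum equals the total
    sum_i = 0
    i = 0
    while sum_i != total:
        sum_i += arr[i]
        i += 1
    sum_j = total
    j = len(arr)
    longest = j - i
    # scan backward over the heights total - 1, ..., 0
    for h in range(total - 1, -1, -1):
        # move i back to the first occurrence of height h
        while sum_i != h or not is_first[i]:
            i -= 1
            sum_i -= arr[i]
        # move j back to the last occurrence of height h
        while sum_j != h:
            j -= 1
            sum_j -= arr[j]
        longest = max(longest, j - i)
    return longest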
The trick to get the space down comes from noticing that we're scanning is_first sequentially, albeit in reverse order relative to its construction. Since the loop variables fit in O(log n) bits, we'll compute, instead of is_first, a checkpoint of the loop variables after each O(√(n log n)) steps. This is O(n/√(n log n)) = O(√(n/log n)) checkpoints, for a total of O(√(n log n)) bits. By restarting the loop from a checkpoint, we compute on demand each O(√(n log n))-bit section of is_first.
(P.S.: it may or may not be my fault that the problem statement asks for O(1) space. I sincerely apologize if it was I who pulled a Fermat and suggested that I had a solution to a problem much harder than I thought it was.)
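I have this algorithm running in O(n) time and O(1) space. (It is greedy and not correct for every input: for {0, 1, 1, 0, 1, 1, 0} it reports length 2, while indices 0 to 3 form a balanced subarray of length 4.)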
public static void longestSubArrayWithSameZerosAndOnes() {
    // You are given an array of 1's and 0's only.
    // Find the longest subarray which contains equal number of 1's and 0's
    int[] A = new int[] {1, 0, 1, 1, 1, 0, 0, 0, 1};
    int num0 = 0, num1 = 0;

    // First, calculate how many 0s and 1s in the array
    for(int i = 0; i < A.length; i++) {
        if(A[i] == 0) {
            num0++;
        }
        else {
            num1++;
        }
    }
    if(num0 == 0 || num1 == 0) {
        System.out.println("The length of the sub-array is 0");
        return;
    }

    // Second, check the array to find a continuous "block" that has
    // the same number of 0s and 1s, starting from the HEAD and the
    // TAIL of the array, and moving the 2 "pointer" (HEAD and TAIL)
    // towards the CENTER of the array
    int start = 0, end = A.length - 1;
    while(num0 != num1 && start < end) {
        if(num1 > num0) {
            if(A[start] == 1) {
                num1--; start++;
            }
            else if(A[end] == 1) {
                num1--; end--;
            }
            else {
                num0--; start++;
                num0--; end--;
            }
        }
        else if(num1 < num0) {
            if(A[start] == 0) {
                num0--; start++;
            }
            else if(A[end] == 0) {
                num0--; end--;
            }
            else {
                num1--; start++;
                num1--; end--;
            }
        }
    }
    if(num0 == 0 || num1 == 0) {
        start = end;
        end++;
    }

    // Third, expand the continuous "block" just found at step #2 by
    // moving "HEAD" to head of the array and "TAIL" to the end of
    // the array, while still keeping the "block" balanced (containing
    // the same number of 0s and 1s)
    while(0 < start && end < A.length - 1) {
        if(A[start - 1] == 0 && A[end + 1] == 0 || A[start - 1] == 1 && A[end + 1] == 1) {
            break;
        }
        start--;
        end++;
    }
    System.out.println("The length of the sub-array is " + (end - start + 1) + ", starting from #" + start + " to #" + end);
}
I would argue that no algorithm with O(1) space can exist, as follows. Assume you iterate ONCE over every bit. This requires a counter, which needs O(log n) space. One could possibly argue that n itself is part of the problem instance; then the input length for a binary string of length k is k + log_2 k. Regardless of how you loop over the bits, you need an additional variable, and if you need an index into the array, that alone already makes it non-O(1).
Usually you don't have this problem, because for a problem of size n you have an input of n numbers of size log k each, which adds up to n log k. There, a variable of length log k is just O(1). But here our log k is just 1. So we can only introduce a helper variable of constant length (and I mean really constant: it must be bounded regardless of how big n is).
Here a problem with the problem description becomes visible. In computer theory you have to be very careful about your encoding. E.g. you can make NP problems polynomial if you switch to unary encoding (because then the input size is exponentially bigger than in an n-ary (n>1) encoding).
Since n itself takes only log_2 n bits as input, one must be careful. When you speak of O(n) in this case, this is really an algorithm that is O(2^n) in that input size (this is not a point we need to discuss, because one can argue whether n itself is part of the description or not).