I had an interview the other day with Amazon, and a question they asked me was pertaining to the following problem.
Given 2 integer arrays, containing any number of elem
One possible answer is similar to the HashMap
solution... IF you know that the integers are within a very small window. It would be similar to this: http://en.wikipedia.org/wiki/Bucket_sort
Basically, if the integers are guaranteed to be within a certain constant size window (i.e. all of them are 1-1000) then you can do it in constant space by incrementing each cell of index = whatever your number is. This is exactly the same as the HashMap
solution, except you don't need to be able to account for all possible integers like a HashMap
can, which lets you save on space. If this is unclear let me know in the comments and I'll explain further.
I believe this is possible to do in-place with O(1)
extra space. I make use of the additional assumption that the elements in the arrays are mutable as well as swappable, but I believe with careful accounting the mutability assumption could be removed for this particular problem.
The basic idea is to do in-place hashing. In-place hashing may be implemented by partitioning the array around a suitable percentile, say the 90th, using the O(n)
median-of-medians selection algorithm. This divides the array into a small portion (about 10%) and a large portion (about 90%) whose elements are distinguishable from each other (less than the partition element or not). You can then hash from the 10% portion into the 90% portion by swapping. This hashing can be used to detect duplicates. This is O(n)
for each processing of 10% of the array, so done 10 times is still O(n)
. I described this in much more detail, though with some hand-waving I mean to correct one day, over at this related question..
For this particular problem, you need to do in-place hashing 3 times. First on each individual array to remove duplicates. Then, on a wrapper representing the combined arrays (if index is less than length of array 1, index into array 1, else index into array 2) to report duplicates.
There is no O(1) space method for finding the intersection of two unsorted sets in O(n) time.
For a data type with an unlimited range, the minimum sorting price is O(n ln n).
For a data type with a limited range radix sort provides the ability to do an in-place radix sort in O(n ln n' n") time, where n is the size of the data, n' is the number of values that can be represented, and n" has to do with the cost of checking whether two values are in the same radix group. The n" time price can be dropped in return for an O(ln n) space price.
In the special case of 32-bit integers, n' is 2^32 and n" is 1, so this would collapse to O(n) and provide a winning solution for multi-billion record sets.
For integers of unlimited size, n' and n" preclude an O(n) time solution via radix.
The key is to sort the two arrays in-place. I did a search for "in-place radix sort", and found In-Place Radix Sort. I believe the problem is solvable, at least for Java int[], by applying those ideas to sort each array, bit by bit, then doing the obvious scan.
Incidentally, I think the correct output for the problem in the question code is 1, 2, 3.
Here is my implementation, based on the answers to the referenced question:
public class ArrayMatch {
public static void main(String[] args) {
int[] a = { 4, 1, 2, 3, 4 };
int[] b = { 2, 5, 6, 7, 3, 2, 2, 2, 2, 1, 2, 2, 2, 2 };
System.out.print("Original problem");
printMatches(a, b);
System.out.println();
int[] a1 = { 4, 1, -1234, 2, 3, 4, Integer.MIN_VALUE };
int[] b1 = { -1234, 2, 5, 6, 7, 3, 2, 2, 2, 2, 1, 2, 2, 2, 2 , Integer.MIN_VALUE, Integer.MAX_VALUE};
System.out.print("With negatives");
printMatches(a1, b1);
System.out.println();
}
// Print all matching elements between the two arrays.
private static void printMatches(int[] a, int[] b) {
if (a.length == 0 || b.length == 0) {
return;
}
sort(a);
sort(b);
int i = 0;
int j = 0;
while (true) {
while (a[i] < b[j]) {
i++;
if (i == a.length) {
return;
}
}
while (a[i] > b[j]) {
j++;
if (j == b.length) {
return;
}
}
if (a[i] == b[j]) {
System.out.print(" " + a[i]);
do {
i++;
} while (i < a.length && a[i - 1] == a[i]);
do {
j++;
} while (j < b.length && b[j - 1] == b[j]);
}
if (i == a.length || j == b.length) {
return;
}
}
}
// In place radix sort.
private static void sort(int[] in) {
// Flip the sign bit to regularize the sort order
flipBit(in, 31);
sort(in, 0, in.length, 31);
// Flip back the sign bit back to restore 2's complement
flipBit(in, 31);
}
/**
* Sort a subarray, elements start through end-1 of in, according to the
* values in firstBit through 0.
*
* @param in
* @param start
* @param end
* @param firstBit
*/
private static void sort(int[] in, int start, int end, int firstBit) {
if (start == end) {
return;
}
int mask = 1 << firstBit;
int zeroCount = 0;
for (int i = start; i < end; i++) {
if ((in[i] & mask) == 0) {
zeroCount++;
}
}
int elements = end - start;
int nextZeroIndex = start;
int nextOneIndex = start + zeroCount;
int split = nextOneIndex;
if (zeroCount > 0 && zeroCount < elements) {
while (nextZeroIndex < split) {
if ((in[nextZeroIndex] & mask) != 0) {
// Found a one bit in the zero area, look for its partner in the one
// area
while ((in[nextOneIndex] & mask) != 0) {
nextOneIndex++;
}
int temp = in[nextZeroIndex];
in[nextZeroIndex] = in[nextOneIndex];
in[nextOneIndex] = temp;
nextOneIndex++;
}
nextZeroIndex++;
}
}
if (firstBit > 0) {
sort(in, start, split, firstBit - 1);
sort(in, split, end, firstBit - 1);
}
}
private static void flipBit(int[] in, int bitNo) {
int mask = 1 << bitNo;
for (int i = 0; i < in.length; i++) {
in[i] ^= mask;
}
}
}