Design an algorithm to find all pairs of integers within an array which sum to a specified value.
I have tried this problem using a hash
I don't see why the hash table approach is inefficient, at least in algorithm analysis terms - in memory locality terms admittedly, it can be quite bad. Anyway, scan the array twice...
First scan - put all the array elements in the hash table - O(n) total. Individual inserts are only amortized O(1), but an amortized bound is a guarantee on the total cost of the whole sequence of operations, so the O(n) total for n inserts is a plain bound - not an amortized one.
Second scan - check for (sum - current) in the hash table - O(n) total.
This beats the O(n log n) sort-and-search methods, at least in theory.
Then, note that you can combine the two scans into one. You can spot a pair as soon as you encounter the second of that pair during the first scan. In pseudocode...
for i in array.range
    diff = sum - array [i]
    if hashset.includes (diff)
        output diff, array [i]
    hashset.insert (array [i])
If you need positions of the items, use a hashmap and store item positions in it. If you need to cope with duplicates, you might need to store counts in a hashmap. For positions and duplicates, you might need a hashmap of start pointers for linked lists of positions.
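As a rough illustration of the positions-plus-duplicates case, here is a minimal sketch. It assumes a `Map` from each value to a plain array of indices (standing in for the linked lists mentioned above), and the function name `pairPositions` is made up for this example. Note that with many duplicates the number of reported pairs, and so the running time, can grow to O(n^2).

```javascript
// Reports every index pair [i, j] with i < j and arr[i] + arr[j] === target.
function pairPositions(arr, target) {
  const positions = new Map(); // value -> indices at which it has been seen so far
  const result = [];
  arr.forEach((value, j) => {
    // Look up the complement before recording the current element,
    // so an element is never paired with itself.
    for (const i of positions.get(target - value) || []) {
      result.push([i, j]);
    }
    if (!positions.has(value)) positions.set(value, []);
    positions.get(value).push(j);
  });
  return result;
}
```

For example, `pairPositions([1, 3, 1, 3], 4)` pairs each 1 with each 3, in the order the second element of each pair is encountered.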
This makes assumptions about the hash table implementation, but fairly safe ones given the usual implementations in most current languages and libraries.
BTW - combining the scans shouldn't be seen as an optimisation. The iteration overhead should be insignificant. Memory locality issues could make a single pass slightly more efficient for very large arrays, but the real memory locality issues will be in the hashtable lookups anyway.
IMO the only real reason to combine the scans is because you only want each pair reported once - handling that in a two-scan approach would be a bit more hassle.
If the array is sorted:
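Let i = 0, j = the last index of the array, sum = the value you are looking for, then repeat while i < j:
If array[i] + array[j] = sum, then output (array[i], array[j]) and move both i and j inward one position.
If array[i] + array[j] < sum, then move i to the right one position.
If array[i] + array[j] > sum, then move j to the left one position.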
Time complexity: O(n). Space complexity: O(1).
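The two-pointer walk above could be sketched roughly as follows. This assumes the input is already sorted in ascending order and that each element may be used in at most one pair; the name `pairsInSorted` is just for illustration.

```javascript
// Returns value pairs [a, b] from an ascending-sorted array with a + b === target.
function pairsInSorted(sorted, target) {
  const result = [];
  let i = 0;
  let j = sorted.length - 1;
  while (i < j) {
    const s = sorted[i] + sorted[j];
    if (s === target) {
      result.push([sorted[i], sorted[j]]);
      i++; // both elements are used up, so move both pointers inward
      j--;
    } else if (s < target) {
      i++; // sum too small: a larger left element is needed
    } else {
      j--; // sum too large: a smaller right element is needed
    }
  }
  return result;
}
```

With many equal values this reports one pair per matched pair of elements, not every combination; counting all combinations needs extra handling of runs of duplicates.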
If the array is not sorted, there are a few ways to approach this problem:
Sort the array and then use the above approach.
HashMap:
Store all elements in a HashMap.
a+b=sum, so b=sum-a. For each element a of the array, look up b in the HashMap.
A HashMap lookup takes O(1) on average.
Time complexity: O(n). Space complexity: O(n).
BitMap:
Iterate through the input to create a bitmap where each bit corresponds to an element value. Say the input is {2,5,8}, then we toggle the bitmap's bits at indices 2, 5 and 8 from 0 to 1. This takes O(1) per element, thus O(n) in total.
Go through the input again. We know b=sum-a, so for every element a in the input, look up its b, which can be done in O(1) since it's a bitmap index. This also takes O(n) in total.
Time complexity: O(n) + O(n) = O(n), plus O(m) to allocate and clear the bitmap. Space complexity: bitmap space = O(m), where m is the range of possible element values, not the number of elements. This only works when the values are non-negative integers (or can be shifted to be) and m is small enough to allocate.
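A rough sketch of the bitmap idea, assuming all inputs are non-negative integers no larger than a caller-supplied `maxValue` (that parameter and the function name `pairsViaBitmap` are assumptions for this example). Because one bit only records presence, it cannot tell whether a value occurs twice, so the sketch reports only pairs of distinct values a < b.

```javascript
// Returns distinct value pairs [a, b] with a < b and a + b === target.
function pairsViaBitmap(values, target, maxValue) {
  const bits = new Uint32Array((maxValue >>> 5) + 1); // 32 values per word
  const isSet = (v) =>
    v >= 0 && v <= maxValue && (bits[v >>> 5] & (1 << (v & 31))) !== 0;

  // First pass: set the bit for every value present.
  for (const v of values) bits[v >>> 5] |= 1 << (v & 31);

  // Second pass: look up the complement of each value.
  const result = [];
  for (const a of values) {
    const b = target - a;
    if (a < b && isSet(a) && isSet(b)) {
      result.push([a, b]);
      bits[a >>> 5] &= ~(1 << (a & 31)); // clear a so repeats are not reported again
    }
  }
  return result;
}
```

For the input {2,5,8} from the text, `pairsViaBitmap([2, 5, 8], 10, 8)` finds the pair 2 and 8, and does not pair 5 with itself.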
Here is a solution which takes into account duplicate entries. It is written in JavaScript and assumes the array is sorted. The solution runs in O(n) time and does not use any extra memory aside from a few variables.
var count_pairs = function(_arr,x) {
    if(!x) x = 0;
    var pairs = 0;
    var i = 0;
    var k = _arr.length-1;
    if((k+1)<2) return pairs;
    var halfX = x/2;
    while(i<k) {
        var curK = _arr[k];
        var curI = _arr[i];
        var pairsThisLoop = 0;
        if(curK+curI==x) {
            // if midpoint and equal find combinations
            if(curK==curI) {
                var comb = 1;
                while(--k>=i) pairs+=(comb++);
                break;
            }
            // count pair and k duplicates
            pairsThisLoop++;
            while(_arr[--k]==curK) pairsThisLoop++;
            // add k side pairs to running total for every i side pair found
            pairs+=pairsThisLoop;
            while(_arr[++i]==curI) pairs+=pairsThisLoop;
        } else {
            // if we are at a mid point
            if(curK==curI) break;
            var distK = Math.abs(halfX-curK);
            var distI = Math.abs(halfX-curI);
            if(distI > distK) while(_arr[++i]==curI);
            else while(_arr[--k]==curK);
        }
    }
    return pairs;
}
So here it is for everyone.
Start at both ends of the array and slowly work your way inwards, making sure to count duplicates if they exist.
It only counts pairs but can be reworked to
Enjoy, and don't forget to bump it if it's the best answer!
A solution that takes into account duplicates and uses every number only one time:
void printPairs(int[] numbers, int S) {
    // toMap(numbers) converts the numbers array to a map, where
    //   Key is a number from the original array
    //   Value is a count of occurrences of this number in the array
    Map<Integer, Integer> numbersMap = toMap(numbers);
    for (Map.Entry<Integer, Integer> entry : numbersMap.entrySet()) {
        int number = entry.getKey();
        int count = entry.getValue();
        int complement = S - number;
        if (count == 0 || !numbersMap.containsKey(complement)) {
            continue;
        }
        // Fix the number of pairs up front, because the counts change as pairs are consumed.
        // A number that is its own complement needs two copies per pair.
        int pairs = (number == complement)
                ? count / 2
                : Math.min(count, numbersMap.get(complement));
        for (int j = 0; j < pairs; j++) {
            System.out.println(number + ", " + complement);
        }
        // Consume the used occurrences so the complement's own entry does not report them again.
        // put() on an existing key is not a structural modification, so it is safe while iterating.
        if (number == complement) {
            numbersMap.put(number, count - 2 * pairs);
        } else {
            numbersMap.put(number, count - pairs);
            numbersMap.put(complement, numbersMap.get(complement) - pairs);
        }
    }
}
You don't even need to store all the elements in hashmap, and then scan. You can scan during the first iteration itself.
void foo(int[] A, int sum) {
    HashSet<Integer> set = new HashSet<Integer>();
    for (int e : A) {
        if (set.contains(sum-e)) {
            System.out.println(e + "," + (sum-e));
            // deal with the duplicated case
            set.remove(sum-e);
        } else {
            set.add(e);
        }
    }
}
If you don't mind spending O(M) in space, where M is the sum you are seeking, you can do this in O(N + M) time, provided the values are non-negative integers. In a single pass over the N input values, set sums[v] = 1 for each value v with v <= M, then check (sums[i] && sums[M-i]) for each i in a single pass over the range 0 to M/2.
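A minimal sketch of this table-based approach, under the assumptions that the inputs are non-negative integers and M is a non-negative integer small enough to allocate an array for. The name `pairsBySumTable` is made up here, and the flag array is called `present` and not `sums`. Since a flag only records presence, the case i = M/2 (a value paired with a second copy of itself) is skipped instead of guessed at.

```javascript
// Returns distinct value pairs [i, M - i] with i < M - i, both present in values.
function pairsBySumTable(values, M) {
  const present = new Uint8Array(M + 1); // present[v] === 1 if v occurs in the input
  // Pass over the N inputs: values above M can never be part of a pair, so ignore them.
  for (const v of values) {
    if (v >= 0 && v <= M) present[v] = 1;
  }
  // Pass over the lower half of the range 0..M.
  const result = [];
  for (let i = 0; 2 * i < M; i++) {
    if (present[i] && present[M - i]) result.push([i, M - i]);
  }
  return result;
}
```

Unlike the hash-based versions, the cost here depends on M as well as N, so it only pays off when the target sum is not much larger than the input size.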