Given an array of integers, find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b].
Here is an O(n^2) solution for the same:
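The asker's own code is not reproduced in this thread. Purely as a baseline for comparing the answers, a brute-force O(n^2) count might be sketched as follows. The function name, the inclusive bounds, and the choice to count each pair of positions i < j once are all assumptions, not taken from the question.

```python
from itertools import combinations

def count_pairs_brute_force(values, a, b):
    """Count pairs of positions i < j with a <= values[i] + values[j] <= b.

    Hypothetical baseline: checks every pair, so it is O(n^2).
    """
    return sum(1 for x, y in combinations(values, 2) if a <= x + y <= b)
```

It is slow for large inputs, but it is handy for checking faster solutions on small arrays.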
I believe this is a simple math problem that can be solved with numpy, with no loops and no sorting on our part.
It builds a full N x N matrix of pairwise sums, so I believe the complexity is O(N^2) in the worst case, in both time and memory (would love some confirmation on that from someone more knowledgeable about time complexities in numpy).
At any rate, here's my solution:
import numpy as np

def count_pairs(input_array, min, max):
    A = np.array(input_array)
    A_ones = np.ones((len(A),len(A)))
    A_matrix = A*A_ones
    result = np.transpose(A_matrix) + A_matrix
    result = np.triu(result,0)
    np.fill_diagonal(result,0)
    count = ((result > min) & (result < max)).sum()
    return count
Now let's walk through it - first I just create a matrix with columns representing our numbers:
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
Let's assume that our input array looks like [1,1,2,2,3,-1]. Then this should be the value of A_matrix at this point:
[[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]]
If I add that to the transpose of itself...
result = np.transpose(A_matrix) + A_matrix
...I should get a matrix representing all combinations of sums of pairs:
[[ 2. 2. 3. 3. 4. 0.]
[ 2. 2. 3. 3. 4. 0.]
[ 3. 3. 4. 4. 5. 1.]
[ 3. 3. 4. 4. 5. 1.]
[ 4. 4. 5. 5. 6. 2.]
[ 0. 0. 1. 1. 2. -2.]]
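As a quick check of the matrix above, the same table of pairwise sums can be built in one step. This is only a sketch: np.add.outer is a swapped-in alternative to the ones-matrix-plus-transpose construction used in this answer, not part of the original solution.

```python
import numpy as np

A = np.array([1, 1, 2, 2, 3, -1])

# Construction from the answer: repeat A as rows, then add the transpose.
A_matrix = A * np.ones((len(A), len(A)))
via_transpose = np.transpose(A_matrix) + A_matrix

# Assumed alternative: entry [i, j] of np.add.outer(A, A) is A[i] + A[j].
via_outer = np.add.outer(A, A)

# Both give the same sums (one as floats, the other as integers).
assert np.array_equal(via_transpose, via_outer)
```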
Of course, this matrix is mirrored across the diagonal because the pairs (1,2) and (2,1) produce the same result. We don't want to consider these duplicate entries. We also don't want to consider the sum of an item with itself, so let's sanitize our array:
result = np.triu(result,0)
np.fill_diagonal(result,0)
Our result now looks like:
[[ 0. 2. 3. 3. 4. 0.]
[ 0. 0. 3. 3. 4. 0.]
[ 0. 0. 0. 4. 5. 1.]
[ 0. 0. 0. 0. 5. 1.]
[ 0. 0. 0. 0. 0. 2.]
[ 0. 0. 0. 0. 0. 0.]]
All that remains is to count the items that pass our criteria.
count = ((result > min) & (result < max)).sum()
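This method won't work if 0 is in the acceptable range, because the entries that were zeroed out would be counted as well, but I'm sure it would be trivial to manipulate the result matrix above to convert those 0's to some other meaningless number. Note also that the comparison is strict, so use >= and <= if the endpoints of [a,b] should be included.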
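One possible way around the zero problem, sketched here rather than taken from the original answer, is to pick out the entries above the diagonal by index instead of overwriting the rest with 0. The name count_pairs_masked and the parameters lo and hi are hypothetical, and the bounds are assumed to be inclusive to match the range [a,b] in the question.

```python
import numpy as np

def count_pairs_masked(input_array, lo, hi):
    """Count pairs i < j with lo <= A[i] + A[j] <= hi (inclusive bounds assumed)."""
    A = np.asarray(input_array)
    # sums[i, j] == A[i] + A[j], built by broadcasting a column against a row.
    sums = A[:, None] + A[None, :]
    # Indices strictly above the diagonal: each pair once, no self-pairs.
    rows, cols = np.triu_indices(len(A), k=1)
    pair_sums = sums[rows, cols]
    return int(((pair_sums >= lo) & (pair_sums <= hi)).sum())
```

Because no placeholder value is written into the matrix, a range that contains 0 is handled like any other. The memory use is still O(N^2).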
Sort the array first, then count the pairs with two indexes. The two-index approach is similar to the one used in the 2-sum problem, and it avoids running a binary search N times. The total running time is the sort complexity plus O(N); sorting is typically O(N log N), so the whole approach is O(N log N). The idea is, for each index i, to find a lower bound low and an upper bound high such that a <= arr[i]+arr[j] <= b for every j with low < j <= high. Because the array is sorted, arr[i] can only grow as i increases, so low and high are decreased to keep the condition holding. To avoid counting the same pair twice, we keep low >= i, so every counted index j > low lies strictly above i, and we also keep low <= high. The counting step below is O(N) because each pass through the while loop only performs ++i, --low or --high, and each of those can happen at most N times.
#include <algorithm>  // std::max

// Count pairs (i, j), i < j, whose sum is in [a, b].
// arr must be sorted in ascending order and hold `size` integers.
int countPair(int arr[], int size, int a, int b) {
    int cnt = 0;
    int i = 0, low = size - 1, high = size - 1;
    while (i < high) {
        // Find the lower bound such that arr[i] + arr[low] < a,
        // while arr[i] + arr[low+1] >= a.
        low = std::max(i, low);
        while (low > i && arr[i] + arr[low] >= a) --low;
        // Find the upper bound such that arr[i] + arr[high] <= b,
        // while arr[i] + arr[high+1] > b.
        while (high > low && arr[i] + arr[high] > b) --high;
        // All pairs arr[i]+arr[low+1], arr[i]+arr[low+2], ..., arr[i]+arr[high]
        // are in the range [a, b], so count them.
        cnt += (high - low);
        ++i;
    }
    return cnt;
}
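For readers following along in Python, the same two-index count can be sketched as below. It is a direct port of the C++ routine above, with the sorting step folded in. The function name count_pairs_sorted is hypothetical.

```python
def count_pairs_sorted(values, a, b):
    """Count pairs i < j (in sorted order) with a <= arr[i] + arr[j] <= b."""
    arr = sorted(values)          # O(N log N); the scan below is O(N)
    n = len(arr)
    cnt = 0
    i, low, high = 0, n - 1, n - 1
    while i < high:
        # low ends as the largest index >= i with arr[i] + arr[low] < a
        # (or i itself when every later element is large enough).
        low = max(i, low)
        while low > i and arr[i] + arr[low] >= a:
            low -= 1
        # high ends as the largest index > low with arr[i] + arr[high] <= b
        # (or low itself when there is no such index).
        while high > low and arr[i] + arr[high] > b:
            high -= 1
        # Indices low+1 .. high all pair with i to give a sum in [a, b].
        cnt += high - low
        i += 1
    return cnt
```

On the example array from the numpy answer, count_pairs_sorted([1, 1, 2, 2, 3, -1], 0, 2) counts the six pairs whose sum is 0, 1 or 2.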