I want to find the unique elements of an array within a given tolerance.
For instance, for an array/list:
[1.1 , 1.3 , 1.9 , 2.0 , 2.5 , 2.9]
In pure Python 2, I wrote the following:
a = [1.1, 1.3, 1.9, 2.0, 2.5, 2.9]
# Per http://fr.mathworks.com/help/matlab/ref/uniquetol.html
tol = max(map(lambda x: abs(x), a)) * 0.3
a.sort()
results = [a.pop(0), ]
for i in a:
    # Skip items within tolerance of the last kept value.
    if abs(results[-1] - i) <= tol:
        continue
    results.append(i)
print a
print results
Which results in
[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 2.0, 2.9]
That seems to agree with the spec, but it isn't consistent with your example.
If I just set the tolerance to 0.3 instead of max(map(lambda x: abs(x), a)) * 0.3, I get:
[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 1.9, 2.5, 2.9]
...which is consistent with your example.
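For reference, here is a minimal sketch of the same greedy pass wrapped in a function that also runs on Python 3 and does not modify its input. The name unique_within_tol is my own (hypothetical), and the "relative" call simply reuses the 0.3 * max(abs(a)) scaling from the snippet above; it is not a full reimplementation of uniquetol.

```python
def unique_within_tol(values, tol):
    """Keep a value only if it is more than tol away from the last kept value."""
    ordered = sorted(values)  # sorted() returns a copy, so the input list is untouched
    if not ordered:
        return []
    results = [ordered[0]]
    for x in ordered[1:]:
        # Compare against the last kept value only, as in the loop above.
        if abs(x - results[-1]) > tol:
            results.append(x)
    return results


a = [1.1, 1.3, 1.9, 2.0, 2.5, 2.9]
absolute = unique_within_tol(a, 0.3)
relative = unique_within_tol(a, 0.3 * max(abs(x) for x in a))
print(absolute)  # [1.1, 1.9, 2.5, 2.9]
print(relative)  # [1.1, 2.0, 2.9]
```

Because each value is compared only with the last kept value, the result depends on the sort order and on where the first group starts.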
With A as the input array and tol as the tolerance value, we could have a vectorized approach with NumPy broadcasting, like so:
A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Sample run -
In [20]: A = np.array([2.1, 1.3 , 1.9 , 1.1 , 2.0 , 2.5 , 2.9])
In [21]: tol = 0.3
In [22]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[22]: array([ 2.1, 1.3, 2.5, 2.9])
Notice that 1.9 is gone because 2.1, which comes earlier in the array, is within the tolerance of 0.3. Likewise, 1.1 is gone because of 1.3, and 2.0 because of 2.1.
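To make the one-liner easier to follow, here it is unpacked into named steps. The intermediate names (close, earlier, dup) are mine, chosen only for illustration; the logic is the same expression as above.

```python
import numpy as np

A = np.array([2.1, 1.3, 1.9, 1.1, 2.0, 2.5, 2.9])
tol = 0.3

# close[i, j] is True when A[i] and A[j] are within tol of each other.
# A[:, None] has shape (n, 1), so the subtraction broadcasts to (n, n).
close = np.abs(A[:, None] - A) <= tol

# Keep only the strict upper triangle, i.e. pairs with i < j,
# so each element is compared only with elements that come before it.
earlier = np.triu(close, 1)

# dup[j] is True when some earlier element is within tol of A[j].
dup = earlier.any(axis=0)

result = A[~dup]
print(result)
```

Note that the intermediate boolean matrix is n x n, so memory use grows quadratically with the length of A.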
Please note that this applies a "chained-closeness" check: an element is dropped if it lies within tol of any earlier element, including one that was itself dropped. As an example:
In [91]: A = np.array([ 1.1, 1.3, 1.5, 2. , 2.1, 2.2, 2.35, 2.5, 2.9])
In [92]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[92]: array([ 1.1, 2. , 2.9])
Thus, 1.3 is gone because of 1.1, and 1.5 is gone because of 1.3, even though 1.5 is more than 0.3 away from 1.1.
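If chaining is not wanted, one possible alternative is to compare each element only against the elements that have already been kept. This is a plain-loop sketch rather than a vectorized solution, and unique_tol_unchained is a hypothetical name of my own.

```python
def unique_tol_unchained(values, tol):
    """Keep a value only if it is more than tol away from every value kept so far."""
    kept = []
    for x in values:
        # Dropped values are never consulted, so closeness cannot chain.
        if all(abs(x - k) > tol for k in kept):
            kept.append(x)
    return kept


A = [1.1, 1.3, 1.5, 2.0, 2.1, 2.2, 2.35, 2.5, 2.9]
unchained = unique_tol_unchained(A, 0.3)
print(unchained)  # [1.1, 1.5, 2.0, 2.35, 2.9]
```

Here 1.5 survives because the only earlier kept value, 1.1, is more than 0.3 away. The cost is still quadratic in the worst case.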