So, let's say I have 100,000 float arrays with 100 elements each. I need the X highest values, but only if they are greater than Y. Any element not matching this
The simplest way would be:
topX = sorted([x for x in array if x > lowValY], reverse=True)[highCountX-1]
print([x if x >= topX else 0 for x in array])
In pieces, this selects all the elements greater than lowValY:
[x for x in array if x > lowValY]
This list contains only the elements greater than the threshold. It is then sorted so the largest values are at the start:
sorted(..., reverse=True)
Then a list index picks out the cutoff value, the smallest of the top highCountX elements:
sorted(...)[highCountX-1]
Finally, the original array is filled out using another list comprehension:
[x if x >= topX else 0 for x in array]
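Putting the pieces together, a minimal runnable sketch might look like the following. The function name top_x_above and the sample values are illustrative assumptions, not something from the question.

```python
def top_x_above(array, highCountX, lowValY):
    """Keep the highCountX largest values above lowValY; zero out the rest."""
    # Step 1: keep only the values above the threshold.
    candidates = [x for x in array if x > lowValY]
    # Step 2: sort descending and take the highCountX-th largest as the cutoff.
    # Raises IndexError if fewer than highCountX values pass the filter.
    topX = sorted(candidates, reverse=True)[highCountX - 1]
    # Step 3: rebuild the original array, zeroing everything below the cutoff.
    return [x if x >= topX else 0 for x in array]


sample = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8]  # made-up data
result = top_x_above(sample, 3, 0.3)
print(result)  # [0, 0.9, 0, 0.7, 0, 0.8]
```

Here four values pass the 0.3 threshold, the third largest of them is 0.7, and everything below 0.7 becomes 0.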
There is a boundary condition when two or more equal elements tie for the last kept position (in your example, the 3rd highest). Because the comparison is x >= topX, every tied element is kept, so the resulting array can contain more than highCountX nonzero values.
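There are other boundary conditions as well, such as when len(array) < highCountX or, more generally, when fewer than highCountX elements exceed lowValY; in that case the list index raises an IndexError. Handling such conditions is left to the implementor.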
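One possible way to handle both boundary conditions is sketched below, swapping the full sort for heapq.nlargest from the standard library. The function name top_x_above_safe is hypothetical, and the policy choices are assumptions rather than requirements from the question: ties are broken so that exactly highCountX elements survive (the later position wins among equal values), and if fewer than highCountX elements exceed lowValY, all of them are kept.

```python
import heapq


def top_x_above_safe(array, highCountX, lowValY):
    """Zero every element except (at most) the highCountX largest above lowValY."""
    if highCountX <= 0:
        return [0] * len(array)
    # Pair each qualifying value with its position so ties can be told apart.
    candidates = [(x, i) for i, x in enumerate(array) if x > lowValY]
    # nlargest returns at most highCountX items, so a short list is not an error.
    keep = set(i for _, i in heapq.nlargest(highCountX, candidates))
    # Rebuild the array, keeping only the selected positions.
    return [x if i in keep else 0 for i, x in enumerate(array)]


print(top_x_above_safe([5, 3, 3, 1], 2, 0))   # [5, 0, 3, 0]
print(top_x_above_safe([0.5, 0.1], 3, 0.2))   # [0.5, 0]
```

Selecting by position rather than by value is what keeps the tie from letting extra elements through.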