I want to remove some entries from a numpy array that is about a million entries long.
This code would do it but take a long time:
a = np.array([1,45
I'm assuming you mean removing the entries where a < -100 or a > 100. The most concise way is to use logical indexing.
a = a[(a >= -100) & (a <= 100)]
This does not exactly "delete" the entries. It makes a copy of the array minus the unwanted values and assigns it to the variable that previously referred to the old array. Once that happens, the old array has no remaining references (assuming nothing else points to it) and is garbage collected, meaning its memory is freed.
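To make the copy behaviour concrete, here is a minimal sketch using the sample array from this thread. The extra name `original` is only there for illustration, so the old array stays alive long enough to compare against.

```python
import numpy as np

a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233])
original = a  # hypothetical second reference, kept only for the comparison

# Boolean indexing builds a new array holding just the kept values.
a = a[(a >= -100) & (a <= 100)]

print(a.tolist())                      # [1, 45, 23, 23, -34]
print(np.shares_memory(a, original))   # False: the result is a copy, not a view
print(original.size, a.size)           # 9 5
```

If `original` were not kept around, the old nine-element buffer would be released as soon as `a` was rebound.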
It's worth noting that this method does not use constant memory: because we make a copy of the array, it uses extra memory linear in the size of the array. This could be bad if your array is so huge that it approaches the memory limit of your machine. Actually removing each element "in place", i.e. using constant extra memory, would be a very different operation, as elements in the array would need to be shifted around and the block of memory resized. I'm not sure you can do this with a numpy array. However, one thing you can do to avoid copying the data is to use a numpy masked array:
import numpy.ma as ma
mx = ma.masked_array(a, mask=(a < -100) | (a > 100))
All operations on the masked array will act as if the elements we "deleted" don't exist. We didn't really delete them, though: they are still there in memory, and the array now carries a record of which elements to skip. That record is a boolean mask of the same length, so some extra memory is used, but the data itself is never copied. Also, if we ever want our deleted values back, we can just remove the mask like so:
mx.mask = ma.nomask
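As a rough sketch of how the masked array behaves in practice, the snippet below uses the same sample data as the rest of this thread. The variable `kept` is introduced here only for illustration.

```python
import numpy as np
import numpy.ma as ma

a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233])
mx = ma.masked_array(a, mask=(a < -100) | (a > 100))

# Reductions skip the masked entries.
print(mx.count())   # 5 unmasked values
print(mx.sum())     # 58, i.e. 1 + 45 + 23 + 23 - 34

# The masked array wraps the original buffer rather than copying it.
print(np.shares_memory(mx.data, a))   # True

# compressed() returns a plain 1-D array of the unmasked values (this one is a copy).
kept = mx.compressed()
print(kept.tolist())   # [1, 45, 23, 23, -34]

# Removing the mask makes every original value visible again.
mx.mask = ma.nomask
print(mx.count())   # 9
```

Note that `compressed()` gives up the memory saving, so it is mainly useful when a later step needs an ordinary array.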
In [140]: a = np.array([1,45,23,23,1234,3432,-1232,-34,233])
In [141]: b=a[(-100<=a)&(a<=100)]
In [142]: b
Out[142]: array([ 1, 45, 23, 23, -34])
You can index with a boolean mask built from the inverted condition.
>>> a = np.array([1,45,23,23,1234,3432,-1232,-34,233])
>>> a[~((a < -100) | (a > 100))]
array([ 1, 45, 23, 23, -34])
>>> a[(a >= -100) & (a <= 100)]
array([ 1, 45, 23, 23, -34])
>>> a[abs(a) <= 100]
array([ 1, 45, 23, 23, -34])
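Since the question is about an array of roughly a million entries, here is a small sketch checking that the three forms above agree at that size. The random test data, the seed, and the value range are assumptions made for illustration, not the asker's actual data.

```python
import numpy as np

rng = np.random.default_rng(0)                  # arbitrary seed, for reproducibility
a = rng.integers(-5000, 5000, size=1_000_000)   # made-up stand-in for the real array

r1 = a[~((a < -100) | (a > 100))]   # inverted "remove" condition
r2 = a[(a >= -100) & (a <= 100)]    # explicit "keep" condition
r3 = a[np.abs(a) <= 100]            # symmetric bound via absolute value

# All three select the same elements in the same order.
print(np.array_equal(r1, r2) and np.array_equal(r2, r3))   # True
```

The `abs` form only applies because the bounds are symmetric around zero. For asymmetric bounds, one of the first two forms is needed.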