I want to remove rows from an ndarray based on another array. For example:
k = [1,3,99]
n = [
[1,'a']
[2,'b']
[3,'c']
[4,'c']
[.....]
[99, '
You can use np.in1d to create a mask of matches between the first column of n and k, and then use the inverted mask to select the non-matching rows of n, like so -
n[~np.in1d(n[:,0].astype(int), k)]
If the first column is already of int dtype, skip the .astype(int) conversion step.
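A minimal, self-contained sketch of the masking approach described above. The array values are illustrative, and the use of np.isin instead of np.in1d is an assumption about your NumPy version: np.isin is available from NumPy 1.13 onward and is the function that newer releases recommend for this kind of membership test.

```python
import numpy as np

# Illustrative data: a 2-column string array and a list of keys to drop.
n = np.array([['1', 'a'], ['2', 'b'], ['3', 'c'],
              ['4', 'c'], ['99', 'a'], ['100', 'e']])
k = [1, 3, 99]

# True where the first column (converted to int) matches any value in k.
matches = np.isin(n[:, 0].astype(int), k)

# Invert the mask to keep only the non-matching rows.
out = n[~matches]
print(out)
```

Because the mask is built from membership alone, this works whether or not the first column is sorted and whether or not every value in k is present.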
Sample run -
In [41]: n
Out[41]:
array([['1', 'a'],
       ['2', 'b'],
       ['3', 'c'],
       ['4', 'c'],
       ['99', 'a'],
       ['100', 'e']],
      dtype='|S21')
In [42]: k
Out[42]: [1, 3, 99]
In [43]: n[~np.in1d(n[:,0].astype(int), k)]
Out[43]:
array([['2', 'b'],
       ['4', 'c'],
       ['100', 'e']],
      dtype='|S21')
For performance, if the first column is sorted (numerically, so convert it to int first if it holds strings) and every value in k occurs in it, we can use np.searchsorted -
mask = np.ones(n.shape[0],dtype=bool)
mask[np.searchsorted(n[:,0], k)] = 0
out = n[mask]
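np.searchsorted returns insertion positions, so a value of k that does not occur in the first column would point at a neighbouring row, or one past the end of the array. The sketch below is one way to guard against that, assuming the first column sorts correctly once converted to int; the names first, idx and found are illustrative and not part of the snippet above.

```python
import numpy as np

# Illustrative data: the first column is sorted when read as integers.
n = np.array([['1', 'a'], ['2', 'b'], ['3', 'c'],
              ['4', 'c'], ['99', 'a'], ['100', 'e']])
k = np.array([1, 3, 50, 99, 200])  # 50 and 200 do not occur in n

first = n[:, 0].astype(int)

# Insertion positions of each value of k in the sorted first column.
idx = np.searchsorted(first, k)

# Clip positions equal to len(first) so indexing stays in bounds,
# then keep only positions whose row really holds the queried value.
idx = np.minimum(idx, len(first) - 1)
found = first[idx] == k

mask = np.ones(n.shape[0], dtype=bool)
mask[idx[found]] = False
out = n[mask]
print(out)
```

The extra comparison costs one pass over k, so the approach keeps the binary-search lookup while no longer removing rows for keys that are absent.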