Numpy: For every element in one array, find the index in another array


旧巷少年郎

I have two 1D arrays, x & y, one smaller than the other. I'm trying to find the index of every element of y in x.

I've found two naive ways to do this, the fir


8 Answers

暗喜

2020-12-02 18:42

How about this?

It assumes that every element of y is in x (and will return results even for elements that aren't!), but it is much faster than the naive approaches.

```
import numpy as np

# Generate some example data...
x = np.arange(1000)
np.random.shuffle(x)
y = np.arange(100)

# Actually perform the operation...
xsorted = np.argsort(x)                 # indices that sort x
ypos = np.searchsorted(x[xsorted], y)   # positions of y in the sorted x
indices = xsorted[ypos]                 # map back to positions in the original x
```
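As a hedged sketch of the same idea, the steps above can be wrapped in a helper that also checks the assumption. The name `find_indices` is hypothetical (it is not part of NumPy). It passes the argsort result through `searchsorted`'s `sorter` argument instead of building `x[xsorted]`, and raises if any element of y is missing from x.

```python
import numpy as np

def find_indices(x, y):
    """Return idx such that x[idx] == y (hypothetical helper, assumes 1D inputs)."""
    x = np.asarray(x)
    y = np.asarray(y)
    order = np.argsort(x)
    # Positions of y within the sorted view of x.
    pos = np.searchsorted(x, y, sorter=order)
    # Values larger than max(x) land at len(x); clip so indexing stays in bounds.
    pos = np.clip(pos, 0, len(x) - 1)
    idx = order[pos]
    # Verify the assumption that every element of y is present in x.
    if not np.array_equal(x[idx], y):
        raise KeyError("some elements of y are not present in x")
    return idx

x = np.array([30, 10, 50, 20, 40])
y = np.array([20, 30, 50])
print(find_indices(x, y))  # [3 0 2]
```

The extra comparison costs one pass over y, which is small next to the sort.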


小鲜肉

2020-12-02 18:42

Use this line of code:

```
indices = np.where(y[:, None] == x[None, :])[1]
```

天涯浪人

2020-12-02 18:45
I would just do this:
```
indices = np.where(y[:, None] == x[None, :])[1]
```
Unlike your memory-hungry approach, this uses broadcasting to generate the 2D boolean array directly, without first creating 2D arrays for both x and y.
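To make the shapes concrete, here is a small sketch with made-up data. It assumes every element of y occurs exactly once in x: a missing element is silently skipped and a duplicate in x yields extra entries, so the output would no longer line up with y. The comparison still allocates a len(y) by len(x) boolean array.

```python
import numpy as np

x = np.array([30, 10, 50, 20, 40])
y = np.array([20, 30, 50])

# Shape (3, 1) compared against shape (1, 5) broadcasts to a (3, 5) boolean array.
matches = y[:, None] == x[None, :]
print(matches.shape)  # (3, 5)

# np.where returns (row_indices, column_indices); the columns are positions
# in x, listed row by row, i.e. in the order of y.
rows, cols = np.where(matches)
print(cols)  # [3 0 2]

# A value absent from x contributes no match, so the result is shorter than y.
y_missing = np.array([20, 99, 50])
cols_missing = np.where(y_missing[:, None] == x[None, :])[1]
print(cols_missing)  # [3 2]
```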
夕颜

2020-12-02 18:46
The numpy_indexed package (disclaimer: I am its author) contains a function that does exactly this:
```
import numpy_indexed as npi
indices = npi.indices(x, y, missing='mask')
```
It will currently raise a KeyError if not all elements in y are present in x; but perhaps I should add a kwarg so that one can elect to mark such items with a -1 or something.

It should have the same efficiency as the currently accepted answer, since the implementation is along similar lines. numpy_indexed is, however, more flexible: for instance, it also allows searching for the indices of rows of multidimensional arrays.

EDIT: I've changed the handling of missing values; the 'missing' kwarg can now be set to 'raise', 'ignore' or 'mask'. In the latter case you get a masked array of the same length as y, on which you can call .compressed() to get the valid indices. Note that there is also npi.contains(x, y) if this is all you need to know.

挽巷

2020-12-02 18:51

As Joe Kington said, searchsorted() can find elements very quickly. To deal with elements that are not in x, you can compare the search result against the original y and create a masked array:

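```
import numpy as np

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])

index = np.argsort(x)          # indices that sort x
sorted_x = x[index]
sorted_index = np.searchsorted(sorted_x, y)

# mode="clip" keeps out-of-range positions (values larger than max(x)) in bounds
yindex = np.take(index, sorted_index, mode="clip")
mask = x[yindex] != y          # True where the element of y is not in x

result = np.ma.array(yindex, mask=mask)
print(result)
```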

The result is:

```
[-- 3 1 -- -- 6]
```
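If a plain integer array is more convenient than a masked array, one option is to fill the masked slots with a sentinel. The sketch below reuses the same data; -1 is an arbitrary choice, and because -1 is also a valid NumPy index it should be filtered out before indexing into x. Asking `argsort` for a stable sort is an addition here, so that repeated values in x resolve to their first occurrence.

```python
import numpy as np

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])

# A stable sort keeps equal values in their original order, so ties
# (the two 6s here) resolve to the first occurrence in x.
index = np.argsort(x, kind="stable")
sorted_index = np.searchsorted(x[index], y)
yindex = np.take(index, sorted_index, mode="clip")
mask = x[yindex] != y

result = np.ma.array(yindex, mask=mask)
filled = result.filled(-1)    # missing elements become -1
valid = result.compressed()   # only the indices of elements that were found
print(filled)                 # [-1  3  1 -1 -1  6]
print(valid)                  # [3 1 6]
```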


萌比男神i

2020-12-02 18:53
I want to suggest a one-line solution:
```
indices = np.where(np.in1d(x, y))[0]
```
The result is an array of indices into x for those elements of x that are also present in y.

You can also use it without numpy.where if a boolean mask is all you need.
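Note that this returns the positions in the order they occur in x, not in the order of y. A small sketch with made-up data, using `np.isin` (which the NumPy documentation recommends over `np.in1d` for new code):

```python
import numpy as np

x = np.array([30, 10, 50, 20, 40])
y = np.array([20, 30, 50])

mask = np.isin(x, y)          # boolean mask over x
print(mask)                   # [ True False  True  True False]

indices = np.where(mask)[0]
print(indices)                # [0 2 3]
print(x[indices])             # [30 50 20], in x order rather than y order
```

So this answers "which positions of x hold a value from y", which matches the per-element lookup only when y's values appear in x in the same order.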