问题
I have two sets of coordinates and want to find out which coordinates of the coo
set are identical to any coordinate in the targets
set. I want to know the indices in the coo
set which means I'd like to get a list of indices or of bools.
import numpy as np
coo = np.array([[1,2],[1,6],[5,3],[3,6]]) # coordinates
targets = np.array([[5,3],[1,6]]) # coordinates of targets
print(np.isin(coo,targets))
[[ True False]
[ True True]
[ True True]
[ True True]]
The desired result would be one of the following two:
[False True True False] # bool list
[1,2] # list of concerning indices
My problem is, that ...
np.isin
has noaxis
-attribute so that I could useaxis=1
.- even applying logical and to each row of the output would return
True
for the last element, which is wrong.
I am aware of loops and conditions but I am sure Python is equipped with ways for a more elegant solution.
回答1:
This solution will scale worse for large arrays, for such cases the other proposed answers will perform better.
Here's one way taking advantage of broadcasting:
(coo[:,None] == targets).all(2).any(1)
# array([False, True, True, False])
Details
Check for every row in coo
whether or not it matches another in target
by direct comparisson having added a first axis to coo
so it becomes broadcastable against targets
:
(coo[:,None] == targets)
array([[[False, False],
[ True, False]],
[[False, False],
[ True, True]],
[[ True, True],
[False, False]],
[[False, False],
[False, True]]])
Then check which ndarrays
along the second axis have all values to True
:
(coo[:,None] == targets).all(2)
array([[False, False],
[False, True],
[ True, False],
[False, False]])
And finally use any to check which rows have at least one True
.
回答2:
The numpy_indexed package implements functionality of this type in a vectorized manner (disclaimer: I am its author). Sadly numpy lacks a lot of this functionality out of the box; I started numpy_indexed with the intention of having it merged into numpy, but there are some backwards compatibility concerns, and big packages like that tend to move slowly. So that hasnt happened in the last 3 years; but the python packaging ecosystem works so well nowadays that just adding one more package to your environment is just as simple, really.
import numpy_indexed as npi
bools = npi.in_(targets, coo)
This will have a time-complexity similar to that of the solution posted by @fountainhead (logarithmic rather than linear, as per the currently accepted answer), but also the npi library will give you the safety of automated tests, and a lot of other convenient options, should you decide to approach the problem from a slightly different angle.
回答3:
Here is a simple and intuitive solution that actually uses numpy.isin()
, to match tuples, rather than match individual numbers:
# View as a 1d array of tuples
coo_view = coo.view(dtype='i,i').reshape((-1,))
targets_view = targets.view(dtype='i,i').reshape((-1,))
result = np.isin(coo_view, targets_view)
print (result)
print(result.nonzero()[0])
Output:
[False True True False]
[1 2]
Notes:
- The creation of these views does not involve any copying of data.
- The
dtype='i,i'
specifies that we want each element of the view to be a tuple of two integers
来源:https://stackoverflow.com/questions/54828039/how-to-match-pairs-of-values-contained-in-two-numpy-arrays