Question
How can I return the rows of a NumPy matrix that match a given condition?
This is a NumPy matrix object:
>>> X
matrix([['sunny', 'hot', 'high', 'FALSE'],
        ['sunny', 'hot', 'high', 'TRUE'],
        ['overcast', 'hot', 'high', 'FALSE'],
        ['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['overcast', 'cool', 'normal', 'TRUE'],
        ['sunny', 'mild', 'high', 'FALSE'],
        ['sunny', 'cool', 'normal', 'FALSE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['sunny', 'mild', 'normal', 'TRUE'],
        ['overcast', 'mild', 'high', 'TRUE'],
        ['overcast', 'hot', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']],
       dtype='|S8')
I would like to get all rows whose first column value is 'rainy',
so I tried this:
>>> X[X[:,0]=='rainy']
matrix([['rainy', 'rainy', 'rainy', 'rainy', 'rainy']],
       dtype='|S8')
But I wanted output like this:
matrix([['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']],
       dtype='|S8')
How should this be done?
Answer 1:
>>> X[(X[:, 0] == 'rainy').ravel(), :]
matrix([['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']],
       dtype='|S8')
If you look at the result of your comparison:
>>> X[:, 0] == 'rainy'
array([[False],
       [False],
       [False],
       [ True],
       [ True],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True]], dtype=bool)
This needs to be flattened into a vector using ravel:
>>> (X[:, 0] == 'rainy').ravel()
array([False, False, False,  True,  True,  True, False, False, False,
        True, False, False, False,  True], dtype=bool)
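To make the shape issue concrete, here is a minimal sketch on a small made-up matrix (the four rows below are illustrative, not the data from the question). The `np.asarray` call is an added precaution, on the assumption that the type returned by the comparison may differ between NumPy versions; it is not part of the original answer.

```python
import numpy as np

# Hypothetical, shortened data set used only for illustration.
X = np.matrix([
    ['sunny', 'hot'],
    ['rainy', 'mild'],
    ['rainy', 'cool'],
    ['overcast', 'hot'],
])

# Comparing a matrix column keeps the column shape: one row per entry.
column_mask = np.asarray(X[:, 0] == 'rainy')
print(column_mask.shape)  # (4, 1)

# Flattening yields one boolean per row, which is what row selection needs.
row_mask = column_mask.ravel()
print(row_mask.shape)  # (4,)

# Select whole rows with the 1-D mask.
rainy_rows = X[row_mask, :]
print(rainy_rows)
```

With the 1-D mask, each `True` picks an entire row, whereas the column-shaped mask is matched against individual elements.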
For additional constraints, this works:
>>> X[(X[:, 0] == 'rainy').ravel() & (X[:, 1] == 'cool').ravel(), :]
matrix([['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE']],
       dtype='|S8')
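When several conditions are combined, repeating `.ravel()` gets noisy. One possible way to tidy it up is a small helper; `column_equals` is a hypothetical name introduced here, and the data is a made-up subset rather than the matrix from the question.

```python
import numpy as np

def column_equals(X, col, value):
    """Return a 1-D boolean mask that is True where column `col` equals `value`."""
    # np.asarray is a precaution so the mask is a plain ndarray before flattening.
    return np.asarray(X[:, col] == value).ravel()

# Hypothetical, shortened data set used only for illustration.
X = np.matrix([
    ['sunny', 'hot', 'high'],
    ['rainy', 'mild', 'high'],
    ['rainy', 'cool', 'normal'],
    ['overcast', 'cool', 'normal'],
])

# & means "both conditions hold", | means "at least one holds".
rainy_and_cool = X[column_equals(X, 0, 'rainy') & column_equals(X, 1, 'cool'), :]
rainy_or_cool = X[column_equals(X, 0, 'rainy') | column_equals(X, 1, 'cool'), :]

print(rainy_and_cool)
print(rainy_or_cool)
```

Note that `&` and `|` are needed here; Python's `and` and `or` do not work element-wise on arrays.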
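Answer 2:
There is more than one way to do it. np.where returns one index array per dimension, so take the first one to get the row indices:
import numpy as np
rows = np.where(np.asarray(X[:, 0] == 'rainy'))[0]  # row indices of the matches
X[rows, :]  # the result you want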
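As a rough sketch of the index-based approach: on a 2-D comparison result, `np.where` gives a tuple with one index array per dimension, so only the row part is useful for selecting rows. `np.flatnonzero` is an alternative (not mentioned in the answer) that returns the positions of the `True` entries directly. The data below is made up for illustration.

```python
import numpy as np

# Hypothetical, shortened data set used only for illustration.
X = np.matrix([
    ['sunny', 'hot'],
    ['rainy', 'mild'],
    ['rainy', 'cool'],
    ['overcast', 'hot'],
])

# Positions of the rows whose first column is 'rainy'.
# np.asarray is a precaution so the comparison result is a plain ndarray.
rows = np.flatnonzero(np.asarray(X[:, 0] == 'rainy'))
print(rows)  # [1 2]

# Integer row indices select the same rows a boolean mask would.
rainy_rows = X[rows, :]
print(rainy_rows)
```

Integer indices are handy when the same row positions are needed again, for example to pick the matching entries from a separate array of labels.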
Source: https://stackoverflow.com/questions/36206287/how-to-get-a-subset-of-rows-from-a-numpy-matrix-based-on-a-condition