How to get a subset of rows from a NumPy Matrix based on a condition?

夙愿已清 submitted on 2021-02-11 06:35:33

Question


How to return a set of rows of a NumPy Matrix that would match a given condition?

This is a NumPy matrix object:

>>> X

matrix([['sunny', 'hot', 'high', 'FALSE'],
        ['sunny', 'hot', 'high', 'TRUE'],
        ['overcast', 'hot', 'high', 'FALSE'],
        ['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['overcast', 'cool', 'normal', 'TRUE'],
        ['sunny', 'mild', 'high', 'FALSE'],
        ['sunny', 'cool', 'normal', 'FALSE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['sunny', 'mild', 'normal', 'TRUE'],
        ['overcast', 'mild', 'high', 'TRUE'],
        ['overcast', 'hot', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']], 
       dtype='|S8')

I would like to get all rows whose first column value is 'rainy', so I tried this:

>>> X[X[:,0]=='rainy']

matrix([['rainy', 'rainy', 'rainy', 'rainy', 'rainy']], 
       dtype='|S8')

But I wanted output like this:

matrix([['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']], 
       dtype='|S8')

How should this be done?


Answer 1:


>>> X[(X[:, 0] == 'rainy').ravel(), :]
matrix([['rainy', 'mild', 'high', 'FALSE'],
        ['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE'],
        ['rainy', 'mild', 'normal', 'FALSE'],
        ['rainy', 'mild', 'high', 'TRUE']], 
       dtype='|S8')

If you look at the result of your comparison:

>>> X[:, 0] == 'rainy'
array([[False],
       [False],
       [False],
       [ True],
       [ True],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True]], dtype=bool)

The mask is a 14x1 column rather than a 1-D vector, so it cannot act as a row selector. It needs to be flattened into a vector using ravel:

>>> (X[:, 0] == 'rainy').ravel()
array([False, False, False,  True,  True,  True, False, False, False,
        True, False, False, False,  True], dtype=bool)
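The shape difference between the two masks can be checked directly. The following is a minimal sketch, assuming Python 3 and a small plain ndarray of str as stand-in data (not the '|S8' matrix above); the names col_mask, row_mask and rainy_rows are made up for illustration. np.asarray is applied before ravel so the mask comes out 1-D even when the comparison result is a matrix.

```python
import numpy as np

# Hypothetical sample data: a plain ndarray of str, not the '|S8' matrix above.
X = np.array([
    ['sunny', 'hot', 'high', 'FALSE'],
    ['rainy', 'mild', 'high', 'FALSE'],
    ['rainy', 'cool', 'normal', 'TRUE'],
    ['overcast', 'hot', 'normal', 'FALSE'],
])

# Slicing with 0:1 keeps the column 2-D, mimicking the column a matrix returns.
col_mask = X[:, 0:1] == 'rainy'

# np.asarray(...).ravel() yields a 1-D mask with one entry per row.
row_mask = np.asarray(col_mask).ravel()

print(col_mask.shape)  # (4, 1)
print(row_mask.shape)  # (4,)

# A 1-D boolean mask on the first axis selects whole rows.
rainy_rows = X[row_mask, :]
print(rainy_rows)
```

With the 1-D mask, the selection keeps all four columns of each matching row.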

For additional constraints, this works:

>>> X[(X[:, 0] == 'rainy').ravel() & (X[:, 1] == 'cool').ravel(), :]
matrix([['rainy', 'cool', 'normal', 'FALSE'],
        ['rainy', 'cool', 'normal', 'TRUE']], 
       dtype='|S8')
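When several constraints are involved, the repeated mask building can be wrapped in a small function. This is only a sketch under stated assumptions: select_rows is a hypothetical helper (not a NumPy function), the data is a plain ndarray of str under Python 3, and every condition is an equality test on one column.

```python
import numpy as np

def select_rows(X, conditions):
    """Return the rows of X where every (column, value) pair matches.

    Hypothetical helper for illustration; not part of NumPy.
    """
    data = np.asarray(X)                       # plain ndarray view, so columns are 1-D
    mask = np.ones(data.shape[0], dtype=bool)  # start with every row selected
    for col, value in conditions:
        mask &= data[:, col] == value          # keep rows that also satisfy this test
    return X[mask, :]

# Hypothetical sample data, a subset of the rows shown in the question.
X = np.array([
    ['sunny', 'hot', 'high', 'FALSE'],
    ['rainy', 'mild', 'high', 'FALSE'],
    ['rainy', 'cool', 'normal', 'FALSE'],
    ['rainy', 'cool', 'normal', 'TRUE'],
    ['overcast', 'cool', 'normal', 'TRUE'],
])

cool_rainy = select_rows(X, [(0, 'rainy'), (1, 'cool')])
print(cool_rainy)
```

Combining the masks with & mirrors the expression above; the loop just avoids writing out one ravel() call per condition.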



Answer 2:


There is more than one way of doing it.

import numpy as np
rows = np.where(np.asarray(X[:, 0] == 'rainy').ravel())[0]  # 1-D array of matching row indices
X[rows, :]                                                   # the result you want
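The index-based route can be sketched end to end as follows. This assumes Python 3 and a small plain ndarray of str as stand-in data; the names mask, rows and subset are illustrative. np.where called with a single argument returns a tuple with one index array per dimension, so the sketch flattens the mask first and takes element [0] to get row indices only.

```python
import numpy as np

# Hypothetical sample data: a plain ndarray of str, not the '|S8' matrix above.
X = np.array([
    ['sunny', 'hot', 'high', 'FALSE'],
    ['rainy', 'mild', 'high', 'FALSE'],
    ['rainy', 'cool', 'normal', 'TRUE'],
    ['overcast', 'hot', 'normal', 'FALSE'],
])

# Flatten the comparison so np.where sees a 1-D mask.
mask = np.asarray(X[:, 0] == 'rainy').ravel()

# With one argument, np.where returns a tuple of index arrays; [0] is the row axis.
rows = np.where(mask)[0]
print(rows)

# Integer-array indexing on the first axis picks the matching rows.
subset = X[rows, :]
print(subset)
```

np.flatnonzero(mask) gives the same row indices in one call, and indexing with the boolean mask directly skips the index array altogether.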


Source: https://stackoverflow.com/questions/36206287/how-to-get-a-subset-of-rows-from-a-numpy-matrix-based-on-a-condition
