Python numpy filter two-dimensional array by condition

半阙折子戏 2020-12-19 03:26

Python newbie here. I have read "Filter rows of a numpy array?" and the docs, but still can't figure out how to code it the Pythonic way.

Example array I have: (the real

4 Answers
  • 2020-12-19 04:04

    In this case, where len(filter) is sufficiently smaller than len(a[:,1]), np.in1d does an iterative version of

    mask = (a[:,1,None] == filter[None,:]).any(axis=1)
    a[mask,:]
    

    It does (adapting the in1d code):

    In [1301]: arr1=a[:,1];arr2=np.array(filter)
    In [1302]: mask=np.zeros(len(arr1),dtype=bool)
    In [1303]: for i in arr2:
          ...:     mask |= (arr1==i)
    In [1304]: mask
    Out[1304]: array([ True, False,  True, False], dtype=bool)
    

    With more items in filter, it would instead build its search around unique, concatenate and argsort, looking for duplicates.

    So its convenience hides a fair amount of complexity.
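The sort-based path mentioned in this answer can be sketched roughly as follows. This is a simplified illustration of the idea (unique, concatenate, argsort, look for adjacent duplicates), not NumPy's actual source; the function name `membership_by_sorting` and its parameter names are made up for this example.

```python
import numpy as np

def membership_by_sorting(values, candidates):
    """Hypothetical helper: for each entry of `values`, is it in `candidates`?"""
    # Deduplicate both inputs; `inverse` maps the unique values back to `values`.
    uniq_values, inverse = np.unique(values, return_inverse=True)
    uniq_candidates = np.unique(candidates)

    # Put everything in one array and sort it with a stable sort, so that an
    # entry from `uniq_values` always lands just before its equal candidate.
    combined = np.concatenate((uniq_values, uniq_candidates))
    order = combined.argsort(kind="mergesort")
    sorted_combined = combined[order]

    # An element followed by an equal element must appear in both inputs.
    dup_with_next = sorted_combined[1:] == sorted_combined[:-1]
    flags = np.concatenate((dup_with_next, [False]))

    # Undo the sort, keep only the part that came from `uniq_values`,
    # then expand back to the original (possibly repeated) values.
    found = np.empty(combined.shape, dtype=bool)
    found[order] = flags
    return found[:len(uniq_values)][inverse]

values = np.array(["a", "b", "c", "d"])
mask = membership_by_sorting(values, np.array(["a", "c"]))
```

The resulting `mask` can be used exactly like the loop-built mask above, for example `a[mask]`.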

  • 2020-12-19 04:15

    Try this:

    >>> a[numpy.in1d(a[:,1], filter)]
    array([['2', 'a'],
           ['4', 'c']], 
          dtype='|S21')
    

    Also see the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.in1d.html
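On recent NumPy versions the same filter can be written with `np.isin`, which NumPy recommends over `np.in1d` (the latter is deprecated as of NumPy 2.0). A minimal sketch, assuming the same example data as the answers here; the names `wanted`, `selected` and `rejected` are chosen for this illustration only.

```python
import numpy as np

# Same rows as in the answers here, written as strings because NumPy
# stores the mixed number/letter rows as a string array anyway.
a = np.array([["2", "a"], ["3", "b"], ["4", "c"], ["5", "d"]])
wanted = np.array(["a", "c"])  # named `wanted` to avoid shadowing the built-in filter()

# np.isin returns a boolean array with the same shape as its first argument.
mask = np.isin(a[:, 1], wanted)
selected = a[mask]

# invert=True keeps the rows whose second column is NOT in `wanted`.
rejected = a[np.isin(a[:, 1], wanted, invert=True)]
```

Here `selected` holds the rows ending in 'a' and 'c', and `rejected` holds the remaining rows.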

  • 2020-12-19 04:20

    A somewhat elaborate pure numpy vectorized solution:

    >>> import numpy
    >>> a = numpy.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
    >>> filter = numpy.array(['a','c'])
    >>> a[(a[:,1,None] == filter[None,:]).any(axis=1)]
    array([['2', 'a'],
           ['4', 'c']], 
          dtype='|S21')
    

    None in the index creates a singleton dimension, so broadcasting lets us compare the column of a against the row of filter, and then reduce the resulting boolean array

    >>> a[:,1,None] == filter[None,:]
    array([[ True, False],
           [False, False],
           [False,  True],
           [False, False]], dtype=bool)
    

    over the second dimension with any.
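To make the shapes in this answer explicit, here is a small sketch that names each intermediate step. The variable names (`wanted`, `column`, `row`, `table`) are chosen for this illustration and do not appear in the original answer.

```python
import numpy as np

a = np.array([["2", "a"], ["3", "b"], ["4", "c"], ["5", "d"]])
wanted = np.array(["a", "c"])

column = a[:, 1, None]      # shape (4, 1): second column of a, one entry per row
row = wanted[None, :]       # shape (1, 2): one entry per wanted value
table = column == row       # broadcast comparison, shape (4, 2)
mask = table.any(axis=1)    # shape (4,): True where any wanted value matched
result = a[mask]
```

The intermediate `table` has `len(a) * len(wanted)` entries, so its memory use grows with both inputs.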

  • 2020-12-19 04:28

    You can use a bool index array that you can produce using np.in1d.

    You can index a np.ndarray along any axis you want using, for example, an array of bools indicating whether each element should be included. Since you want to index along axis=0, meaning you want to choose from the outermost index, you need a 1D np.array whose length is the number of rows. Each of its elements indicates whether the corresponding row should be included.

    A fast way to get this is to use np.in1d on the second column of a. You get all elements of that column with a[:, 1]. Now you have a 1D np.array whose elements should be checked against your filter. That's what np.in1d is for.

    So the complete code would look like:

    import numpy as np
    
    a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
    filter = np.asarray(['a','c'])
    a[np.in1d(a[:, 1], filter)]
    

    or in a longer form:

    import numpy as np
    
    a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
    filter = np.asarray(['a','c'])
    mask = np.in1d(a[:, 1], filter)
    a[mask]
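As a small illustration of boolean indexing along different axes, the sketch below uses hand-written masks instead of ones computed by np.in1d. The names `row_mask`, `col_mask`, `rows`, `cols` and `others` are made up for this example.

```python
import numpy as np

a = np.array([["2", "a"], ["3", "b"], ["4", "c"], ["5", "d"]])

row_mask = np.array([True, False, True, False])   # one flag per row (axis 0)
col_mask = np.array([False, True])                # one flag per column (axis 1)

rows = a[row_mask]        # keeps rows 0 and 2
cols = a[:, col_mask]     # keeps column 1 only; result is still 2-D, shape (4, 1)
others = a[~row_mask]     # ~ flips the mask, keeping rows 1 and 3
```

In each case the mask length has to match the size of the axis it indexes, otherwise NumPy raises an IndexError.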
    