Count all values in a matrix greater than a value

一向 2020-12-08 03:57

I have to count all the values in a matrix (2-d array) that are greater than 200.

The code I wrote down for this is:

za=0   
p31 = numpy.asarray(o31)         


        
5 Answers
  • 2020-12-08 04:09

    To count the number of values larger than x in any NumPy array, you can use:

    n = len(matrix[matrix > x])
    

    Boolean indexing returns a 1-D array containing only the elements for which the condition (matrix > x) holds; len() then counts those elements.
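
As a self-contained illustration of this idiom, here is a minimal sketch. The sample values and the threshold of 200 are made up for the example; any numeric array should behave the same way.

```python
import numpy as np

# Hypothetical sample data; the value 200 itself is NOT counted by "> 200".
matrix = np.array([[150, 250, 300],
                   [ 90, 201, 200]])
x = 200

# matrix > x is a boolean array of the same shape as matrix;
# indexing with it keeps only the elements where it is True.
selected = matrix[matrix > x]
n = len(selected)

print(selected)  # [250 300 201]
print(n)         # 3
```

Note that this builds the intermediate array `selected`, which is handy if the values themselves are needed afterwards.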

  • 2020-12-08 04:13

    There are several ways to do this, such as flattening and filtering or simply looping over the elements, but I think a Boolean mask array is the easiest (and, as far as I recall, considerably faster):

    >>> y = np.array([[123,24123,32432], [234,24,23]])
    >>> y
    array([[  123, 24123, 32432],
           [  234,    24,    23]])
    >>> b = y > 200
    >>> b
    array([[False,  True,  True],
           [ True, False, False]], dtype=bool)
    >>> y[b]
    array([24123, 32432,   234])
    >>> len(y[b])
    3
    >>> y[b].sum()
    56789
    

    Update:

    As nneonneo has answered, if all you want is the number of elements that pass the threshold, you can simply do:

    >>> (y>200).sum()
    3
    

    which is a simpler solution.
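
A closely related option, not mentioned in the answer itself, is `np.count_nonzero`, which counts the True entries of the mask directly and also accepts an `axis` argument. The sketch below reuses the array `y` from above; the names `total`, `per_row` and `per_col` are only illustrative.

```python
import numpy as np

y = np.array([[123, 24123, 32432],
              [234,    24,    23]])

# Count over the whole array.
total = np.count_nonzero(y > 200)

# Count per row (axis=1) and per column (axis=0).
per_row = np.count_nonzero(y > 200, axis=1)
per_col = (y > 200).sum(axis=0)

print(total)    # 3
print(per_row)  # [2 1]
print(per_col)  # [1 1 1]
```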


    Speed comparison with filter:

    ### use boolean/mask array ###
    
    b = y > 200
    
    %timeit y[b]
    100000 loops, best of 3: 3.31 us per loop
    
    %timeit y[y>200]
    100000 loops, best of 3: 7.57 us per loop
    
    ### use filter (timings appear to be from Python 2, where filter returns a list) ###
    
    x = y.ravel()
    %timeit filter(lambda x:x>200, x)
    100000 loops, best of 3: 9.33 us per loop
    
    %timeit np.array(filter(lambda x:x>200, x))
    10000 loops, best of 3: 21.7 us per loop
    
    %timeit filter(lambda x:x>200, y.ravel())
    100000 loops, best of 3: 11.2 us per loop
    
    %timeit np.array(filter(lambda x:x>200, y.ravel()))
    10000 loops, best of 3: 22.9 us per loop
    
    ### use numpy.where ###
    
    nb = np.where(y>200)
    %timeit y[nb]
    100000 loops, best of 3: 2.42 us per loop
    
    %timeit y[np.where(y>200)]
    100000 loops, best of 3: 10.3 us per loop
    
  • 2020-12-08 04:16

    The numpy.where function is your friend. Because it operates on the whole array at once instead of looping in Python, you should notice a speed improvement over the pure Python solution you provide, especially for large images.

    Calling numpy.where with only a condition returns a tuple of index arrays, one per dimension, giving the positions of the elements that match your condition:

    >>> data
    array([[1, 8],
           [3, 4]])
    >>> numpy.where( data > 3 )
    (array([0, 1]), array([1, 1]))
    

    And these index arrays can be used to index the array directly to get the actual values:

    >>> data[ numpy.where( data > 3 ) ]
    array([8, 4])
    

    Exactly where you take it from there will depend on what form you'd like the results in.
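
For the counting problem in the question, one way to take it from there is sketched below, using the same `data` array as in the examples above. The names `rows`, `cols`, `count` and `values` are only illustrative.

```python
import numpy as np

data = np.array([[1, 8],
                 [3, 4]])

# For a 2-D array, np.where(condition) returns (row_indices, col_indices).
rows, cols = np.where(data > 3)

# Every matching element contributes one entry to each index array,
# so the length of either one is the number of matches.
count = len(rows)

# The same index arrays recover the matching values.
values = data[rows, cols]

print(count)   # 2
print(values)  # [8 4]
```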

  • 2020-12-08 04:18

    This is very straightforward with boolean arrays:

    p31 = numpy.asarray(o31)
    za = (p31 > 200).sum()  # p31 > 200 is a boolean array, so sum() counts its True elements
    
  • 2020-12-08 04:31

    Here's a variant that uses fancy indexing and has the actual values as an intermediate:

    p31 = numpy.asarray(o31)
    values = p31[p31 > 200]
    za = len(values)
    