Efficient way to count unique elements in array in numpy/scipy in Python

傲寒 2021-02-02 15:16

I have a scipy array, e.g.

a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

I want to count the number of occurrences of each unique ele

4 answers
  •  失恋的感觉
    2021-02-02 15:46

    You can sort the array lexicographically by rows and then look for points where the rows change:

    In [1]: a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])
    
    In [2]: b = a[lexsort(a.T)]
    
    In [3]: b
    Out[3]: 
    array([[0, 0, 1],
           [1, 0, 1],
           [1, 1, 1],
           [1, 1, 1]])
    
    ...
    
    
    In [5]: (b[1:] - b[:-1]).any(-1)
    Out[5]: array([ True,  True, False], dtype=bool)
    

    The last array compares each row of b with the one after it: the first three rows are all different, and the third row occurs twice.
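
    The sort-and-compare idea can be turned into actual counts. The following is only a sketch: the helper name count_unique_rows_sorted is made up for illustration, the np. prefix replaces the bare names used in the session above, and a non-empty 2-D array is assumed.

```python
import numpy as np

def count_unique_rows_sorted(a):
    """Count each distinct row by sorting rows and comparing neighbours.

    Hypothetical helper; assumes `a` is a non-empty 2-D array.
    """
    a = np.asarray(a)
    # Same sort as in the session above: np.lexsort treats the last key
    # (here the last column of `a`) as the primary sort key.
    b = a[np.lexsort(a.T)]
    # True wherever a row differs from the row before it.
    changed = (b[1:] != b[:-1]).any(axis=-1)
    # Index in `b` where each run of identical rows starts.
    starts = np.concatenate(([0], np.flatnonzero(changed) + 1))
    # Length of each run = number of occurrences of that row.
    counts = np.diff(np.concatenate((starts, [len(b)])))
    return b[starts], counts

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])
rows, counts = count_unique_rows_sorted(a)
```

    Here rows holds one copy of each distinct row in sorted order and counts holds the matching number of occurrences. NumPy 1.13 and later can also do this directly with np.unique(a, axis=0, return_counts=True).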

    For arrays of ones and zeros you can encode the values:

    In [6]: bincount(dot(a, array([4,2,1])))
    Out[6]: array([0, 1, 0, 0, 0, 1, 0, 2])
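
    The encoding above hard-codes the weights [4, 2, 1] for three columns. A possible generalisation is sketched below; count_binary_rows is a hypothetical name, and the sketch assumes every entry is 0 or 1 and that there are few enough columns for 2**n_cols bins to be practical.

```python
import numpy as np

def count_binary_rows(a):
    """Count rows of a 0/1 array by reading each row as a binary number.

    Hypothetical helper; assumes all entries are 0 or 1.
    """
    a = np.asarray(a)
    n_cols = a.shape[1]
    # Bit positions, most significant first: [2, 1, 0] for three columns.
    shifts = np.arange(n_cols - 1, -1, -1)
    # Weights 2**shifts, i.e. [4, 2, 1] for three columns.
    weights = 2 ** shifts
    # One integer code per row.
    codes = a.dot(weights)
    # counts[k] = number of rows whose code is k.
    counts = np.bincount(codes, minlength=2 ** n_cols)
    # Codes that actually occur, decoded back into 0/1 rows.
    present = np.flatnonzero(counts)
    decoded = (present[:, None] >> shifts) & 1
    return decoded, counts[present], counts

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])
decoded, row_counts, counts = count_binary_rows(a)
```

    For the example array, counts reproduces Out[6], while decoded and row_counts pair each row that occurs with its count.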
    

    Dictionaries can also be used. Which method is fastest depends on the kind of arrays you are actually working with.
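
    The answer does not spell out the dictionary approach. One way it might look is sketched here with collections.Counter (a dict subclass), converting each row to a tuple so that it is hashable. This is an assumption about what was meant, not the answerer's own code.

```python
from collections import Counter

import numpy as np

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

# Rows of a NumPy array are not hashable, so convert each row to a tuple
# of plain Python ints before using it as a dictionary key.
row_counts = Counter(map(tuple, a.tolist()))

for row, n in row_counts.items():
    print(row, n)
```

    This loops over the rows in Python, so it is likely to be slower than the vectorised approaches on large arrays, but it works for any values and any number of columns.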
