Efficient way to count unique elements in array in numpy/scipy in Python

I have a scipy array, e.g.

a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

I want to count the number of occurrences of each unique ele

4 Answers

栀梦

2021-02-02 15:24

for python 2.6 <

import itertools

data_array = [[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]]

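dict_ = {}

# groupby only merges adjacent equal items, so sort the rows first
for list_, count in itertools.groupby(sorted(data_array)):
    # lists are not hashable, so use a tuple of the row as the key
    dict_[tuple(list_)] = len(list(count))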


梦谈多话

2021-02-02 15:30
The numpy_indexed package (disclaimer: I am its author) provides a solution similar to the nicely vectorized one posted by chuck, but with tests, a nice interface, and many more related useful functions:
```
import numpy_indexed as npi
npi.count(a)
```
灰色年华

2021-02-02 15:33
If requiring Python 2.7 (or 3.1) is not an issue and either of these versions is available to you, the new collections.Counter might work for you, as long as you stick to hashable elements like tuples:
```
>>> from collections import Counter
>>> c = Counter([(0,0,1), (1,1,1), (1,1,1), (1,0,1)])
>>> c
Counter({(1, 1, 1): 2, (0, 0, 1): 1, (1, 0, 1): 1})
```
I haven't done any performance testing on these two approaches, though.
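As a minimal sketch of how the rows from the question could be fed to Counter: lists are not hashable, so each row is converted to a tuple first. The variable names `rows`, `counts`, and `top` are only illustrative, and the plain nested list stands in for the array (for a NumPy array `a`, `a.tolist()` should give the same nested lists).

```python
from collections import Counter

# Rows from the question, as plain nested lists.
rows = [[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]]

# Lists cannot be dictionary keys, so turn each row into a tuple.
counts = Counter(tuple(row) for row in rows)

# The most frequent row together with its count.
top = counts.most_common(1)
```

Here `counts[(1, 1, 1)]` is 2, and a row that never occurs simply yields 0 instead of raising a KeyError.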
失恋的感觉

2021-02-02 15:46
You can sort the array lexicographically by rows and then look for the points where the rows change:
```
In [1]: a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

In [2]: b = a[lexsort(a.T)]

In [3]: b
Out[3]: 
array([[0, 0, 1],
       [1, 0, 1],
       [1, 1, 1],
       [1, 1, 1]])

...


In [5]: (b[1:] - b[:-1]).any(-1)
Out[5]: array([ True,  True, False], dtype=bool)
```
The last array says that the first three rows of the sorted array all differ from one another and that the third row is repeated (it appears twice).
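One way to turn those change points into actual counts is sketched below. It is not part of the session above: the names `starts`, `unique_rows`, and `counts` are made up for illustration, and the functions are imported through the `np` namespace rather than with a star import.

```python
import numpy as np

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

# Sort the rows; lexsort uses the last key (here the last column) as primary.
b = a[np.lexsort(a.T)]

# True where a row differs from the one before it; the first row always starts a run.
starts = np.concatenate(([True], (b[1:] != b[:-1]).any(axis=1)))

# One representative per run of identical rows.
unique_rows = b[starts]

# Run lengths: gaps between consecutive run starts, with len(b) closing the last run.
counts = np.diff(np.append(np.flatnonzero(starts), len(b)))
```

For the example array this gives the three distinct rows `[0, 0, 1]`, `[1, 0, 1]`, `[1, 1, 1]` with counts 1, 1, and 2. Comparing with `!=` instead of subtracting avoids any wrap-around surprises with unsigned dtypes.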

For arrays of ones and zeros you can encode the values:
```
In [6]: bincount(dot(a, array([4,2,1])))
Out[6]: array([0, 1, 0, 0, 0, 1, 0, 2])
```
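The encoding above can be generalized to any number of 0/1 columns and decoded back into rows. The following is only a sketch under that assumption (every entry is 0 or 1, and the number of columns is small enough that 2**n_bits bins are affordable); `n_bits`, `shifts`, `present`, and `result` are illustrative names.

```python
import numpy as np

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

n_bits = a.shape[1]
# Bit positions, most significant first: [2, 1, 0] for three columns.
shifts = np.arange(n_bits - 1, -1, -1)

# Encode each row as an integer; 2 ** shifts is [4, 2, 1] here.
codes = a.dot(2 ** shifts)
counts = np.bincount(codes, minlength=2 ** n_bits)

# Decode only the codes that actually occur back into 0/1 rows.
present = np.flatnonzero(counts)
rows = (present[:, None] >> shifts) & 1

result = {tuple(r): int(c) for r, c in zip(rows.tolist(), counts[present])}
```

For the example array, `result` maps `(0, 0, 1)` to 1, `(1, 0, 1)` to 1, and `(1, 1, 1)` to 2.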
Dictionaries can also be used. Which of the various methods will be fastest will depend on the sort of arrays you are actually working with.