Count occurrences of unique arrays in array

Submitted by 我是研究僧i on 2020-02-25 00:44:05

Question


I have a NumPy array whose rows are one-hot encoded vectors, e.g.:

x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])

I would like to count the occurrences of each unique one-hot vector:

{[1, 0, 0]: 2, [0, 0, 1]: 1}

Answer 1:


Approach #1

This is a good fit for the axis argument of numpy.unique (NumPy v1.13 and newer), which finds the unique slices along a chosen axis of an array -

unq_rows, count = np.unique(x, axis=0, return_counts=True)
out = {tuple(i): j for i, j in zip(unq_rows, count)}

Sample outputs -

In [289]: unq_rows
Out[289]: 
array([[0, 0, 1],
       [1, 0, 0]])

In [290]: count
Out[290]: array([1, 2])

In [291]: {tuple(i):j for i,j in zip(unq_rows,count)}
Out[291]: {(0, 0, 1): 1, (1, 0, 0): 2}
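
As a minimal sketch, Approach #1 can be wrapped in a small helper. The function name count_unique_rows is hypothetical (it is not part of NumPy), and the .tolist() calls are an optional addition that turns NumPy scalars into plain Python ints so the resulting dictionary prints and compares cleanly:

```python
import numpy as np

def count_unique_rows(arr):
    """Hypothetical helper: return {row_as_tuple: count} for a 2-D array."""
    # axis=0 treats each row as one item; requires NumPy >= 1.13
    unq_rows, counts = np.unique(arr, axis=0, return_counts=True)
    # .tolist() converts NumPy integers to plain Python ints
    return {tuple(row): n for row, n in zip(unq_rows.tolist(), counts.tolist())}

x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])
result = count_unique_rows(x)
print(result)
```

Note that np.unique returns the unique rows in sorted order, so the key order need not match the order of first appearance in x.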

Approach #2

For NumPy versions older than v1.13, we can make use of the fact that the input array is one-hot encoded, like so -

_, idx, count = np.unique(x.argmax(1), return_counts=True, return_index=True)
out = {tuple(i): j for i, j in zip(x[idx], count)}  # x[idx] is unq_rows
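
To make the intermediate steps of Approach #2 easier to follow, here is a sketch that spells them out on the question's sample array. It assumes every row is strictly one-hot, so the position of the hot entry identifies the row uniquely; the variable name hot is introduced only for illustration:

```python
import numpy as np

x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])

# Position of the hot entry in each row; this uniquely labels a one-hot row
hot = x.argmax(axis=1)

# idx holds the first row index at which each distinct label appears,
# counts holds how many rows carry that label
_, idx, counts = np.unique(hot, return_index=True, return_counts=True)

# x[idx] recovers one representative row per distinct label
out = {tuple(row): n for row, n in zip(x[idx].tolist(), counts.tolist())}
print(hot, idx, counts)
print(out)
```

If the rows are not strictly one-hot, argmax can map different rows to the same label, and this shortcut would then merge them.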



Answer 2:


You could convert your arrays to tuples and use a Counter:

import numpy as np
from collections import Counter
x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])
Counter([tuple(a) for a in x])
# Counter({(1, 0, 0): 2, (0, 0, 1): 1})



Answer 3:


The fastest way given your data format is:

x.sum(axis=0)

which gives:

array([2, 0, 1])

Here the first entry is the number of rows whose first position is hot, the second entry is the number whose second position is hot, and so on:

[1, 0, 0] [2
[0, 1, 0]  0
[0, 0, 1]  1]

This exploits the fact that exactly one position can be on in each row, so each column sum is the count of the corresponding one-hot vector.
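
Because the column sum is only a valid count when every row really is one-hot, a guard can be worth adding. The following is a sketch under that assumption; the helper name one_hot_counts is hypothetical, and np.isin requires NumPy v1.13 or newer:

```python
import numpy as np

def one_hot_counts(arr):
    """Hypothetical helper: column sums, after checking rows are one-hot."""
    arr = np.asarray(arr)
    # Every entry must be 0 or 1 ...
    is_binary = np.isin(arr, (0, 1)).all()
    # ... and every row must contain exactly one 1
    one_per_row = (arr.sum(axis=1) == 1).all()
    if not (is_binary and one_per_row):
        raise ValueError("rows are not strictly one-hot")
    return arr.sum(axis=0)

x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])
counts = one_hot_counts(x)
print(counts)
```

For input such as [[1, 1, 0]] the helper raises ValueError instead of silently returning misleading counts.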

If you absolutely need it expanded to the same format, it can be converted via:

sums = x.sum(axis=0)
{tuple(int(k == i) for k in range(len(sums))): e for i, e in enumerate(sums)}

or, similarly to tarashypka's answer:

{tuple(row): count for row, count in zip(np.eye(len(sums), dtype=np.int64), sums)}

yields:

{(1, 0, 0): 2, (0, 1, 0): 0, (0, 0, 1): 1}



Answer 4:


Here is another interesting solution with sum:

>>> {tuple(v): n for v, n in zip(np.eye(x.shape[1], dtype=int), np.sum(x, axis=0))
...  if n > 0}
{(0, 0, 1): 1, (1, 0, 0): 2}



Answer 5:


Lists and NumPy arrays are both unhashable, i.e. they can't be keys of a dictionary. So your precise desired output, a dictionary with keys that look like [1, 0, 0], is not possible in Python. To deal with this, you need to map your vectors to tuples.

from collections import Counter
import numpy as np

x = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0]])
counts = Counter(map(tuple, x))

That will get you:

In [12]: counts
Out[12]: Counter({(0, 0, 1): 1, (1, 0, 0): 2})
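
The hashability point can be checked directly. This short sketch tries a list and a NumPy array as dictionary keys, records the resulting errors, and then shows that the tuple form works; the variable names are illustrative only:

```python
import numpy as np

row = np.array([1, 0, 0])
errors = []

# Both a plain list and a NumPy array are rejected as dictionary keys
for key in ([1, 0, 0], row):
    try:
        {key: 2}
    except TypeError as exc:
        errors.append((type(key).__name__, str(exc)))

# A tuple of the same values is hashable and works as a key
counts = {tuple(row.tolist()): 2}
print(errors)
print(counts)
```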


Source: https://stackoverflow.com/questions/45176383/count-occurrences-of-unique-arrays-in-array
