Question
I have a 3D NumPy array and I want to keep only the unique 2D sub-arrays.
Input:
[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]]
Output:
[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
I tried converting the sub-arrays to byte strings (with the tostring() method) and then using np.unique, but once the strings are put into a NumPy array the trailing \x00 bytes are stripped, so I can't convert them back with np.fromstring().
Example:
import numpy as np
a = np.array([[[1,2],[3,4]],[[5,6],[7,8]],[[9,10],[11,12]],[[5,6],[7,8]]])
b = [x.tostring() for x in a]
print(b)
c = np.array(b)
print(c)
print(np.array([np.fromstring(x) for x in c]))
Output:
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00', b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00']
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04'
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08'
b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c'
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08']
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-86-6772b096689f> in <module>()
5 c = np.array(b)
6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))
<ipython-input-86-6772b096689f> in <listcomp>(.0)
5 c = np.array(b)
6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))
ValueError: string size must be a multiple of element size
I also tried view, but I really don't know how to use it. Can you help me, please?
Answer 1:
Building on @Jaime's post, I came up with the following solution for finding unique 2D sub-arrays. It essentially adds a reshape before the view step -
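def unique2D_subarray(a):
    # View each flattened sub-array as one opaque (void) item so that
    # np.unique compares whole sub-arrays instead of single elements.
    dtype1 = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    b = np.ascontiguousarray(a.reshape(a.shape[0], -1)).view(dtype1)
    # return_index yields the position of the first occurrence of each item.
    return a[np.unique(b, return_index=True)[1]]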
Sample run -
In [62]: a
Out[62]:
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]],

       [[ 5,  6],
        [ 7,  8]]])
In [63]: unique2D_subarray(a)
Out[63]:
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])
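To see why the void view avoids the problem from the question, here is a minimal sketch that isolates the view step. The dtype "<i4" is an assumption chosen to match the byte layout shown in the question, and the names flat, as_void, as_bytes and restored are illustrative only. It shows that a bytes array loses trailing NUL bytes when an element is read back, while a void view keeps every byte and can be reversed.

```python
import numpy as np

# Assumed dtype: little-endian 32-bit ints, matching the bytes in the question.
a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]], dtype="<i4")

# One row per sub-array, then one void item per row.
flat = np.ascontiguousarray(a.reshape(a.shape[0], -1))
row_bytes = flat.dtype.itemsize * flat.shape[1]
as_void = flat.view(np.dtype((np.void, row_bytes))).ravel()

# A bytes ('S') array strips trailing NUL bytes when an element is read back,
# which is what broke np.fromstring() in the question.
as_bytes = np.array([sub.tobytes() for sub in a])
print(len(as_bytes[0]), "of", row_bytes, "bytes survive in the bytes array")

# A void item keeps all of its bytes, so the view can be undone.
restored = as_void.view(a.dtype).reshape(a.shape)
print(np.array_equal(restored, a))
```

Because the void items are fixed-size raw memory rather than strings, nothing is trimmed, which is why the function above can index back into a directly.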
Answer 2:
The numpy_indexed package (disclaimer: I am its author) is designed to do operations such as these in an efficient and vectorized manner:
import numpy_indexed as npi
npi.unique(a)
Answer 3:
One solution would be to use a set to keep track of which sub-arrays you have already seen:
seen = set()
new_a = []
for j in a:
    # Tuples are hashable, so a flattened sub-array can be stored in a set.
    f = tuple(j.flatten())
    if f not in seen:
        new_a.append(j)
        seen.add(f)
print(np.array(new_a))
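The same idea can be wrapped in a function. The sketch below is one possible variant, not the answer's own code: it uses the raw bytes of each sub-array as the set key (Python bytes objects keep their trailing NUL bytes) and returns the sub-arrays in order of first appearance. The function name unique_subarrays_ordered is hypothetical. Note that comparing raw bytes is exact for integer data, but for floats it treats 0.0 and -0.0 as different.

```python
import numpy as np

def unique_subarrays_ordered(a):
    """Return the unique sub-arrays along axis 0, in order of first appearance."""
    seen = set()
    keep = []  # indices of the first occurrence of each sub-array
    for i, sub in enumerate(a):
        key = sub.tobytes()  # hashable snapshot of the sub-array's contents
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return a[keep]

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])
result = unique_subarrays_ordered(a)
print(result)
```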
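Or using NumPy only. Note that np.unique(a) returns the sorted unique scalar values, so this only works here because the sub-arrays share no values and are already in ascending order:
u = np.unique(a)
print(u.reshape((len(u) // 4, 2, 2)))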
>>> [[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
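For a NumPy-only approach that compares whole sub-arrays, np.unique accepts an axis argument. The following is a minimal sketch, assuming NumPy 1.13 or newer (where axis was added). The variable names are illustrative. With axis=0 the result is sorted lexicographically; the second part shows one way to keep the original order of first appearance instead.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# Treat each a[i] as one item and drop duplicates (result is sorted).
unique_sorted = np.unique(a, axis=0)
print(unique_sorted)

# Keep first-appearance order: take the first-occurrence indices and sort them.
_, first_idx = np.unique(a, axis=0, return_index=True)
unique_in_order = a[np.sort(first_idx)]
print(unique_in_order)
```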
Source: https://stackoverflow.com/questions/40674696/numpy-unique-2d-sub-array