Question
import numpy as np
a=b=np.arange(9).reshape(3,3)
i=np.arange(3)
mask=a<i[:,None,None]+3
and indexing b with each plane of the mask gives:
b[np.where(mask[0])]
>>>array([0, 1, 2])
b[np.where(mask[1])]
>>>array([0, 1, 2, 3])
b[np.where(mask[2])]
>>>array([0, 1, 2, 3, 4])
Now I want to vectorize this and get all three results at once, so I tried
b[np.where(mask[i])]
and b[np.where(mask[i[:,None,None]])]
Both of them raise IndexError: too many indices for array
Answer 1:
In [165]: a
Out[165]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [166]: mask
Out[166]:
array([[[ True,  True,  True],
        [False, False, False],
        [False, False, False]],

       [[ True,  True,  True],
        [ True, False, False],
        [False, False, False]],

       [[ True,  True,  True],
        [ True,  True, False],
        [False, False, False]]], dtype=bool)
So a (and b) is (3,3), while mask is (3,3,3).
A boolean mask applied to an array produces a 1d array (the same holds when it is applied via where):
In [170]: a[mask[1,:,:]]
Out[170]: array([0, 1, 2, 3])
The where on the 2d mask produces a 2 element tuple, which can index the 2d array:
In [173]: np.where(mask[1,:,:])
Out[173]: (array([0, 0, 0, 1], dtype=int32), array([0, 1, 2, 0], dtype=int32))
where on the 3d mask is a 3 element tuple, hence the "too many indices" error:
In [174]: np.where(mask)
Out[174]:
(array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2], dtype=int32),
array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1], dtype=int32),
array([0, 1, 2, 0, 1, 2, 0, 0, 1, 2, 0, 1], dtype=int32))
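As a small check, assuming the a, i and mask defined in the question, the following sketch compares the two cases. np.where returns one index array per mask dimension, and a 2d array accepts at most two of them. The names idx2d, idx3d, picked and raised are only illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
i = np.arange(3)
mask = a < i[:, None, None] + 3   # shape (3, 3, 3)

idx2d = np.where(mask[1])   # tuple of 2 index arrays, matches a.ndim
idx3d = np.where(mask)      # tuple of 3 index arrays, one too many for a

picked = a[idx2d]           # fine: two index arrays for a 2d array

try:
    a[idx3d]                # three index arrays for a 2d array
    raised = False
except IndexError:
    raised = True           # "too many indices for array"
```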
Let's try expanding a to 3d and applying the mask:
In [176]: np.tile(a[None,:],(3,1,1)).shape
Out[176]: (3, 3, 3)
In [177]: np.tile(a[None,:],(3,1,1))[mask]
Out[177]: array([0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4])
The values are there, but they are joined.
We can count the number of True in each plane of mask, and use that to split the masked tile:
In [185]: mask.sum(axis=(1,2))
Out[185]: array([3, 4, 5])
In [186]: cnt=np.cumsum(mask.sum(axis=(1,2)))
In [187]: cnt
Out[187]: array([ 3, 7, 12], dtype=int32)
In [189]: np.split(np.tile(a[None,:],(3,1,1))[mask], cnt[:-1])
Out[189]: [array([0, 1, 2]), array([0, 1, 2, 3]), array([0, 1, 2, 3, 4])]
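The tile-and-split steps can also be wrapped in one function. The sketch below is a variant, not the code from the session above: it swaps np.tile for np.broadcast_to, which returns a read-only view instead of a copy, and split_by_plane is a hypothetical helper name.

```python
import numpy as np

def split_by_plane(a, mask):
    """Hypothetical helper: return a[m] for each 2d plane m of a 3d mask."""
    # View of a repeated along a new leading axis (no data copied).
    expanded = np.broadcast_to(a, mask.shape)
    # Boolean indexing yields all selected values joined in one 1d array.
    flat = expanded[mask]
    # Number of True entries in each plane gives the piece lengths.
    counts = mask.sum(axis=(1, 2))
    # Split at the cumulative counts, dropping the last (it is the total).
    return np.split(flat, np.cumsum(counts)[:-1])

a = np.arange(9).reshape(3, 3)
i = np.arange(3)
mask = a < i[:, None, None] + 3
parts = split_by_plane(a, mask)
```

The result is still a Python list of arrays with different lengths, so this only tidies the steps; it does not remove the final split.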
Internally, np.split uses Python-level iteration, so iterating over the mask planes directly might be just as good (it was 6x faster on this small example):
In [190]: [a[m] for m in mask]
Out[190]: [array([0, 1, 2]), array([0, 1, 2, 3]), array([0, 1, 2, 3, 4])]
That points to a fundamental problem with the desired 'vectorization': the individual arrays have shapes (3,), (4,) and (5,). Arrays of differing sizes are a strong indicator that true 'vectorization' is difficult, if not impossible.
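If a rectangular result is acceptable, one possible workaround is to keep the full (3,3,3) shape and fill the unselected positions with a sentinel. This is a sketch of a different technique, not something from the original session, and the FILL value of -1 is an assumption chosen because it does not occur in a.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
i = np.arange(3)
mask = a < i[:, None, None] + 3

FILL = -1  # assumed sentinel; pick a value that cannot appear in the data

# a broadcasts against the 3d mask, so the result keeps shape (3, 3, 3).
padded = np.where(mask, a, FILL)

# Per-plane reductions stay vectorized if unselected entries are neutral (0 for a sum).
plane_sums = np.where(mask, a, 0).sum(axis=(1, 2))
```

This trades the ragged list for a fixed-shape array, which is only useful when later steps can ignore or neutralize the fill value.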
Answer 2:
When trying to print a vector it can only exist in the x, y and z dimensions. You have 4.
Source: https://stackoverflow.com/questions/46781282/vectorization-too-many-indices-for-array