问题
Prerequisite
This is a question derived from this post. So, some of the introduction of the problem will be similar to that post.
Problem
Let's say result
is a 2D array and values
is a 1D array. values
holds some values associated with each element in result
. The mapping of an element in values
to result
is stored in x_mapping
and y_mapping
. A position in result
can be associated with different values. Now, I have to find the minimum and maximum of the values grouped by associations.
An example for better clarification.
min_result
array:
[[0, 0],
[0, 0],
[0, 0],
[0, 0]]
max_result
array:
[[0, 0],
[0, 0],
[0, 0],
[0, 0]]
values
array:
[ 1., 2., 3., 4., 5., 6., 7., 8.]
Note: Here result
arrays and values
have the same number of elements. But it might not be the case. There is no relation between the sizes at all.
x_mapping
and y_mapping
have mappings from 1D values
to 2D result
(both min and max). The sizes of x_mapping
, y_mapping
and values
will be the same.
x_mapping
- [0, 1, 0, 0, 0, 0, 0, 0]
y_mapping
- [0, 3, 2, 2, 0, 3, 2, 1]
Here, 1st value(values[0]
) and 5th value(values[4]
) have x as 0 and y as 0(x_mapping[0]
and y_mappping[0]
) and hence associated with result[0, 0]
. If we compute the minimum and maximum from this group, we will have 1 and 5 as results respectively. So, min_result[0, 0]
will have 1 and max_result[0, 0]
will have 5.
Note that if there is no association at all then the default value for result
will be zero.
Current working solution
x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 1])
values = np.array([ 1., 2., 3., 4., 5., 6., 7., 8.], dtype=np.float32)
max_result = np.zeros([4, 2], dtype=np.float32)
min_result = np.zeros([4, 2], dtype=np.float32)
min_result[-y_mapping, x_mapping] = values # randomly initialising from values
for i in range(values.size):
x = x_mapping[i]
y = y_mapping[i]
# maximum
if values[i] > max_result[-y, x]:
max_result[-y, x] = values[i]
# minimum
if values[i] < min_result[-y, x]:
min_result[-y, x] = values[i]
min_result
,
[[1., 0.],
[6., 2.],
[3., 0.],
[8., 0.]]
max_result
,
[[5., 0.],
[6., 2.],
[7., 0.],
[8., 0.]]
Failed solutions
#1
min_result = np.zeros([4, 2], dtype=np.float32)
np.minimum.reduceat(values, [-y_mapping, x_mapping], out=min_result)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-126de899a90e> in <module>()
1 min_result = np.zeros([4, 2], dtype=np.float32)
----> 2 np.minimum.reduceat(values, [-y_mapping, x_mapping], out=min_result)
ValueError: object too deep for desired array
#2
min_result = np.zeros([4, 2], dtype=np.float32)
np.minimum.reduceat(values, lidx, out= min_result)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-24-07e8c75ccaa5> in <module>()
1 min_result = np.zeros([4, 2], dtype=np.float32)
----> 2 np.minimum.reduceat(values, lidx, out= min_result)
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (4,2)->(4,) (8,)->() (8,)->(8,)
#3
lidx = ((-y_mapping) % 4) * 2 + x_mapping #from mentioned post
min_result = np.zeros([8], dtype=np.float32)
np.minimum.reduceat(values, lidx, out= min_result).reshape(4,2)
[[1., 4.],
[5., 5.],
[1., 3.],
[5., 7.]]
Question
How to use np.minimum.reduceat
and np.maximum.reduceat
for solving this problem? I'm looking for a solution that is optimised for runtime.
Side note
I'm using Numpy version 1.14.3 with Python 3.5.2
回答1:
Approach #1
Again, the most intuitive ones would be with numpy.ufunc.at. Now, since, these reductions would be performed against the existing values, we need to initialize the output with max values for minimum reductions and min values for maximum ones. Hence, the implementation would be -
min_result[-y_mapping, x_mapping] = values.max()
max_result[-y_mapping, x_mapping] = values.min()
np.minimum.at(min_result, [-y_mapping, x_mapping], values)
np.maximum.at(max_result, [-y_mapping, x_mapping], values)
Approach #2
To leverage np.ufunc.reduceat
, we need to sort data -
m,n = max_result.shape
out_dtype = max_result.dtype
lidx = ((-y_mapping)%m)*n + x_mapping
sidx = lidx.argsort()
idx = lidx[sidx]
val = values[sidx]
m_idx = np.flatnonzero(np.r_[True,idx[:-1] != idx[1:]])
unq_ids = idx[m_idx]
max_result_out.flat[unq_ids] = np.maximum.reduceat(val, m_idx)
min_result_out.flat[unq_ids] = np.minimum.reduceat(val, m_idx)
来源:https://stackoverflow.com/questions/51704517/numpy-finding-minimum-and-maximum-values-from-associations-through-binning