Question
I am using numpy_indexed to apply a vectorized NumPy bincount, as follows:
import numpy as np
import numpy_indexed as npi
rowidx, colidx = np.indices(index_tri.shape)
(cols, rows), B = npi.count((index_tri.flatten(), rowidx.flatten()))
where index_tri is the following matrix:
index_tri = np.array([[ 0,  0,  0,  7,  1,  3],
                      [ 1,  2,  2,  9,  8,  9],
                      [ 3,  1,  1,  4,  9,  1],
                      [ 5,  6,  6, 10, 10, 10],
                      [ 7,  8,  9,  4,  3,  3],
                      [ 3,  8,  6,  3,  8,  6],
                      [ 4,  3,  3,  7,  8,  9],
                      [10, 10, 10,  5,  6,  6],
                      [ 4,  9,  1,  3,  1,  1],
                      [ 9,  8,  9,  1,  2,  2]])
Then I map the binned counts to the corresponding positions of the following zero-initialized matrix m:
m = np.zeros((10,11))
m
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
m[rows, cols] = B
m
array([[3., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0.],
       [0., 1., 2., 0., 0., 0., 0., 0., 1., 2., 0.],
       [0., 3., 0., 1., 1., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1., 2., 0., 0., 0., 3.],
       [0., 0., 0., 2., 1., 0., 0., 1., 1., 1., 0.],
       [0., 0., 0., 2., 0., 0., 2., 0., 2., 0., 0.],
       [0., 0., 0., 2., 1., 0., 0., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 1., 2., 0., 0., 0., 3.],
       [0., 3., 0., 1., 1., 0., 0., 0., 0., 1., 0.],
       [0., 1., 2., 0., 0., 0., 0., 0., 1., 2., 0.]])
However, this assumes that every value in index_tri has a weight of 1. Now suppose I have a weights array that gives one weight per column of index_tri instead:
weights = np.array([0.7, 0.8, 1.5, 0.6, 0.5, 1.9])
How can I apply a weighted bincount so that my output matrix m becomes the following?
array([[3., 0.5, 0., 1.9, 0., 0., 0., 0.6, 0., 0., 0.],
       [0., 0.7, 2.3, 0., 0., 0., 0., 0., 0.5, 2.5, 0.],
       [0., 4.2, 0., 0.7, 0.6, 0., 0., 0., 0., 0.5, 0.],
       [0., 0., 0., 0., 0., 0.7, 2.3, 0., 0., 0., 3.],
       [0., 0., 0., 2.4, 0.6, 0., 0., 0.7, 0.8, 1.5, 0.],
       [0., 0., 0., 2.3, 0., 0., 2.4, 0., 1.3, 0., 0.],
       [0., 0., 0., 2.3, 0.7, 0., 0., 0.6, 0.5, 1.9, 0.],
       [0., 0., 0., 0., 0., 0.6, 2.4, 0., 0., 0., 3.],
       [0., 3.9, 0., 0.6, 0.7, 0., 0., 0., 0., 0.8, 0.],
       [0., 0.6, 2.4, 0., 0., 0., 0., 0., 0.8, 2.2, 0.]])
Any ideas?
Using a for loop and NumPy's bincount(), I could solve it as follows:
for i in range(m.shape[0]):
    m[i, :] = np.bincount(index_tri[i, :], weights=weights, minlength=m.shape[1])
I am trying to adapt the vectorized solutions provided here and here, but I cannot figure out what the ix2D variable corresponds to in the first link. Could someone elaborate a bit, if possible?
Update (solution):
Based on @Divakar's solution below, here is an updated version that takes an extra parameter, sz, giving the (rows, columns) shape of the output, for the case where the input index matrix does not cover the full column range of the initialized output matrix:
def bincount2D(id_ar_2D, weights_1D, sz=None):
    # Inputs: 2D id array, 1D weights array (one weight per column),
    # and optionally sz = (rows, columns) of the output matrix
    if sz is None:
        # Number of bins per row, inferred from the data
        n = id_ar_2D.max() + 1
        N = len(id_ar_2D)
    else:
        # sz[0] must equal the number of rows of id_ar_2D
        n = sz[1]
        N = sz[0]
    # Add a per-row offset to the ids so that, after raveling,
    # each row falls into its own block of n bins
    id_ar_2D_offsetted = id_ar_2D + n * np.arange(N)[:, None]
    # Finally, use bincount on the flattened offset ids, with the weights
    # tiled once per row, and reshape the result back to (N, n)
    ids = id_ar_2D_offsetted.ravel()
    W = np.tile(weights_1D, N)
    return np.bincount(ids, W, minlength=n * N).reshape(-1, n)
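To show what the sz parameter changes, here is a minimal sketch on a small made-up input. The arrays ids_small and w_small are hypothetical and not part of the question's data, and the function is repeated only so the snippet runs on its own. Without sz the output has as many columns as the largest id plus one; with sz it is padded with zero columns up to the requested width.

```python
import numpy as np

def bincount2D(id_ar_2D, weights_1D, sz=None):
    # Same logic as the function above, repeated so this snippet is self-contained
    if sz is None:
        n = id_ar_2D.max() + 1
        N = len(id_ar_2D)
    else:
        N, n = sz
    ids = (id_ar_2D + n * np.arange(N)[:, None]).ravel()
    W = np.tile(weights_1D, N)
    return np.bincount(ids, W, minlength=n * N).reshape(-1, n)

# Hypothetical input: the ids only reach 2, but the target matrix has 5 columns
ids_small = np.array([[0, 2, 2],
                      [1, 1, 0]])
w_small = np.array([0.5, 1.0, 2.0])

out_default = bincount2D(ids_small, w_small)            # width inferred from the data
out_sized = bincount2D(ids_small, w_small, sz=(2, 5))   # width forced to 5

print(out_default.shape)  # (2, 3)
print(out_sized.shape)    # (2, 5)
```

The first three columns of out_sized match out_default, and the remaining columns are all zeros.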
Answer 1:
Inspired by this post -
def bincount2D(id_ar_2D, weights_1D):
    # Inputs: 2D id array, 1D weights array (one weight per column)
    # Number of bins per row
    n = id_ar_2D.max() + 1
    N = len(id_ar_2D)
    # Offset the ids of row i by i*n so that each row gets its own block of n bins
    id_ar_2D_offsetted = id_ar_2D + n * np.arange(N)[:, None]
    # Finally, use bincount on the flattened offset ids, with the weights
    # tiled once per row, and reshape the result back to (N, n)
    ids = id_ar_2D_offsetted.ravel()
    W = np.tile(weights_1D, N)
    return np.bincount(ids, W, minlength=n * N).reshape(-1, n)
out = bincount2D(index_tri, weights)
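To make the offset trick more concrete, here is a step-by-step sketch on a tiny made-up example. The arrays ids and w are hypothetical, chosen only to keep the intermediate values readable. Each row's ids are shifted by row_index * n so that a single 1D bincount can count all rows at once, and the result is compared against a plain per-row loop.

```python
import numpy as np

# Hypothetical small input: 2 rows, 3 columns, one weight per column
ids = np.array([[0, 0, 2],
                [1, 2, 2]])
w = np.array([0.7, 0.8, 1.5])

n = ids.max() + 1                      # bins per row: 3
N = len(ids)                           # number of rows: 2

offsets = n * np.arange(N)[:, None]    # [[0], [3]]: row i owns bins i*n .. i*n + n - 1
flat_ids = (ids + offsets).ravel()     # [0, 0, 2, 4, 5, 5]
flat_w = np.tile(w, N)                 # weights repeated once per row

vectorized = np.bincount(flat_ids, flat_w, minlength=n * N).reshape(N, n)

# Reference: one bincount per row, as in the question's for loop
looped = np.stack([np.bincount(row, weights=w, minlength=n) for row in ids])

print(np.allclose(vectorized, looped))  # True
```

Because the shifted ids of different rows never collide, the flat bincount result is just the per-row results laid end to end, which is why the final reshape recovers the 2D matrix.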
Source: https://stackoverflow.com/questions/62719951/weighted-numpy-bincount-for-2d-ids-array-and-1d-weights