Binning of data along one axis in numpy

走远了吗. Submitted on 2019-12-20 19:37:19

Question


I have a large two-dimensional array arr that I would like to bin along the second axis using numpy. Because np.histogram flattens its input, I'm currently using a for loop over the rows:

import numpy as np

arr = np.random.randn(100, 100)

nbins = 10
binned = np.empty((arr.shape[0], nbins))

for i in range(arr.shape[0]):
    binned[i,:] = np.histogram(arr[i,:], bins=nbins)[0]

I feel like there should be a more direct and more efficient way to do this within numpy, but I have failed to find one.


Answer 1:


You could use np.apply_along_axis:

>>> x = np.array([range(20), range(1, 21), range(2, 22)])
>>> nbins = 2
>>> np.apply_along_axis(lambda a: np.histogram(a, bins=nbins)[0], 1, x)
array([[10, 10],
       [10, 10],
       [10, 10]])

The main advantage (if any) is that it's slightly shorter; I wouldn't expect much of a performance gain, because np.apply_along_axis still calls the function once per row from Python. It is possibly marginally more efficient in assembling the per-row results.
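If the per-row Python call turns out to be the bottleneck, one possible alternative is to compute all bin indices at once and count them with a single np.bincount call. The following is only a sketch, not part of the original answer: the helper name hist_rows is hypothetical, it assumes every row has at least two distinct values (otherwise the division by hi - lo fails), and it can differ from np.histogram for values that sit within floating-point rounding of a bin edge.

```python
import numpy as np

def hist_rows(arr, nbins):
    """Per-row counts using nbins equal-width bins between each row's min and max."""
    arr = np.asarray(arr, dtype=float)
    lo = arr.min(axis=1, keepdims=True)
    hi = arr.max(axis=1, keepdims=True)
    # Map every value to a bin index in [0, nbins]; assumes hi > lo in every row.
    idx = ((arr - lo) / (hi - lo) * nbins).astype(np.intp)
    # Each row's maximum lands on index nbins; fold it into the last bin.
    np.clip(idx, 0, nbins - 1, out=idx)
    # Shift each row into its own block of nbins slots so one bincount covers all rows.
    idx += nbins * np.arange(arr.shape[0])[:, None]
    counts = np.bincount(idx.ravel(), minlength=arr.shape[0] * nbins)
    return counts.reshape(arr.shape[0], nbins)

x = np.array([range(20), range(1, 21), range(2, 22)])
counts = hist_rows(x, 2)
```

Like the loop in the question, this picks the bin edges separately for each row, so the counts in different rows refer to different value ranges.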




Answer 2:


I was a bit confused by the lambda in Ami's solution, so I expanded it into a named function to show what it's doing (x and nbins are defined as in that answer):

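def hist_1d(a):
    # Return only the counts; np.histogram also returns the bin edges.
    return np.histogram(a, bins=nbins)[0]

# Apply hist_1d to each row (1-D slice along axis 1) of x.
counts = np.apply_along_axis(hist_1d, axis=1, arr=x)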



Answer 3:


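You could use numpy.histogramdd, which computes multidimensional histograms: pair each value with its row index and bin the resulting (row, value) samples. Note that this uses the same bin edges for every row, whereas calling np.histogram in a loop chooses the edges separately for each row.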
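A minimal sketch of that idea, using np.histogram2d (the two-dimensional wrapper around np.histogramdd). The variable names rows, row_edges and value_edges are illustrative and not from the original answer, and the half-integer row edges are just one way to give every row its own bin.

```python
import numpy as np

arr = np.random.default_rng(0).standard_normal((100, 100))
nbins = 10
n_rows = arr.shape[0]

# Row index of every element, in the same order as arr.ravel().
rows = np.repeat(np.arange(n_rows), arr.shape[1])

# First axis: one bin per row, with edges at -0.5, 0.5, ..., n_rows - 0.5.
row_edges = np.arange(n_rows + 1) - 0.5

# Second axis: nbins equal-width bins spanning the min and max of the whole array.
binned, _, value_edges = np.histogram2d(rows, arr.ravel(), bins=[row_edges, nbins])
```

Here binned has shape (n_rows, nbins) and value_edges holds the shared bin edges. Because the edges are common to all rows, the counts are directly comparable across rows, but they will generally not match the output of the loop in the question.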



Source: https://stackoverflow.com/questions/40018125/binning-of-data-along-one-axis-in-numpy
