Question
I need some help binning my data values. I need a histogram-like function, but instead of counting occurrences I want the sum of the values in each bin.
In my example below I have a list with the number of Twitter followers for 30 days. Let's say I want 10 bins; then each bin would cover 30 / 10 = 3 days. For the first three days, the value for bin 1 would be 1391 + 142 + 0 = 1533, for bin 2 it would be 12618, and so on up to bin 10.
The number of bins and the duration may eventually vary. It also needs to work for a duration of 31 days and 5 bins, for instance.
Does anyone know how to do this efficiently? Is there a Python function that can do this? Otherwise, a for loop that sums every n values in the list until the end of the duration would also work.
All help would be highly appreciated :) Thanks!
followersList = [1391, 142, 0, 0, 12618, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 456, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
duration = 30
bins = 10
binWidth = round(duration / bins)
#
# for loop or python function that sums values for each bin
#
Answer 1:
You can do it like this:
bin_width = int(round(duration / bins))
# on Python 2, use xrange instead of range
followers = [sum(followersList[i:i+bin_width]) for i in range(0, duration, bin_width)]
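The slice-and-sum idea above can be wrapped in a small function that also copes with the 31 days / 5 bins case from the question. This is only a sketch, not part of the original answer: the name `bin_sums` is hypothetical, and it assumes the bin width should be rounded up, so the last bin may cover fewer days than the others.

```python
def bin_sums(values, bins):
    """Sum consecutive chunks of `values` into at most `bins` bins.

    Assumption: the bin width is rounded up (ceiling division), so the
    final bin may hold fewer values than the others.
    """
    if bins <= 0:
        raise ValueError("bins must be positive")
    # Ceiling division without floats: 31 values and 5 bins give width 7
    width = max(1, -(-len(values) // bins))
    return [sum(values[i:i + width]) for i in range(0, len(values), width)]


# 30 days of data, same values as the 30-element list in this thread
followers_30 = [1391, 142, 0, 0, 12618] + [0] * 12 + [456] + [0] * 12
# 31 days: one extra day with a single new follower
followers_31 = followers_30 + [1]

print(bin_sums(followers_30, 10))  # [1533, 12618, 0, 0, 0, 456, 0, 0, 0, 0]
print(bin_sums(followers_31, 5))   # [14151, 0, 456, 0, 1]
```

Iterating over `len(values)` rather than a separate `duration` variable means no trailing values are dropped if the two ever disagree.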
Answer 2:
Another way to do it is with reshape and sum. I know you already have a valid answer, but it is worth getting some practice with numpy array operations.
import numpy
# this works when the list divides exactly into bins
followersList = [1391, 142, 0, 0, 12618, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 456, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
duration = len(followersList)
bins = 10
binWidth = round(duration / bins)
print(numpy.array(followersList).reshape(bins, binWidth).sum(axis=1))
# otherwise we have to pad with zeros until the length is a multiple of the bin width
followersList = [1391, 142, 0, 0, 12618, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 456, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
binWidth = 3
bins = (len(followersList) - 1) // binWidth + 1 # ceiling division
print(
    numpy.pad(followersList, (0, bins * binWidth - len(followersList)), 'constant')
    .reshape(bins, binWidth)
    .sum(axis=1)
)
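If padding feels awkward, `numpy.add.reduceat` might be worth a look: it sums the slices between consecutive start indices, and the last slice simply runs to the end of the array. The following is a sketch under that assumption rather than something from the original answer, and the helper name `bin_sums_np` is made up for illustration.

```python
import numpy as np


def bin_sums_np(values, bin_width):
    """Sum consecutive chunks of `bin_width` values without padding.

    The last chunk may be shorter when len(values) is not a multiple
    of bin_width.
    """
    arr = np.asarray(values)
    # Start index of every bin: 0, bin_width, 2 * bin_width, ...
    starts = np.arange(0, len(arr), bin_width)
    # reduceat sums arr[starts[i]:starts[i + 1]]; the final bin runs to the end
    return np.add.reduceat(arr, starts)


# Same 31-element list as in the padded example
followers_31 = [1391, 142, 0, 0, 12618] + [0] * 12 + [456] + [0] * 12 + [1]
sums_31 = bin_sums_np(followers_31, 3)
print(sums_31)  # 11 bins; the last one contains only the final value
```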
Source: https://stackoverflow.com/questions/33651218/how-to-bin-the-sum-of-list-values-in-python