Question
I'm looking for a way to plot a distribution histogram, with the y-axis
representing the total number of items for each bin (and not just the count).
Example on the charts below:
- On the left, there are 55 agencies who sold between 20-30 houses
- On the right, the agencies having sold between 20-30 houses represent 1100 houses sold
It's not as trivial as it looks, because one can't simply multiply each bin's count by the bin's value (maybe in the 20-30 bin there are 54 agencies who sold 21 houses and 1 who sold 29).
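To make the arithmetic concrete, here is a minimal sketch of that hypothetical 20-30 bin. The figures (54 agencies at 21 houses, 1 agency at 29) come from the parenthetical above; using the bin midpoint of 25 as "the bin's value" is an assumption made only for illustration.

```python
# Hypothetical 20-30 bin: 54 agencies sold 21 houses each, 1 agency sold 29.
sales_in_bin = [21] * 54 + [29]

count = len(sales_in_bin)        # bar height in a plain count histogram
true_total = sum(sales_in_bin)   # bar height wanted in the "total" chart
midpoint_estimate = count * 25   # count times the bin midpoint (assumed shortcut)

print(count, true_total, midpoint_estimate)
```

The shortcut overshoots the real total here, which is why the per-bin sum has to be computed from the individual values.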
Questions:
- What is the name of such a chart (the one on the right)?
- Is there a way to plot it natively in matplotlib or seaborn?
Answer 1:
You want to use the weights kwarg (see the numpy docs), which is passed through by ax.hist.
Something like:
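import matplotlib.pyplot as plt
# num_sold: houses sold per agency; bins: the bin edges (both defined elsewhere)
fig, ax = plt.subplots()
ax.hist(num_sold, bins, weights=num_sold)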
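As a rough check of what the weights do, the sketch below uses numpy.histogram directly on made-up data. The random data, the seed, and the bin edges are assumptions for illustration and are not from the question. Each agency contributes its own sales figure as its weight, so every bin ends up holding the total number of houses sold by the agencies in it.

```python
import numpy as np

# Made-up data: houses sold per agency (an assumption for illustration).
rng = np.random.default_rng(0)
num_sold = rng.integers(0, 100, size=1000)
bins = np.arange(0, 101, 10)  # bin edges 0, 10, ..., 100

# Plain histogram: number of agencies per bin.
counts, _ = np.histogram(num_sold, bins=bins)

# Weighted histogram: each agency is weighted by its own sales figure,
# so each bin holds the total houses sold by the agencies falling in it.
totals, _ = np.histogram(num_sold, bins=bins, weights=num_sold)

# Cross-check against an explicit per-bin sum.
bin_index = num_sold // 10
manual = np.array([num_sold[bin_index == i].sum() for i in range(10)])
```

Passing the same num_sold, bins and weights to ax.hist should draw bars with the heights stored in totals.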
Answer 2:
Edit: @tacaswell's answer is better; use it. But with mine, the labels will line up correctly without hassle and the bars will be separated.
Hopefully your data is in pandas. I will create some fake data and then give you a solution.
import numpy as np
import pandas as pd
# create a dataframe of number of homes sold
df = pd.DataFrame(data={'sold':np.random.randint(0,100, 1000)})
# groupby the left side of interval [0, 10), [10, 20) etc.. and plot
df.groupby(df.sold // 10 * 10).sum().plot.bar()
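If the bins are not all the same width, the floor-division trick no longer applies. One possible variation, sketched here under the assumption that explicit bin edges are available, is to label each row with pd.cut and sum within each interval. The edges, the seed, and the names intervals and totals are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Made-up data, as in the answer above (seeded here for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame({"sold": rng.integers(0, 100, size=1000)})

# Explicit bin edges; right=False gives intervals [0, 10), [10, 20), ...
edges = np.arange(0, 101, 10)
intervals = pd.cut(df["sold"], bins=edges, right=False)

# Total houses sold per interval.
totals = df["sold"].groupby(intervals, observed=False).sum()
```

Calling totals.plot.bar() would then draw one separated bar per interval, labelled with the interval itself.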
Source: https://stackoverflow.com/questions/41252078/weighted-bins-in-a-distribution-hist-plot