Weighted bins in a distribution hist plot

守給你的承諾、 submitted on 2020-12-31 02:41:16

Question


I'm looking for a way to plot a distribution histogram, with the y-axis representing the total number of items for each bin (and not just the count).

Example on the charts below:

  • On the left, there are 55 agencies who sold between 20-30 houses
  • On the right, the agencies having sold between 20-30 houses represent 1100 houses sold

It's not as trivial as it looks, because one can't simply multiply each bin's count by the bin's value (maybe in the 20-30 bin there are 54 agencies who sold 21 and 1 who sold 29).
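To make the arithmetic concrete, here is a minimal sketch using the hypothetical figures from the sentence above (54 agencies at 21 houses, 1 agency at 29). The variable names are illustrative only; the bin midpoint of 25 is an assumption about what "the bin's value" would be.

```python
# Hypothetical data from the example: 54 agencies sold 21 houses, 1 sold 29.
sales = [21] * 54 + [29]

# Keep only the agencies that fall in the [20, 30) bin.
in_bin = [s for s in sales if 20 <= s < 30]

count = len(in_bin)            # what an ordinary histogram shows
true_total = sum(in_bin)       # what the desired chart should show
midpoint_estimate = count * 25 # naive "count times bin value" shortcut

print(count, true_total, midpoint_estimate)  # 55 1163 1375
```

The shortcut overstates the total by more than 200 houses here, which is why the per-bin sum has to be computed from the underlying values.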

Questions:

  • What is the name of such a chart (the one on the right)?
  • Is there a way to plot it natively in matplotlib or seaborn?

Answer 1:


You want to use the weights kwarg (see the numpy.histogram docs), which ax.hist passes through.

Something like

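import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# weights=num_sold makes each bar the total sold in that bin, not the agency count
ax.hist(num_sold, bins, weights=num_sold)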
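A fuller, self-contained sketch of the same idea follows. The data is made up (random integers standing in for houses sold per agency) and the bin edges are an assumption. Only the weights=num_sold argument comes from the answer itself.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data: houses sold by each of 1000 hypothetical agencies.
rng = np.random.default_rng(0)
num_sold = rng.integers(0, 100, size=1000)

# Assumed bin edges: 0-10, 10-20, ..., 90-100.
bins = np.arange(0, 101, 10)

fig, (ax_count, ax_total) = plt.subplots(1, 2, figsize=(10, 4))

# Left: ordinary histogram, bar height = number of agencies per bin.
counts, _, _ = ax_count.hist(num_sold, bins)
ax_count.set_xlabel("houses sold per agency")
ax_count.set_ylabel("number of agencies")

# Right: weighted histogram, bar height = total houses sold per bin.
totals, _, _ = ax_total.hist(num_sold, bins, weights=num_sold)
ax_total.set_xlabel("houses sold per agency")
ax_total.set_ylabel("total houses sold")
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Because each agency contributes its own sales figure as its weight, the bar heights on the right add up to the total number of houses sold.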



Answer 2:


Edit: @tacaswell's answer is better; use it. But with mine the labels will line up correctly without hassle and the bars will be separated.

Hopefully your data is in pandas. I will create some fake data and then give you a solution.

import numpy as np
import pandas as pd

# create a dataframe of number of homes sold
df = pd.DataFrame(data={'sold': np.random.randint(0, 100, 1000)})

# group by the left edge of each interval ([0, 10), [10, 20), etc.), sum the sales, and plot
df.groupby(df.sold // 10 * 10).sum().plot.bar()
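As a sanity check, the sketch below compares the groupby totals with numpy.histogram called with weights, on the same kind of fake data. The seed and the bin edges are assumptions chosen to match the width-10 intervals used above.

```python
import numpy as np
import pandas as pd

# Fake data, as in the answer, but seeded so the result is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({"sold": rng.integers(0, 100, size=1000)})

# pandas approach: sum of sales grouped by the left edge of each interval.
edges = np.arange(0, 101, 10)
grouped = df.groupby(df.sold // 10 * 10)["sold"].sum()
grouped = grouped.reindex(edges[:-1], fill_value=0)  # keep empty bins as 0

# numpy approach: histogram weighted by the values themselves.
values = df.sold.to_numpy()
weighted, _ = np.histogram(values, bins=edges, weights=values)

same = np.array_equal(grouped.to_numpy(), weighted)
print(same)
```

Both approaches compute the same per-bin sums; the difference is only in how the result is drawn (categorical bars with clean labels versus a continuous histogram axis).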


Source: https://stackoverflow.com/questions/41252078/weighted-bins-in-a-distribution-hist-plot
