pandas: groupby and variable weights

I have a dataset with weights for each observation and I want to prepare weighted summaries using groupby but am rusty as to how to best do this. I think it imp

1 answer

谎友^

2021-01-14 06:38

Simply multiply the two columns:

In [11]: df_city['weighted_jobs'] = df_city['weight'] * df_city['jobs']

Now you can group by city and take the sum:

In [12]: df_city_sums = df_city.groupby('city').sum()

In [13]: df_city_sums
Out[13]: 
           jobs  weight  weighted_jobs
city                                  
oakland     362     690           7958
san mateo   367    1017           9026
sf          253     638           6209

[3 rows x 3 columns]
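
For anyone who wants to reproduce the two steps so far, here is a minimal self-contained sketch. The data below is made up for illustration (the original df_city is not shown in the question), and selecting the three columns explicitly before summing is my own choice, so that any non-numeric columns in a real frame are left out of the sum.

```python
import pandas as pd

# Hypothetical data: one row per observation, each with a weight.
# These values are invented; they are not the asker's dataset.
df_city = pd.DataFrame({
    "city":   ["oakland", "oakland", "sf", "sf", "san mateo"],
    "jobs":   [10, 20, 5, 15, 30],
    "weight": [2, 3, 4, 1, 2],
})

# Step 1: per-row product of weight and value.
df_city["weighted_jobs"] = df_city["weight"] * df_city["jobs"]

# Step 2: sum the columns of interest within each city.
cols = ["jobs", "weight", "weighted_jobs"]
df_city_sums = df_city.groupby("city")[cols].sum()

print(df_city_sums)
```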

Now divide the two sums to get the desired result:

In [14]: df_city_sums['weighted_jobs'] / df_city_sums['jobs']
Out[14]: 
city
oakland      21.983425
san mateo    24.594005
sf           24.541502
dtype: float64
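
Note that the division above uses the summed jobs column as the denominator, which is what produces the numbers shown. If what you need is the conventional weighted mean, sum(weight * value) / sum(weight), the denominator would be the summed weight column instead. A rough sketch of that variant follows, again on made-up data; the helper name weighted_mean is hypothetical, and the groupby/apply route with numpy.average is only included as a cross-check.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "city":   ["oakland", "oakland", "sf", "sf", "san mateo"],
    "jobs":   [10, 20, 5, 15, 30],
    "weight": [2, 3, 4, 1, 2],
})

# Route 1: multiply, sum per group, then divide by the summed weights.
df["weighted_jobs"] = df["weight"] * df["jobs"]
sums = df.groupby("city")[["weight", "weighted_jobs"]].sum()
wm_from_sums = sums["weighted_jobs"] / sums["weight"]

# Route 2: let numpy.average compute the weighted mean for each group.
def weighted_mean(group: pd.DataFrame) -> float:
    # 'group' holds only the selected columns for one city.
    return np.average(group["jobs"], weights=group["weight"])

wm_from_apply = df.groupby("city")[["jobs", "weight"]].apply(weighted_mean)

print(wm_from_sums)
print(wm_from_apply)
```

The first route stays vectorised and tends to scale better on large frames; the second is easier to read when there are only a few groups.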
