pandas: groupby and variable weights

旧巷少年郎 2021-01-14 06:16

I have a dataset with weights for each observation and I want to prepare weighted summaries using groupby but am rusty as to how to best do this. I think it imp

1 answer
  • 2021-01-14 06:38

    Simply multiply the two columns:

    In [11]: df_city['weighted_jobs'] = df_city['weight'] * df_city['jobs']
    

    Now you can group by city and take the sum:

    In [12]: df_city_sums = df_city.groupby('city').sum()
    
    In [13]: df_city_sums
    Out[13]: 
               jobs  weight  weighted_jobs
    city                                  
    oakland     362     690           7958
    san mateo   367    1017           9026
    sf          253     638           6209
    
    [3 rows x 3 columns]
    

    Now you can divide the weighted_jobs sum by the jobs sum to get the desired result:

    In [14]: df_city_sums['weighted_jobs'] / df_city_sums['jobs']
    Out[14]: 
    city
    oakland      21.983425
    san mateo    24.594005
    sf           24.541502
    dtype: float64
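
For anyone who wants to run the three steps end to end, here is a minimal sketch. The original df_city is not shown in the thread, so the rows below are made up purely for illustration. The last line is a variant, not part of the answer above: if what you need is a conventional weighted mean of jobs, the denominator would be the summed weights rather than the summed jobs.

```python
import pandas as pd

# Hypothetical data: the real df_city is not shown in the thread.
df_city = pd.DataFrame({
    "city": ["oakland", "oakland", "sf", "sf"],
    "weight": [2, 3, 1, 4],
    "jobs": [10, 20, 5, 15],
})

# Step 1: per-row product of weight and jobs.
df_city["weighted_jobs"] = df_city["weight"] * df_city["jobs"]

# Step 2: sum the numeric columns within each city.
# Selecting the columns explicitly keeps any text columns out of the sum.
df_city_sums = df_city.groupby("city")[["jobs", "weight", "weighted_jobs"]].sum()

# Step 3 (as in the answer): weighted sum divided by the plain jobs sum.
ratio_to_jobs = df_city_sums["weighted_jobs"] / df_city_sums["jobs"]

# Variant (assumption, not from the answer): a conventional weighted mean
# of jobs divides by the summed weights instead.
weighted_mean_jobs = df_city_sums["weighted_jobs"] / df_city_sums["weight"]

print(df_city_sums)
print(ratio_to_jobs)
print(weighted_mean_jobs)
```

Which denominator is right depends on what the weights represent in your data, so check that against the summary you actually want.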
    