More bizarre results using: groupby and nlargest() in pandas

眼角桃花 2021-01-14 09:37

This question is an extension of the following post: select largest N of a column of each groupby group using pandas

Let's use the same df and the proposed workaround

1 answer
  • 2021-01-14 09:55

    Try this:

    In [76]: df.groupby(cols2)['p234_r_c'].nlargest(1).reset_index(level=3, drop=True).reset_index()
    Out[76]:
         city1 plant1_type plant2_type  p234_r_c
    0   Austin        COAL        NUKE       3.0
    1  Chicago        COAL    COMBCYCL       0.5
    2  Chicago    COMBCYCL        COAL       5.0
    3  Chicago        NUKE    COMBCYCL       2.0
    4  Houston    COMBCYCL        NUKE       4.0
    5    Miami        NUKE        COAL       1.0
    

    Frankly speaking, I don't understand the following behavior:

    In [77]: df.set_index(cols2).groupby(level=cols2)['p234_r_c'].nlargest(1)
    Out[77]:
    city1    plant1_type  plant2_type  city1    plant1_type  plant2_type
    Austin   COAL         NUKE         Austin   COAL         NUKE           3.0
    Chicago  COAL         COMBCYCL     Chicago  COAL         COMBCYCL       0.5
             COMBCYCL     COAL         Chicago  COMBCYCL     COAL           5.0
             NUKE         COMBCYCL     Chicago  NUKE         COMBCYCL       2.0
    Houston  COMBCYCL     NUKE         Houston  COMBCYCL     NUKE           4.0
    Miami    NUKE         COAL         Miami    NUKE         COAL           1.0
    Name: p234_r_c, dtype: float64
    

    where:

    In [78]: cols2
    Out[78]: ['city1', 'plant1_type', 'plant2_type']
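
One plausible reading of the two outputs above: `SeriesGroupBy.nlargest` prepends the group keys to whatever index the rows already had. With a default integer index that gives the three key levels plus the original row label (hence `level=3` in the `reset_index` call); after `set_index(cols2)` the keys are already in the index, so they show up twice. The sketch below reproduces this on a small made-up frame. The data is hypothetical, since the original `df` is not shown in this thread, and the names `top_plain`, `top_indexed`, `no_dup`, and `tidy` are illustrative only.

```python
import pandas as pd

cols2 = ["city1", "plant1_type", "plant2_type"]

# Hypothetical data (assumption): the thread's real df is not shown.
df = pd.DataFrame(
    {
        "city1": ["Austin", "Austin", "Chicago", "Chicago", "Houston"],
        "plant1_type": ["COAL", "COAL", "NUKE", "NUKE", "COMBCYCL"],
        "plant2_type": ["NUKE", "NUKE", "COMBCYCL", "COMBCYCL", "NUKE"],
        "p234_r_c": [3.0, 1.0, 2.0, 0.5, 4.0],
    }
)

# Default integer index: result index = 3 group keys + original row label.
top_plain = df.groupby(cols2)["p234_r_c"].nlargest(1)
print(top_plain.index.nlevels)

# Keys already in the index: they are prepended again, giving 6 levels.
top_indexed = df.set_index(cols2).groupby(level=cols2)["p234_r_c"].nlargest(1)
print(top_indexed.index.nlevels)

# group_keys=False asks groupby not to prepend the keys a second time.
no_dup = (
    df.set_index(cols2)
    .groupby(level=cols2, group_keys=False)["p234_r_c"]
    .nlargest(1)
)
print(no_dup.index.nlevels)

# The approach from the answer: drop the original row label, then flatten.
tidy = top_plain.reset_index(level=3, drop=True).reset_index()
print(tidy)
```

For `n=1` specifically, `df.groupby(cols2, as_index=False)['p234_r_c'].max()` should give the same table without any index juggling.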
    