Pandas: Find index of the row with second highest value

鱼传尺愫 2021-01-14 01:16

I am trying to get the index of the row with the second-highest value in each group after a groupby, but I am not getting the right result.

df = pd.DataFrame({'Sp':['


        
2 Answers
  • 2021-01-14 01:41

    Since 'Value' is already sorted in ascending order, you can use nth(-2) to take the second-to-last row of each group:

    In [11]: g = df.groupby("Mt", as_index=False)
    
    In [12]: g.nth(-2)
    Out[12]:
       Mt Sp  Value  count
    0  s1  a      1      3
    3  s2  d      4     10
    

    Otherwise, I'd first sort by Value: df = df.sort_values("Value").
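
As a rough sketch of the sort-then-nth approach on unsorted input: the sample frame below is an assumption, reconstructed from the outputs shown in this answer because the question's own data is cut off, and the shuffle is only there to simulate rows arriving out of order.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [1, 2, 3, 4, 5, 6],
    "count": [3, 2, 5, 10, 10, 6],
})

# Shuffle the rows to simulate unsorted input.
shuffled = df.sample(frac=1, random_state=0)

# Sort ascending by Value so the second-to-last row of each group
# is the one with the second-highest Value.
ordered = shuffled.sort_values("Value")
second_highest = ordered.groupby("Mt", as_index=False).nth(-2)

# Original index labels of those rows; groups with a single row
# (s3 here) contribute nothing.
second_idx = second_highest.index.tolist()
```

With as_index=False the original row labels are kept, so second_idx can be passed straight to df.loc.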

    If you want to fall back to the last row when a group has fewer than two rows, you can grab that too:

    In [21]: g = df.groupby("Mt")
    
    In [22]: res = g.nth(-1)
    
    In [23]: res.update(g.nth(-2))
    
    In [24]: res
    Out[24]:
       Sp  Value  count
    Mt
    s1  a      1      3
    s2  d      4     10
    s3  f      6      6
    

    A related function is tail (to get the last two elements):

    In [31]: g.tail(2)
    Out[31]:
       Mt Sp  Value  count
    0  s1  a      1      3
    1  s1  b      2      2
    3  s2  d      4     10
    4  s2  e      5     10
    5  s3  f      6      6
    
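
One way to combine the two ideas, as a sketch: take the last two rows of each group with tail(2), then keep the first of those with head(1). That yields the second-highest row where a group has at least two rows and the only row otherwise. The sample frame is an assumption reconstructed from the outputs above.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [1, 2, 3, 4, 5, 6],
    "count": [3, 2, 5, 10, 10, 6],
})

# Sort ascending so the largest values sit at the end of each group.
ordered = df.sort_values("Value")

# Last two rows per group (or the single row for one-row groups).
last_two = ordered.groupby("Mt").tail(2)

# First of those rows per group: second-highest, or the only row.
second_or_last = last_two.groupby("Mt").head(1)
result_index = second_or_last.index.tolist()
```

Both head and tail keep the original row labels, so result_index refers to rows of df directly.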
  • 2021-01-14 01:54

    OK, I got the answer, except for one thing. This code seems to work:

    df.iloc[df.groupby(['Mt'])['Value'].apply(lambda x: (x!=max(x)).sort_values(ascending=False).head(1).index[0])]
    

    The only thing I don't understand right now is why, even for a group with only one row, that row is still returned. I thought the x!=max(x) check would exclude that row.
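
A likely explanation, with a sketch: x != max(x) builds a boolean mask with one entry per row and does not drop anything. Sorting that mask only reorders it, so a one-row group still has its single (False) entry and head(1) returns it. An alternative that makes the one-row case explicit is nlargest(2), shown below. The sample frame is an assumption reconstructed from the outputs in the other answer, and the sketch assumes values are distinct within each group.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs in the other answer.
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [1, 2, 3, 4, 5, 6],
    "count": [3, 2, 5, 10, 10, 6],
})

# For the one-row group s3, the comparison yields a mask, not a filtered Series.
single = df.loc[df["Mt"] == "s3", "Value"]
mask = single != single.max()  # still one entry, and it is False

# Index label of the second-highest Value per group, skipping one-row groups.
second_highest = {
    key: group["Value"].nlargest(2).index[-1]
    for key, group in df.groupby("Mt")
    if len(group) >= 2
}

# Look the rows up by label, since these are index labels rather than positions.
rows = df.loc[list(second_highest.values())]
```

Dropping the len(group) >= 2 condition would return the only row for one-row groups instead of skipping them.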
