I am trying to get the index of the row with the second-highest value in each group after a groupby, but I am not getting the right result.
df = pd.DataFrame({'Sp':['
Since 'Value' is already sorted you can use nth:
In [11]: g = df.groupby("Mt", as_index=False)
In [12]: g.nth(-2)
Out[12]:
Mt Sp Value count
0 s1 a 1 3
3 s2 d 4 10
Otherwise I'd first sort by Value: df = df.sort_values("Value").
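As a minimal sketch of the sort-then-nth approach on unsorted data: the frame below is made up for illustration (it is not the data from the question), and the variable names are my own. Note that the index of the result of nth differs between pandas versions, so the example only inspects the values.

```python
import pandas as pd

# Hypothetical, deliberately unsorted data (not the frame from the question).
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [2, 1, 5, 3, 4, 6],
})

# Sort first so that "second from the end" means "second highest".
df_sorted = df.sort_values("Value")

# nth(-2) picks the second-to-last row of each group;
# groups with fewer than two rows (s3 here) contribute nothing.
second = df_sorted.groupby("Mt", as_index=False).nth(-2)

print(second)
```

Here s1 yields the row with Value 1, s2 the row with Value 4, and s3 is dropped because it has only one row.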
If you also want the last row for groups that have fewer than two rows, you could grab that too:
In [21]: g = df.groupby("Mt")
In [22]: res = g.nth(-1)
In [23]: res.update(g.nth(-2))
In [24]: res
Out[24]:
Sp Value count
Mt
s1 a 1 3
s2 d 4 10
s3 f 6 6
A related function is tail (to get the last two elements):
In [31]: g.tail(2)
Out[31]:
Mt Sp Value count
0 s1 a 1 3
1 s1 b 2 2
3 s2 d 4 10
4 s2 e 5 10
5 s3 f 6 6
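Building on tail, one possible way to get "second highest, or the only row if the group has just one" is to take the last two rows per group and then the first of those. This is an alternative to the nth/update combination above, not the same method, and the data is again hypothetical.

```python
import pandas as pd

# Hypothetical data (not the frame from the question).
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [2, 1, 5, 3, 4, 6],
})

df_sorted = df.sort_values("Value")

# tail(2): the two largest rows of each group (or one row for small groups).
last_two = df_sorted.groupby("Mt").tail(2)

# head(1) on that: the second highest, falling back to the only row.
res = last_two.groupby("Mt").head(1)

print(res)
```

Both head and tail keep the original row labels, so res.index can be used directly to look up the rows in df.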
OK, I got the answer except for one thing. This code seems to work:
df.iloc[df.groupby(['Mt'])['Value'].apply(lambda x: (x!=max(x)).sort_values(ascending=False).head(1).index[0])]
The only thing I don't understand right now is that, even for a group with only one row, that row is returned. I thought the x!=max(x) check would exclude that row.
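A small sketch of what seems to be happening for a one-row group: x!=max(x) does not filter anything, it only produces a boolean Series with the same index as x. Sorting that Series and taking head(1) therefore still returns the single label, even though its value is False. The Series below is made up for illustration.

```python
import pandas as pd

# Hypothetical one-row group: a single value stored at index label 5.
x = pd.Series([6], index=[5])

# The comparison keeps every row; it just marks the maximum as False.
mask = x != x.max()

# Sorting and taking the first label still yields the only row.
picked = mask.sort_values(ascending=False).head(1).index[0]

# Actually filtering with the mask is what would exclude the row.
excluded = x[mask]

print(picked, len(excluded))
```

To drop single-row groups, the mask would have to be used for selection (as in x[mask]) before taking the index.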