Given the DataFrame:
import pandas as pd
df = pd.DataFrame([6, 4, 2, 4, 5], index=[2, 6, 3, 4, 5], columns=[\'A\'])
Results in:
<
Using lexsort
from numpy may be other way and little faster as well:
df.iloc[np.lexsort((df.index, df.A.values))] # Sort by A.values, then by index
Result:
A
3 2
4 4
6 4
5 5
2 6
Comparing with timeit
:
%%timeit
df.iloc[np.lexsort((df.index, df.A.values))] # Sort by A.values, then by index
Result:
1000 loops, best of 3: 278 µs per loop
With reset index and set index again:
%%timeit
df.reset_index().sort_values(by=['A','index']).set_index('index')
Result:
100 loops, best of 3: 2.09 ms per loop
The other answers are great. I'll throw in one other option, which is to provide a name for the index first using rename_axis and then reference it in sort_values
. I have not tested the performance but expect the accepted answer to still be faster.
df.rename_axis('idx').sort_values(by=['A', 'idx'])
A
idx
3 2
4 4
6 4
5 5
2 6
You can clear the index name afterward if you want with df.index.name = None
.
You can sort by index and then by column A using kind='mergesort'
.
This works because mergesort is stable.
res = df.sort_index().sort_values('A', kind='mergesort')
Result:
A
3 2
4 4
6 4
5 5
2 6