Pandas: how to sort dataframe by column AND by index

不思量自难忘°

Given the DataFrame:

import pandas as pd
df = pd.DataFrame([6, 4, 2, 4, 5], index=[2, 6, 3, 4, 5], columns=['A'])

Results in:

   A
2  6
6  4
3  2
4  4
5  5


3 Answers

失恋的感觉

2021-01-05 18:37

Using lexsort from NumPy is another option, and it is a little faster as well:

import numpy as np
df.iloc[np.lexsort((df.index, df.A.values))]  # Sort by A.values, then by index

Result:

   A
3  2
4  4
6  4
5  5
2  6

Comparing with timeit:

%%timeit
df.iloc[np.lexsort((df.index, df.A.values))] # Sort by A.values, then by index

Result:

1000 loops, best of 3: 278 µs per loop

Resetting the index, sorting on both columns, and setting the index again:

%%timeit
df.reset_index().sort_values(by=['A','index']).set_index('index')

Result:

100 loops, best of 3: 2.09 ms per loop
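
For anyone who wants to check that the two approaches really agree, here is a minimal self-contained sketch. The variable names (`order`, `by_lexsort`, `by_roundtrip`) are only for illustration. Note that `np.lexsort` uses the last key in the tuple as the primary sort key.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([6, 4, 2, 4, 5], index=[2, 6, 3, 4, 5], columns=["A"])

# np.lexsort treats the LAST key as the primary one, so rows are ordered
# by column A first and ties are broken by the index.
order = np.lexsort((df.index.to_numpy(), df["A"].to_numpy()))
by_lexsort = df.iloc[order]

# Same ordering via the reset_index / set_index round trip.
# reset_index() on an unnamed index creates a column called "index".
by_roundtrip = df.reset_index().sort_values(by=["A", "index"]).set_index("index")

print(by_lexsort.index.tolist())    # [3, 4, 6, 5, 2]
print(by_roundtrip.index.tolist())  # [3, 4, 6, 5, 2]
```

One difference to be aware of: the round trip leaves the index named "index", while the lexsort version keeps the original unnamed index.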


生来不讨喜

2021-01-05 18:41
The other answers are great. I'll throw in one other option, which is to provide a name for the index first using rename_axis and then reference it in sort_values. I have not tested the performance but expect the accepted answer to still be faster.

df.rename_axis('idx').sort_values(by=['A', 'idx'])
```
     A
idx   
3    2
4    4
6    4
5    5
2    6
```
You can clear the index name afterward if you want with df.index.name = None.
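
Put together as one chain, this could look like the sketch below. It assumes a pandas version in which `sort_values` accepts index level names in `by` (0.23 or later, as far as I know); the label "idx" is arbitrary, and `rename_axis(None)` is used here in place of assigning `df.index.name = None` so the original frame is not modified.

```python
import pandas as pd

df = pd.DataFrame([6, 4, 2, 4, 5], index=[2, 6, 3, 4, 5], columns=["A"])

# Name the index so sort_values can refer to it like a column, sort by
# A and then by that name, and finally drop the temporary name again.
sorted_df = (
    df.rename_axis("idx")
      .sort_values(by=["A", "idx"])
      .rename_axis(None)
)

print(sorted_df.index.tolist())  # [3, 4, 6, 5, 2]
print(sorted_df.index.name)      # None
```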
不知归路

2021-01-05 18:59
You can sort by index and then by column A using kind='mergesort'.

This works because mergesort is stable.
```
res = df.sort_index().sort_values('A', kind='mergesort')
```
Result:
```
   A
3  2
4  4
6  4
5  5
2  6
```
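
To make the role of stability visible, the two steps can be split apart as in this small sketch (the intermediate names `by_index` and `tied` are just for illustration). The secondary key is sorted first; the stable sort on the primary key then keeps tied rows in that order.

```python
import pandas as pd

df = pd.DataFrame([6, 4, 2, 4, 5], index=[2, 6, 3, 4, 5], columns=["A"])

# Step 1: order rows by the secondary key (the index).
by_index = df.sort_index()

# Step 2: stable sort on the primary key. Rows with equal A keep their
# relative order from step 1, so ties end up ordered by index.
res = by_index.sort_values("A", kind="mergesort")

# The two rows with A == 4 are the only tie in this frame.
tied = res[res["A"] == 4]
print(tied.index.tolist())  # [4, 6]
```

With a sort that is not guaranteed to be stable (the default `kind='quicksort'`), the order of the tied rows is not something to rely on.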