I have a data frame like this:
print(df)
0 1 2
0 354.7 April 4.0
1 55.4 August 8.0
2 176.5 December 12.0
3 9
Use sort_values to sort the df by a specific column's values:
In [18]:
df.sort_values('2')
Out[18]:
0 1 2
4 85.6 January 1.0
3 95.5 February 2.0
7 104.8 March 3.0
0 354.7 April 4.0
8 283.5 May 5.0
6 238.7 June 6.0
5 152.0 July 7.0
1 55.4 August 8.0
11 212.7 September 9.0
10 249.6 October 10.0
9 278.8 November 11.0
2 176.5 December 12.0
If you want to sort by two columns, pass a list of column labels to sort_values
with the column labels ordered according to sort priority. If you use df.sort_values(['2', '0'])
, the result would be sorted by column 2
then column 0
. Granted, this does not really make sense for this example because each value in df['2']
is unique.
I tried the solutions above and I do not achieve results, so I found a different solution that works for me. The ascending=False is to order the dataframe in descending order, by default it is True. I am using python 3.6.6 and pandas 0.23.4 versions.
final_df = df.sort_values(by=['2'], ascending=False)
You can see more details in pandas documentation here.
Here is template of sort_values according to pandas documentation.
DataFrame.sort_values(by, axis=0,
ascending=True,
inplace=False,
kind='quicksort',
na_position='last',
ignore_index=False, key=None)[source]
In this case it will be like this.
df.sort_values(by=['2'])
API Reference pandas.DataFrame.sort_values
Just adding some more operations on data. Suppose we have a dataframe df
, we can do several operations to get desired outputs
ID cost tax label
1 216590 1600 test
2 523213 1800 test
3 250 1500 experiment
(df['label'].value_counts().to_frame().reset_index()).sort_values('label', ascending=False)
will give sorted
output of labels as a dataframe
index label
0 test 2
1 experiment 1
Using column name worked for me.
sorted_df = df.sort_values(by=['Column_name'], ascending=True)
Just as another solution:
Instead of creating the second column, you can categorize your string data(month name) and sort by that like this:
df.rename(columns={1:'month'},inplace=True)
df['month'] = pd.Categorical(df['month'],categories=['December','November','October','September','August','July','June','May','April','March','February','January'],ordered=True)
df = df.sort_values('month',ascending=False)
It will give you the ordered data by month name
as you specified while creating the Categorical
object.