I have the following data frame:
df = pandas.DataFrame([{\'c1\':3,\'c2\':10},{\'c1\':2, \'c2\':30},{\'c1\':1,\'c2\':20},{\'c1\':2,\'c2\':15},{\'c1\':2,\'c2\'
If you are writing this code as a script file then you will have to write it like this:
df = df.sort(['c1','c2'], ascending=[False,True])
DataFrame.sort
is deprecated; use DataFrame.sort_values.
>>> df.sort_values(['c1','c2'], ascending=[False,True])
c1 c2
0 3 10
3 2 15
1 2 30
4 2 100
2 1 20
>>> df.sort(['c1','c2'], ascending=[False,True])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/ampawake/anaconda/envs/pseudo/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
Use of sort
can result in warning message. See github discussion.
So you might wanna use sort_values
, docs here
Then your code can look like this:
df = df.sort_values(by=['c1','c2'], ascending=[False,True])
In my case, the accepted answer didn't work:
f.sort_values(by=["c1","c2"], ascending=[False, True])
Only the following worked as expected:
f = f.sort_values(by=["c1","c2"], ascending=[False, True])
I have found this to be really useful:
df = pd.DataFrame({'A' : range(0,10) * 2, 'B' : np.random.randint(20,30,20)})
# A ascending, B descending
df.sort(**skw(columns=['A','-B']))
# A descending, B ascending
df.sort(**skw(columns=['-A','+B']))
Note that unlike the standard columns=,ascending=
arguments, here column names and their sort order are in the same place. As a result your code gets a lot easier to read and maintain.
Note the actual call to .sort
is unchanged, skw
(sortkwargs) is just a small helper function that parses the columns and returns the usual columns=
and ascending=
parameters for you. Pass it any other sort kwargs as you usually would. Copy/paste the following code into e.g. your local utils.py
then forget about it and just use it as above.
# utils.py (or anywhere else convenient to import)
def skw(columns=None, **kwargs):
""" get sort kwargs by parsing sort order given in column name """
# set default order as ascending (+)
sort_cols = ['+' + col if col[0] != '-' else col for col in columns]
# get sort kwargs
columns, ascending = zip(*[(col.replace('+', '').replace('-', ''),
False if col[0] == '-' else True)
for col in sort_cols])
kwargs.update(dict(columns=list(columns), ascending=ascending))
return kwargs
The dataframe.sort() method is - so my understanding - deprecated in pandas > 0.18. In order to solve your problem you should use dataframe.sort_values() instead:
f.sort_values(by=["c1","c2"], ascending=[False, True])
The output looks like this:
c1 c2
3 10
2 15
2 30
2 100
1 20