You can use an ordered CategoricalIndex with sort_index:
cats = ['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec']
df.index = pd.CategoricalIndex(df.index, categories=cats, ordered=True)
df = df.sort_index()
print(df)
date
Jan 2
Feb 1
Apr 1
May 1
Jun 1
Jul 1
Aug 2
Sep 2
Oct 14
Nov 36
Dec 47
Or use DataFrame.reindex, but note that any month missing from the index is added as a row of NaNs:
df = df.reindex(cats)
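To make the NaN behaviour of reindex concrete, here is a minimal sketch. The sample DataFrame and the names with_gaps, filled and present_only are made up for illustration and are not from the question.

```python
import pandas as pd

cats = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Hypothetical sample data: only three months present, out of order.
df = pd.DataFrame({'date': [1, 2, 1]}, index=['Feb', 'Jan', 'Apr'])

# Plain reindex: all 12 months appear, missing ones hold NaN.
with_gaps = df.reindex(cats)

# fill_value replaces the NaNs with a value of your choice.
filled = df.reindex(cats, fill_value=0)

# To keep only the months that exist, filter cats first.
present_only = df.reindex([m for m in cats if m in df.index])

print(with_gaps)
print(filled)
print(present_only)
```

Which variant fits depends on whether empty months should show up in the result (for example, as zero-height bars in a plot) or be left out.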
Adding to the very helpful answer by @jezrael:
In pandas 0.25.1, sorted has been replaced by ordered, per the pandas.CategoricalIndex documentation.
Old way:
df.index = pd.CategoricalIndex(df.index,
categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'],
sorted=True)
df = df.sort_index()
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-468-3f0ab66734d4> in <module>
2 net.index = pd.CategoricalIndex(net.index,
3 categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'],
----> 4 sorted=True)
5 net = net.sort_index()
6 net
TypeError: __new__() got an unexpected keyword argument 'sorted'
New way:
df.index = pd.CategoricalIndex(df.index,
categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'],
ordered=True)
df = df.sort_index()
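A small self-contained sketch of the difference between sorting a plain string index and an ordered CategoricalIndex. The sample values and the names alphabetical and calendar_order are assumptions chosen for illustration.

```python
import pandas as pd

cats = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Hypothetical sample data, deliberately out of calendar order.
df = pd.DataFrame({'date': [47, 2, 14, 1]},
                  index=['Dec', 'Jan', 'Oct', 'Apr'])

# A plain string index sorts alphabetically.
alphabetical = df.sort_index().index.tolist()

# An ordered CategoricalIndex sorts by position in cats.
df.index = pd.CategoricalIndex(df.index, categories=cats, ordered=True)
calendar_order = df.sort_index().index.tolist()

print(alphabetical)
print(calendar_order)
```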
Okay, it was not very complex. I'm sure Categorical would have worked; I just wasn't able to solve the problem with it. I'm sure there are more efficient ways of solving this, so if you have a better way, please post it. What I did was:
import calendar
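# Month number of each US release whose title contains 'Christmas'
months = release_dates[release_dates.title.str.contains('Christmas') & (release_dates.country == 'USA')].date.dt.month
# Count releases per month, then sort by month number (1-12)
counts = months.value_counts().sort_index()
# Replace month numbers with abbreviations ('Jan', 'Feb', ...)
counts.index = counts.index.map(lambda m: calendar.month_abbr[m])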
counts.plot.bar()
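The same idea as a runnable sketch without the release_dates data: sort on the numeric month first, then swap in the abbreviations. The dates Series below is a made-up stand-in for release_dates.date.

```python
import calendar

import pandas as pd

# Hypothetical stand-in for the filtered release_dates.date column.
dates = pd.Series(pd.to_datetime(
    ['2015-12-25', '2016-11-04', '2017-12-01', '2018-01-15']))

# Count per month number and sort numerically, so the order is
# already chronological before any labels are changed.
counts = dates.dt.month.value_counts().sort_index()

# Relabel the sorted index with month abbreviations.
counts.index = counts.index.map(lambda m: calendar.month_abbr[m])

print(counts)
```

Calling counts.plot.bar() on the result should then draw the bars in calendar order, since plotting follows the index order.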