What is the Pythonic/pandas way of sorting 'levels' within a column in pandas to give a specific ordering of bars in a bar plot?
For example, given:
You'll have to provide a mapping to specify how to order the day names. (If they were stored as proper dates, there would be other ways to do this.)
Updated:
Build the key. You could write out a dictionary explicitly or use something clever like this dict comprehension.
weekdays = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']
mapping = {day: i for i, day in enumerate(weekdays)}
key = df['day'].map(mapping)
And the sorting is simple:
df.iloc[key.argsort()]
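On newer pandas (1.1 or later, which is an assumption about your setup), the same mapping can be passed straight to sort_values through its key argument, so the separate argsort step is not needed. A minimal sketch with made-up data; the DataFrame below is illustrative, not the OP's:

```python
import pandas as pd

# Illustrative data only; not the OP's DataFrame.
df = pd.DataFrame({
    'day': ['Fri', 'Mon', 'Sun', 'Weds', 'Tues', 'Sat', 'Thurs'],
    'amount': [4, 1, 1, 2, 2, 1, 2],
})

weekdays = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']
mapping = {day: i for i, day in enumerate(weekdays)}

# `key` receives the whole 'day' column as a Series and must return
# a Series of the same length; rows are sorted by the returned values.
df_sorted = df.sort_values('day', key=lambda s: s.map(mapping))
```

This is the same idea as the mapping plus argsort above, just written as a single call.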
Below I provide code that extends Dan's answer to address the "FURTHER GENERALIZATION" section of the OP's question. First, a complete example for the simple case (just one variable), based on Dan's solution:
import pandas as pd
# Create dataframe
df=pd.DataFrame({
'group':['a','a','a','a','a','a','a','b','b','b','b','b','b','b'],
'day':['Mon','Tues','Fri','Thurs','Sat','Sun','Weds','Fri','Sun','Thurs','Sat','Weds','Mon','Tues'],
'amount':[1,2,4,2,1,1,2,4,5,3,4,2,1,3]
})
# Calculate the total amount for each day
# (select 'amount' before summing so the string column 'group' is not aggregated)
df_grouped = df.groupby('day')['amount'].sum().reset_index()
# Use Dan's trick to order days names in the table created by groupby
weekdays = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']
mapping = {day: i for i, day in enumerate(weekdays)}
key = df_grouped['day'].map(mapping)
df_grouped = df_grouped.iloc[key.argsort()]
# Draw the bar chart
df_grouped.plot(kind='bar', x='day')
And now, we use the same ordering technique to order the rows of the pivot table (instead of the rows created by groupby).
import pandas as pd
# Create dataframe
df=pd.DataFrame({
'group':['a','a','a','a','a','a','a','b','b','b','b','b','b','b'],
'day':['Mon','Tues','Fri','Thurs','Sat','Sun','Weds','Fri','Sun','Thurs','Sat','Weds','Mon','Tues'],
'amount':[1,2,4,2,1,1,2,4,5,3,4,2,1,3]
})
# Get the amount for each day AND EACH GROUP
df_grouped = df.groupby(['group', 'day'])['amount'].sum().reset_index()
# Create a pivot table with one row per day and one column per group, the format pandas needs to plot multiple series
# (pivot requires keyword arguments in pandas 2.0 and later)
df_pivot = df_grouped.pivot(index='day', columns='group', values='amount').reset_index()
# Use Dan's trick to order days names in the table created by PIVOT (not the table created by groupby, in the previous example)
weekdays = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']
mapping = {day: i for i, day in enumerate(weekdays)}
key = df_pivot['day'].map(mapping)
df_pivot = df_pivot.iloc[key.argsort()]
# Draw the bar chart
df_pivot.plot(kind='bar', x='day')
The resulting chart shows the bars for each group side by side, with the days in weekday order.
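Another option, not used in the answers here, is to store 'day' as an ordered categorical so that pandas itself knows the weekday order. This is a sketch under that assumption, reusing the sample data from the examples above; groupby and sort_index then follow the category order without a separate mapping:

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b'],
    'day': ['Mon', 'Tues', 'Fri', 'Thurs', 'Sat', 'Sun', 'Weds',
            'Fri', 'Sun', 'Thurs', 'Sat', 'Weds', 'Mon', 'Tues'],
    'amount': [1, 2, 4, 2, 1, 1, 2, 4, 5, 3, 4, 2, 1, 3],
})

weekdays = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']

# Declare the order once: 'day' becomes an ordered categorical column.
df['day'] = pd.Categorical(df['day'], categories=weekdays, ordered=True)

# Simple case: groupby sorts its keys, and for a categorical key
# that means category order (Mon..Sun), not alphabetical order.
totals = df.groupby('day', observed=True)['amount'].sum()

# Pivot case: sort_index on the categorical index also uses category order.
pivot = df.pivot(index='day', columns='group', values='amount').sort_index()
```

Calling .plot(kind='bar') on totals or pivot should then draw the bars in weekday order.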
I know this response is late, but a simpler solution to the two cases presented, without a dictionary or mapping, would be something like what I've posted below.
Setting 'day' as the index enables you to use .loc to select data in a specific order.
1) For the two separate plots:
df=pd.DataFrame({'group':['a','a','a','a','a','a','a','b','b','b','b','b','b','b'],
'day':['Mon','Tues','Fri','Thurs','Sat','Sun','Weds','Fri','Sun','Thurs','Sat','Weds','Mon','Tues'],
'amount':[1,2,4,2,1,1,2,4,5,3,4,2,1,3]})
order = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']
df.set_index('day').loc[order].groupby('group').plot(kind='bar')
2) For the pivot example with the dodged plot:
order = ['Mon', 'Tues', 'Weds','Thurs','Fri','Sat','Sun']
df.pivot(index='day', columns='group', values='amount').loc[order].plot(kind='bar')
Note that pivot already puts day in the index, so you can use .loc here again.
Edit: it is best practice to use .loc instead of .ix in these solutions. .ix was deprecated in pandas 0.20 and removed in pandas 1.0, and it could give surprising results when column names and index labels were numbers.
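One caveat with .loc[order]: in recent pandas (1.0 and later) it raises a KeyError if any label in order is missing from the index, for example when the data has no rows for some day. A hedged sketch of two ways around that with reindex, using made-up data that covers only three days and assuming the index labels are unique:

```python
import pandas as pd

# Illustrative data with some weekdays missing; not the OP's DataFrame.
df = pd.DataFrame({'day': ['Fri', 'Mon', 'Weds'], 'amount': [4, 1, 2]})
order = ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun']

by_day = df.set_index('day')

# Option 1: keep all seven days; days without data get NaN rows.
all_days = by_day.reindex(order)

# Option 2: keep only the days that are present, in weekday order.
present_only = by_day.reindex([d for d in order if d in by_day.index])
```

Which option fits depends on whether you want empty slots for the missing days in the bar plot.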