I'd like to add subtotal rows for index #1 (ie. Fruits and Animal) and subtotal columns for columns (ie. 2015 and 2016).
For the subtotal columns, I could do something like this, but it seems inefficient to run this type of code for each year (2015 & 2016). Is there a better way? I don't think 'margins' will work because there are multiple subtotals.
df[('2015','2015_Total')] = df[('2015','1st')]+df[('2015','2nd')]
For the subtotal rows (e.g., fruits total and animal total), I'm not sure where to begin.
It is very complicated, because you need create Multiindex
in columns
and index
.
Create subtotals is easy - use groupby
with sum
. Then create Multiindex and last concat
new columns to original DataFrame
. Last you have to sort_index
(I add Total_
before value for correct sorting):
print df 2015_____ 2016_______ 1st 2nd 1st 2nd Fruits Apple 10 9 11 10 Banana 20 22 21 20 Animal Lion 5 3 2 1 Tiger 2 3 5 0 df1 = df.groupby(level=0, axis=1).sum() print df1 2015_____ 2016_______ Fruits Apple 19 21 Banana 42 41 Animal Lion 8 3 Tiger 5 5 print df.columns.get_level_values(0).to_series().drop_duplicates().tolist() ['2015_____', '2016_______'] #change index to multiindex new_columns = zip(df.columns.get_level_values(0).to_series().drop_duplicates().tolist(), "Total_" + df1.columns.str[:4]) print new_columns [('2015_____', 'Total_2015'), ('2016_______', 'Total_2016')] df1.columns = pd.MultiIndex.from_tuples(new_columns) print df1 2015_____ 2016_______ Total_2015 Total_2016 Fruits Apple 19 21 Banana 42 41 Animal Lion 8 3 Tiger 5 5 df = pd.concat([df,df1], axis=1)
df2 = df.groupby(level=0, sort=False).sum() print df2 2015_____ 2016_______ 2015_____ 2016_______ 1st 2nd 1st 2nd Total_2015 Total_2016 Animal 7 6 7 1 13 8 Fruits 30 31 32 30 61 62 print df.index.levels[0][df.columns.labels[0]].to_series().drop_duplicates().tolist() ['Animal', 'Fruits'] #change index to multiindex new_idx=zip(df.index.levels[0][df.columns.labels[0]].to_series().drop_duplicates().tolist(), "Total_" + df2.index ) print new_idx [('Animal', 'Total_Animal'), ('Fruits', 'Total_Fruits')] df2.index = pd.MultiIndex.from_tuples(new_idx) print df2 2015_____ 2016_______ 2015_____ 2016_______ 1st 2nd 1st 2nd Total_2015 Total_2016 Animal Total_Animal 7 6 7 1 13 8 Fruits Total_Fruits 30 31 32 30 61 62 df = pd.concat([df,df2]) df = df.sort_index(axis=1).sort_index() print df 2015_____ 2016_______ 1st 2nd Total_2015 1st 2nd Total_2016 Animal Lion 5 3 8 2 1 3 Tiger 2 3 5 5 0 5 Total_Animal 7 6 13 7 1 8 Fruits Apple 10 9 19 11 10 21 Banana 20 22 42 21 20 41 Total_Fruits 30 31 61 32 30 62