The purpose of this question is to further explore MultiIndex DataFrames and to ask about the best approach for various tasks.
Create the DataFrame
import pandas as pd

df = pd.DataFrame({
    'index_date': ['12/07/2016', '12/07/2016', '12/07/2016', '12/07/2016', '12/07/2016'],
    'portfolio': ['A', 'B', 'C', 'D', 'E'],
    'reporting_ccy': ['GBP', 'GBP', 'GBP', 'GBP', 'GBP'],
    'portfolio_ccy': ['JPY', 'USD', 'USD', 'EUR', 'EUR'],
    'amount': [100, 200, 300, 400, 500],
    'injection': [1, 2, 3, 4, 5],
    'to_usd': [1.3167, 1.3167, 1.3167, 1.3167, 1.3167],
    'to_ccy': [0.009564, 1, 1, 1.1093, 1.1093],
    'm5': [2, 4, 6, 8, 10],
    'm6': [1, 3, 5, 7, 9],
})
Pivot the DataFrame
# sort_index(axis=1) replaces DataFrame.sortlevel(), which was removed in pandas 1.0
df_pivot = (
    df.pivot_table(index='index_date',
                   columns=['portfolio', 'portfolio_ccy', 'reporting_ccy'])
      .swaplevel(0, 1, axis=1)
      .sort_index(axis=1)
)
Rename the columns
df_pivot.columns.names = ['portfolio','measures', 'portfolio_ccy', 'reporting_ccy']
This yields a pivoted representation of the data such that:
1. a portfolio may have 1 or many measures
2. shows the portfolio default currency
3. shows the portfolio reporting currency
4. a measure may have 1 or many reporting currencies.
In terms of point 4 (a measure may have one or many reporting currencies), what is the best approach for implementation, given that we have the exchange rates for the currencies?
The aim is to create a DataFrame such as the one derived here:
Create DataFrame
df1 = pd.DataFrame({
    'index_date': ['12/07/2016', '12/07/2016', '12/07/2016', '12/07/2016', '12/07/2016'],
    'portfolio': ['A', 'B', 'C', 'D', 'E'],
    'reporting_ccy': ['JPY', 'USD', 'USD', 'EUR', 'EUR'],
    'portfolio_ccy': ['JPY', 'USD', 'USD', 'EUR', 'EUR'],
    'amount': [13767.2522, 263.34, 395.01, 474.785901, 593.4823763],
    'injection': [1, 2, 3, 4, 5],
    'to_usd': [0.009564, 1, 1, 1.1093, 1.1093],
    'to_ccy': [1.3167, 1.3167, 1.3167, 1.3167, 1.3167],
    'm5': [2, 4, 6, 8, 10],
    'm6': [1, 3, 5, 7, 9],
})
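For reference, the amounts in df1 appear to follow from the first DataFrame as amount * to_usd / to_ccy (GBP to USD, then USD to the portfolio currency). That rule is inferred from the figures, not stated anywhere, so treat the following as a minimal sketch under that assumption. The name `converted` is purely illustrative, and only the columns needed for the conversion are reproduced:

```python
import pandas as pd

# Trimmed copy of the question's first DataFrame (only the columns the conversion needs).
df = pd.DataFrame({
    'portfolio': ['A', 'B', 'C', 'D', 'E'],
    'reporting_ccy': ['GBP'] * 5,
    'portfolio_ccy': ['JPY', 'USD', 'USD', 'EUR', 'EUR'],
    'amount': [100, 200, 300, 400, 500],
    'to_usd': [1.3167] * 5,
    'to_ccy': [0.009564, 1, 1, 1.1093, 1.1093],
})

# Assumed rule: reporting currency -> USD -> portfolio currency.
converted = df.assign(
    amount=df['amount'] * df['to_usd'] / df['to_ccy'],
    reporting_ccy=df['portfolio_ccy'],  # amounts are now expressed in the portfolio currency
)

print(converted[['portfolio', 'reporting_ccy', 'amount']])
```

The rate columns are left untouched in this sketch; df1 stores them differently, and that part is not reproduced here.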
Concatenate & Pivot the DataFrames
df_concat = pd.concat([df, df1])
# sort_index(axis=1) replaces DataFrame.sortlevel(), which was removed in pandas 1.0
df_pivot1 = (
    df_concat.pivot_table(index='index_date',
                          columns=['portfolio', 'portfolio_ccy', 'reporting_ccy'])
             .swaplevel(0, 1, axis=1)
             .sort_index(axis=1)
)
df_pivot1.columns.names = ['portfolio', 'measures', 'portfolio_ccy', 'reporting_ccy']
This now shows 1 measure having many currencies.
df_pivot1.xs(('amount', 'A'), level=('measures','portfolio'), drop_level=False, axis=1)
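The same cross-section can also be written with pd.IndexSlice, which some find easier to read when several levels are involved. The sketch below uses a tiny hand-built frame (the name `demo` and its four columns are illustrative stand-ins for df_pivot1) and assumes the columns are lexsorted, which slicing with .loc requires:

```python
import pandas as pd

# Small stand-in for df_pivot1: one row, four columns, same level names.
cols = pd.MultiIndex.from_tuples(
    [('A', 'amount', 'JPY', 'GBP'),
     ('A', 'amount', 'JPY', 'JPY'),
     ('A', 'm5', 'JPY', 'GBP'),
     ('B', 'amount', 'USD', 'GBP')],
    names=['portfolio', 'measures', 'portfolio_ccy', 'reporting_ccy'],
)
demo = pd.DataFrame(
    [[100.0, 13767.2522, 2.0, 200.0]],
    index=pd.Index(['12/07/2016'], name='index_date'),
    columns=cols,
)

# Selection as written in the question.
via_xs = demo.xs(('amount', 'A'), level=('measures', 'portfolio'),
                 drop_level=False, axis=1)

# Equivalent selection with IndexSlice: portfolio A, measure 'amount', any currencies.
idx = pd.IndexSlice
via_loc = demo.loc[:, idx['A', 'amount', :, :]]

print(via_loc.columns.get_level_values('reporting_ccy').tolist())
```

Both selections return the two 'amount' columns for portfolio A, one per reporting currency.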
Question
Is there a better way, such as adding data directly to a MultiIndexed DataFrame at level 3, i.e. the level returned by df_pivot1.columns.get_level_values(3).unique()?
I would like to be able to iterate through each level and add new measures, either derived from other measures using df.assign() or by other methods.
The use case here is to add other currencies to the measures where applicable. The concatenation and re-pivot as above does not seem optimal.
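To make the idea concrete, one possible shape for "adding at level 3" is to loop over the existing column tuples, build the new columns in a dict keyed by full 4-level tuples, and concatenate them once along axis=1. This is only a sketch, not a claim about the best approach: the conversion rule (amount * to_usd / to_ccy) is inferred from the sample figures, and the names `new_cols`, `extra` and `df_wide` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'index_date': ['12/07/2016'] * 5,
    'portfolio': ['A', 'B', 'C', 'D', 'E'],
    'reporting_ccy': ['GBP'] * 5,
    'portfolio_ccy': ['JPY', 'USD', 'USD', 'EUR', 'EUR'],
    'amount': [100, 200, 300, 400, 500],
    'injection': [1, 2, 3, 4, 5],
    'to_usd': [1.3167] * 5,
    'to_ccy': [0.009564, 1, 1, 1.1093, 1.1093],
    'm5': [2, 4, 6, 8, 10],
    'm6': [1, 3, 5, 7, 9],
})

df_pivot = (
    df.pivot_table(index='index_date',
                   columns=['portfolio', 'portfolio_ccy', 'reporting_ccy'])
      .swaplevel(0, 1, axis=1)
      .sort_index(axis=1)
)
df_pivot.columns.names = ['portfolio', 'measures', 'portfolio_ccy', 'reporting_ccy']

# Build one new 'amount' column per portfolio, reported in the portfolio currency.
new_cols = {}
for portfolio, measure, pf_ccy, rep_ccy in df_pivot.columns:
    if measure != 'amount':
        continue
    # Assumed conversion: reporting currency -> USD -> portfolio currency.
    rate = (df_pivot[(portfolio, 'to_usd', pf_ccy, rep_ccy)]
            / df_pivot[(portfolio, 'to_ccy', pf_ccy, rep_ccy)])
    new_cols[(portfolio, 'amount', pf_ccy, pf_ccy)] = (
        df_pivot[(portfolio, 'amount', pf_ccy, rep_ccy)] * rate
    )

# Tuple keys become a 4-level column MultiIndex; reuse the existing level names.
extra = pd.DataFrame(new_cols)
extra.columns.names = list(df_pivot.columns.names)

# A single concat along the columns, then re-sort so slicing keeps working.
df_wide = pd.concat([df_pivot, extra], axis=1).sort_index(axis=1)

print(df_wide.xs(('amount', 'A'), level=('measures', 'portfolio'),
                 drop_level=False, axis=1))
```

This avoids rebuilding and re-pivoting the long DataFrame, at the cost of handling the column tuples by hand. Only the 'amount' measure is converted here; other measures would need their own rule.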