Question
Consider a pandas DataFrame with hierarchical columns and rows, generated as follows:
import pandas as pd
import numpy as np

row1 = ['a', 'b']
row2 = ['c', 'd', 'e']
row3 = ['w', 'x', 'y', 'z']
row_tuple_list = []
for r1 in row1:
    for r2 in row2:
        for r3 in row3:
            row_tuple_list.append((r1, r2, r3))
row_index = pd.MultiIndex.from_tuples(row_tuple_list, names=['row1', 'row2', 'row3'])

col1 = ['f']
col2 = ['g', 'h']
col_tuple_list = []
for c1 in col1:
    for c2 in col2:
        col_tuple_list.append((c1, c2))
col_index = pd.MultiIndex.from_tuples(col_tuple_list, names=['col1', 'col2'])

df = pd.DataFrame(np.random.randn(24, 2), index=row_index, columns=col_index)
print(df.head(10))
which gives rise to the following output:
col1                   f
col2                   g         h
row1 row2 row3
a    c    w     0.675077 -0.409322
          x    -1.317074  0.951411
          y    -2.430066  1.457128
          z    -0.852241  1.015650
     d    w    -0.302398 -0.303503
          x     0.050624  2.195313
          y    -0.177186 -0.126222
          z     0.302745  1.186148
     e    w    -0.928050 -0.681644
          x    -1.746241  0.687433
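As a side note, the nested loops above build the Cartesian product of the level values, so the same indexes can be sketched more compactly with `pd.MultiIndex.from_product`. The seeded generator `rng` is an assumption added here only so the example is reproducible; it is not part of the original question.

```python
import numpy as np
import pandas as pd

# Same level values and level names as in the question.
row_index = pd.MultiIndex.from_product(
    [['a', 'b'], ['c', 'd', 'e'], ['w', 'x', 'y', 'z']],
    names=['row1', 'row2', 'row3'],
)
col_index = pd.MultiIndex.from_product(
    [['f'], ['g', 'h']],
    names=['col1', 'col2'],
)

# Assumption: a seeded generator replaces np.random.randn for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((24, 2)), index=row_index, columns=col_index)
print(df.head(10))
```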
Now, imagine that a third column is to be added which has one data point per row2 entry, creating, for example:
col1                   f
col2                   g         h
col3                                       i
row1 row2 row3
a    c    w     0.675077 -0.409322  0.273493
          x    -1.317074  0.951411
          y    -2.430066  1.457128
          z    -0.852241  1.015650
     d    w    -0.302398 -0.303503 -0.320943
          x     0.050624  2.195313
          y    -0.177186 -0.126222
          z     0.302745  1.186148
     e    w    -0.928050 -0.681644  1.380933
          x    -1.746241  0.687433
Is it possible to create such a "multiple data rate" dataframe? If so, how?
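One possible way to approximate this layout, offered as a sketch and not a definitive answer: pandas columns all share the same row index, so the slower column can be stored at full length with NaN in the rows that have no value. The names `slow`, `group_keys`, and `first_in_group`, the empty-string labels used to pad the column levels, and the seeded generator are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

row_index = pd.MultiIndex.from_product(
    [['a', 'b'], ['c', 'd', 'e'], ['w', 'x', 'y', 'z']],
    names=['row1', 'row2', 'row3'],
)
col_index = pd.MultiIndex.from_product([['f'], ['g', 'h']], names=['col1', 'col2'])
rng = np.random.default_rng(0)  # assumption: seeded for reproducibility
df = pd.DataFrame(rng.standard_normal((24, 2)), index=row_index, columns=col_index)

# Hypothetical low-rate data: one value per (row1, row2) pair, 2 * 3 = 6 values.
slow_index = pd.MultiIndex.from_product(
    [['a', 'b'], ['c', 'd', 'e']], names=['row1', 'row2']
)
slow = pd.Series(rng.standard_normal(6), index=slow_index)

# Give the existing columns a third level, padded with an empty label.
df.columns = pd.MultiIndex.from_tuples(
    [(c1, c2, '') for c1, c2 in df.columns],
    names=['col1', 'col2', 'col3'],
)

# Mark the first row of every (row1, row2) group.
group_keys = df.index.droplevel('row3')
first_in_group = ~group_keys.duplicated()

# Full-length column: the slow value on the first row of each group, NaN elsewhere.
values = np.full(len(df), np.nan)
values[first_in_group] = slow.reindex(group_keys[first_in_group]).to_numpy()
df[('', '', 'i')] = values

print(df.head(10))
```

The NaN entries here mean "no sample at this row", so the frame is still rectangular and not truly multi-rate. If that padding is unwanted, an alternative under the same assumptions is to keep `slow` as a separate Series indexed by (row1, row2) and combine it with `df` only when needed.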
Source: https://stackoverflow.com/questions/43664161/pandas-dataframe-columns-with-different-data-rates-for-hierarchical-columns-a