Question
For a dataframe like this:
import numpy as np
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3], 'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)
   Cost  Month  Value  id
0   124      1     12   1
1   214      2     23   1
2  1234      3     15   1
3  1324      1     45   2
4   234      3     34   2
to which I apply pivot_table
df2 = pd.pivot_table(df,
                     values=['Value', 'Cost'],
                     index=['id'],
                     columns=['Month'],
                     aggfunc=np.sum,
                     fill_value=0)
to get df2:
       Cost            Value
Month     1    2     3     1   2   3
id
1       124  214  1234    12  23  15
2      1324    0   234    45   0  34
is there an easy way to format the resulting dataframe's column names like this:
id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
If I do:
df2.columns =[s1 + str(s2) for (s1,s2) in df2.columns.tolist()]
I get:
    Cost1  Cost2  Cost3  Value1  Value2  Value3
id
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
How do I get rid of the extra level?
thanks!
Answer 1:
Using clues from @chrisb's answer, this gave me exactly what I was after:
df2.reset_index(inplace=True)
which gives:
id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
In case of multiple index columns, this post explains it well. Just to be complete, here is how:
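# map(str, ...) is needed because some levels (such as Month) hold integers, and str.join only accepts strings
df2.columns = [' '.join(map(str, col)).strip() for col in df2.columns.values]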
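
Putting the steps from the question and this answer together, the whole flow can be sketched as below. This is only a minimal sketch: the name `flat` is introduced here for illustration, `aggfunc='sum'` is used as a string in place of `np.sum`, and `reset_index()` is called without `inplace=True` so that it returns a new dataframe.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Pivot: one row per id, one (measure, month) column pair per value
df2 = pd.pivot_table(df,
                     values=['Value', 'Cost'],
                     index=['id'],
                     columns=['Month'],
                     aggfunc='sum',
                     fill_value=0)

# Flatten the two-level column index: ('Cost', 1) becomes 'Cost1'
df2.columns = [f"{name}{month}" for name, month in df2.columns]

# Turn the 'id' index into an ordinary column (hypothetical name: flat)
flat = df2.reset_index()
print(flat)
```

After this, `id` is a regular column and the column index has a single level, so there is no extra header row left when the frame is printed.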
Answer 2:
'id' is the index name, which you can set to None to remove it.
In [35]: df2.index.name = None
In [36]: df2
Out[36]:
   Cost1  Cost2  Cost3  Value1  Value2  Value3
1    124    214   1234      12      23      15
2   1324      0    234      45       0      34
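
If you would rather not modify the dataframe in place, `rename_axis(None)` should give the same result as a copy. This is a different call from the one used in this answer, shown only as a sketch; the names `pivoted` and `unnamed` and the small two-column frame are made up for illustration.

```python
import pandas as pd

# Small stand-in for the flattened pivot result, with 'id' as the index name
pivoted = pd.DataFrame(
    {'Cost1': [124, 1324], 'Value1': [12, 45]},
    index=pd.Index([1, 2], name='id'),
)

# rename_axis(None) returns a new dataframe whose index has no name
unnamed = pivoted.rename_axis(None)

print(unnamed.index.name)   # the copy has no index name
print(pivoted.index.name)   # the original keeps 'id'
```

Note that either way the ids stay in the index; only the label is dropped. Use reset_index if you want `id` as a regular column instead.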
Source: https://stackoverflow.com/questions/33290374/pandas-pivot-table-column-names