Question
I have a dataframe that I want to melt into multiple target columns. This is the code I tried:
grp2 = pd.lreshape(grp1, cols.groupby(cols.str.split('_').str[1])).sort_values('ACCT_NAME')
With the line above, I lose the column names.
grp2 = pd.melt(grp1 , id_vars = ['Client' , 'Industry'] , var_name = "H Year" , value_name = 'Count')
With the line above, I don't get multiple target columns.
From DF
Client INDUSTRY 1H2016_6MO 2H2016_6MO 1H2017_6MO 2H2017_6MO 1H2016_12MO 2H2016_12MO 1H2017_12MO 2H2017_12MO
XXX AAA 1 0 0 0 1 1 0 0
YYY BBB 0 0 1 0 0 0 0 1
ZZZ CCC 1 1 0 0 0 0 1 1
XXX AAA 1 0 0 0 1 1 0 0
To DF
Client INDUSTRY Year_Half 6MO 12MO
XXX AAA 1H2016 2 2
XXX AAA 2H2016 0 2
XXX AAA 1H2017 0 0
XXX AAA 2H2017 0 0
YYY BBB 1H2016 0 0
YYY BBB 2H2016 0 0
YYY BBB 1H2017 1 0
YYY BBB 2H2017 0 1
ZZZ CCC 1H2016 1 0
ZZZ CCC 2H2016 1 0
ZZZ CCC 1H2017 0 1
ZZZ CCC 2H2017 0 1
Please advise on a solution. I have seen other questions, but they don't move the column names into separate columns.
Answer 1:
Use:
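- set_index to move the identifier columns (Client, INDUSTRY) into the index
- str.split with expand=True to turn the remaining column names into a MultiIndex
- stack to reshape the first level (the year half) into rows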
df = df.set_index(['Client','INDUSTRY'])
df.columns = df.columns.str.split('_', expand=True)
df = df.stack(0).reset_index().rename(columns={'level_2':'Year_Half'})
print (df)
Client INDUSTRY Year_Half 12MO 6MO
0 XXX AAA 1H2016 1 1
1 XXX AAA 1H2017 0 0
2 XXX AAA 2H2016 1 0
3 XXX AAA 2H2017 0 0
4 YYY BBB 1H2016 0 0
5 YYY BBB 1H2017 0 1
6 YYY BBB 2H2016 0 0
7 YYY BBB 2H2017 1 0
8 ZZZ CCC 1H2016 0 1
9 ZZZ CCC 1H2017 1 0
10 ZZZ CCC 2H2016 0 1
11 ZZZ CCC 2H2017 1 0
12 XXX AAA 1H2016 1 1
13 XXX AAA 1H2017 0 0
14 XXX AAA 2H2016 1 0
15 XXX AAA 2H2017 0 0
If only the 6MO and 12MO values are needed and the column order matters:
df = df.set_index(['Client','INDUSTRY'])
df.columns = df.columns.str.split('_', expand=True)
# reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it
df = (df.stack(0)
        .reindex(columns=['6MO','12MO'])
        .reset_index()
        .rename(columns={'level_2':'Year_Half'}))
print (df)
Client INDUSTRY Year_Half 6MO 12MO
0 XXX AAA 1H2016 1 1
1 XXX AAA 1H2017 0 0
2 XXX AAA 2H2016 0 1
3 XXX AAA 2H2017 0 0
4 YYY BBB 1H2016 0 0
5 YYY BBB 1H2017 1 0
6 YYY BBB 2H2016 0 0
7 YYY BBB 2H2017 0 1
8 ZZZ CCC 1H2016 1 0
9 ZZZ CCC 1H2017 0 1
10 ZZZ CCC 2H2016 1 0
11 ZZZ CCC 2H2017 0 1
12 XXX AAA 1H2016 1 1
13 XXX AAA 1H2017 0 0
14 XXX AAA 2H2016 0 1
15 XXX AAA 2H2017 0 0
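Both outputs above still list the duplicated XXX / AAA rows separately, whereas the desired output in the question adds them together (for example 6MO = 2 for 1H2016). Here is a minimal sketch of one way to get those totals, assuming that summing duplicate Client / INDUSTRY rows is what is intended: group and sum first, then apply the same split and stack. The names sample, totals and result are illustrative, and the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question (XXX / AAA appears twice).
sample = pd.DataFrame({
    'Client':      ['XXX', 'YYY', 'ZZZ', 'XXX'],
    'INDUSTRY':    ['AAA', 'BBB', 'CCC', 'AAA'],
    '1H2016_6MO':  [1, 0, 1, 1],
    '2H2016_6MO':  [0, 0, 1, 0],
    '1H2017_6MO':  [0, 1, 0, 0],
    '2H2017_6MO':  [0, 0, 0, 0],
    '1H2016_12MO': [1, 0, 0, 1],
    '2H2016_12MO': [1, 0, 0, 1],
    '1H2017_12MO': [0, 0, 1, 0],
    '2H2017_12MO': [0, 1, 1, 0],
})

# Assumption: duplicate Client / INDUSTRY rows should be summed.
totals = sample.groupby(['Client', 'INDUSTRY']).sum()

# Split 'YearHalf_Period' names into a two-level column index.
totals.columns = totals.columns.str.split('_', expand=True)

# Move the year half into the rows, name that index level explicitly,
# and fix the column order before flattening the index.
result = (totals.stack(0)
                .rename_axis(index=['Client', 'INDUSTRY', 'Year_Half'])
                .reindex(columns=['6MO', '12MO'])
                .reset_index())
print(result)
```

Naming the stacked level with rename_axis avoids relying on the automatic level_2 label. The order of the Year_Half rows within each client may differ between pandas versions, so sort explicitly if a particular order is needed.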
Source: https://stackoverflow.com/questions/46234549/python-pandas-melting-data-to-multiple-columns-and-coulmn-names-in-another-colum