Pivoting a Pandas Dataframe, no numeric types, index is not unique

Question

I am trying to pivot some string data into columns, but I have had trouble applying earlier answers because my data has no unique index or MultiIndex to pivot on.

Sample format

index	location	field	value
1	location1	firstName	A
2	location1	lastName	B
3	location1	dob	C
4	location1	email	D
5	location1	title	E
6	location1	address1	F
7	location1	address2	G
8	location1	address3	H
9	location1	firstName	I
10	location1	lastName	J
11	location1	dob	K
12	location1	email	L
13	location1	title	M
14	location1	address1	N
15	location1	address2	O
16	location1	address3	P
40	location2	firstName	Q
41	location2	lastName	R
42	location2	dob	S
43	location2	email	T
44	location2	title	U
45	location2	address1	V
46	location2	address2	W
47	location2	address3	X

Format I'd like to pivot to:

location	firstName lastName dob email title address1	address2 address3
location1	A	B	C	D	E	F	G	H
location1	I	J	K	L	M	N	O	P
location2	Q	R	S	T	U	V	W	X

The closest I've come to achieving this is by using aggfunc='first', but I need all values for each location, not just the first.

Code I've tried:

df = df.pivot_table(index='location', columns='field', values='value', aggfunc='first')
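
A minimal reproduction may help show why this call falls short. It assumes `location`, `field`, and `value` are ordinary columns; the nine-row frame below is a shortened stand-in for the real data, not the original sample. With only `location` as the index, both records for a location land in the same cells, and `aggfunc='first'` keeps just one of them.

```python
import pandas as pd

# Shortened stand-in for the sample: two records for location1, one for location2.
fields = ["firstName", "lastName", "dob"]
df = pd.DataFrame({
    "location": ["location1"] * 6 + ["location2"] * 3,
    "field": fields * 3,
    "value": list("ABCIJKQRS"),
})

# Only `location` identifies a row, so repeated fields collide
# and 'first' keeps a single value per (location, field) cell.
collapsed = df.pivot_table(index="location", columns="field",
                           values="value", aggfunc="first")

print(collapsed)
```

The result has one row per location, so the second `location1` record (I, J, K) is dropped.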

Answer 1:

You'll need to pivot with a surrogate column. Here's a solution using cumsum + set_index + unstack.

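# Build a per-record counter that increments at every 'firstName' row, and use it
# as a third index level so that (location, field, counter) is unique.
v = df.set_index(['location', 'field', df.field.eq('firstName').cumsum()]).unstack(-2)
v.index = v.index.droplevel(-1)     # drop the counter level
v.columns = v.columns.droplevel(0)  # drop the 'value' level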

field     address1 address2 address3 dob email firstName lastName title
location
location1        F        G        H   C     D         A        B     E
location1        N        O        P   K     L         I        J     M
location2        V        W        X   S     T         Q        R     U
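
For anyone who wants to run this end to end, here is a self-contained sketch of the same surrogate-key idea, written step by step. It rebuilds the sample data from the question and assumes every record starts with a `firstName` row. The column name `record` is a hypothetical label introduced here for illustration, and the `reindex` step is an optional extra that restores the field order requested in the question.

```python
import pandas as pd

fields = ["firstName", "lastName", "dob", "email",
          "title", "address1", "address2", "address3"]

# Rebuild the sample: two records for location1, one for location2.
df = pd.DataFrame({
    "location": ["location1"] * 16 + ["location2"] * 8,
    "field": fields * 3,
    "value": list("ABCDEFGHIJKLMNOPQRSTUVWX"),
})

# Surrogate key: the running count of 'firstName' rows numbers each record.
# Assumption: every record begins with a 'firstName' row.
df["record"] = df["field"].eq("firstName").cumsum()

wide = (
    df.set_index(["location", "record", "field"])["value"]
      .unstack("field")                  # one column per field
      .reindex(columns=fields)           # restore the original field order
      .reset_index("record", drop=True)  # drop the surrogate key again
)
wide.columns.name = None

print(wide)
```

If some records can lack a `firstName` row but every record lists its fields in the same order, `df.groupby(['location', 'field']).cumcount()` is another way to derive the counter.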

Source: https://stackoverflow.com/questions/48630089/pivoting-a-pandas-dataframe-no-numeric-types-index-is-not-unique

Tags

python

pandas

pivot-table