Question
I am trying to convert some string data into columns, but have had a difficult time utilizing past responses because I do not have a unique index or multi-index that I could use.
Sample format
index location field value
1 location1 firstName A
2 location1 lastName B
3 location1 dob C
4 location1 email D
5 location1 title E
6 location1 address1 F
7 location1 address2 G
8 location1 address3 H
9 location1 firstName I
10 location1 lastName J
11 location1 dob K
12 location1 email L
13 location1 title M
14 location1 address1 N
15 location1 address2 O
16 location1 address3 P
40 location2 firstName Q
41 location2 lastName R
42 location2 dob S
43 location2 email T
44 location2 title U
45 location2 address1 V
46 location2 address2 W
47 location2 address3 X
Format I'd like to pivot to:
location firstName lastName dob email title address1 address2 address3
location1 A B C D E F G H
location1 I J K L M N O P
location2 Q R S T U V W X
The closest I've come to achieving this is by using aggfunc='first', but I need all values for each location, not just the first:
df = df.pivot_table(index='location',columns='field',values='value',aggfunc='first')
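To make the problem reproducible, here is a minimal sketch that rebuilds the sample above and runs the same pivot_table call. The variable names (fields, out) are only illustrative, and it assumes the "index" column shown in the sample is the DataFrame's index.

```python
import pandas as pd

fields = ["firstName", "lastName", "dob", "email",
          "title", "address1", "address2", "address3"]

# Rebuild the sample: three records of eight fields each.
df = pd.DataFrame(
    {
        "location": ["location1"] * 16 + ["location2"] * 8,
        "field": fields * 3,
        "value": list("ABCDEFGHIJKLMNOPQRSTUVWX"),
    },
    index=list(range(1, 17)) + list(range(40, 48)),
)

# aggfunc='first' keeps a single value per (location, field) pair,
# so the second location1 record (I..P) is lost.
out = df.pivot_table(index="location", columns="field",
                     values="value", aggfunc="first")
print(out)
```

Only one row per location comes back, which is why the second location1 record is missing.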
Answer 1:
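You'll need to pivot with a surrogate column. Here's a solution using cumsum + set_index + unstack. The cumulative sum of df.field.eq('firstName') increases by one at every firstName row, so it numbers the records and makes the index unique.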
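# Index by location, field and a per-record counter, then move 'field' into the columns
v = df.set_index(['location', 'field', df.field.eq('firstName').cumsum()]).unstack(-2)
# Drop the surrogate counter from the row index
v.index = v.index.droplevel(-1)
# Drop the outer 'value' level from the column index
v.columns = v.columns.droplevel(0)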
field     address1 address2 address3 dob email firstName lastName title
location
location1        F        G        H   C     D         A        B     E
location1        N        O        P   K     L         I        J     M
location2        V        W        X   S     T         Q        R     U
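The output above has its columns sorted alphabetically, not in the order requested in the question. Here is a self-contained sketch of the same cumsum + set_index + unstack idea that also restores the original field order. It assumes every record starts with a firstName row and that the sample's "index" column is the DataFrame's index. The names fields, record and wide are illustrative, not from the original answer.

```python
import pandas as pd

fields = ["firstName", "lastName", "dob", "email",
          "title", "address1", "address2", "address3"]

# Rebuild the sample from the question.
df = pd.DataFrame(
    {
        "location": ["location1"] * 16 + ["location2"] * 8,
        "field": fields * 3,
        "value": list("ABCDEFGHIJKLMNOPQRSTUVWX"),
    },
    index=list(range(1, 17)) + list(range(40, 48)),
)

# Surrogate key: goes up by one at each firstName row (assumed record start).
record = df["field"].eq("firstName").cumsum().rename("record")

wide = (
    df.set_index(["location", record, "field"])["value"]
      .unstack("field")                   # one column per field
      .reindex(columns=fields)            # restore the requested column order
      .reset_index("record", drop=True)   # discard the surrogate key
      .reset_index()                      # turn 'location' back into a column
)
wide.columns.name = None
print(wide)
```

If a record could begin with a different field, or omit firstName altogether, the surrogate key would need another rule, for example a counter per (location, field) pair via groupby(...).cumcount().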
Source: https://stackoverflow.com/questions/48630089/pivoting-a-pandas-dataframe-no-numeric-types-index-is-not-unique