Question
Here is my issue. I have data like this:
import pandas as pd

data = {
    'name': ["Jack ;; Josh ;; John", "Apple ;; Fruit ;; Pear"],
    'grade': [11, 12],
    'color': ['black', 'blue']
}
df = pd.DataFrame(data)
It looks like:
                     name  grade  color
0    Jack ;; Josh ;; John     11  black
1  Apple ;; Fruit ;; Pear     12   blue
I want it to look like:
    name  grade  color
0   Jack     11  black
1   Josh     11  black
2   John     11  black
3  Apple     12   blue
4  Fruit     12   blue
5   Pear     12   blue
So first I'd need to split name on ";;" and then explode the resulting list into separate rows.
Answer 1:
Use Series.str.split with expand=True, reshape the result with DataFrame.stack, and add the remaining original columns back with DataFrame.join:
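c = df.columns                              # remember the original column order
s = (df.pop('name')                         # take 'name' out of df
       .str.split(' ;; ', expand=True)      # one column per name
       .stack()                             # one row per name
       .reset_index(level=1, drop=True)     # keep only the original row label
       .rename('name'))
df = df.join(s).reset_index(drop=True).reindex(columns=c)
print(df)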
    name  grade  color
0   Jack     11  black
1   Josh     11  black
2   John     11  black
3  Apple     12   blue
4  Fruit     12   blue
5   Pear     12   blue
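
Since the question describes the task as a split followed by an explode, a shorter route may be DataFrame.explode, which exists in pandas 0.25 and later. A minimal sketch, assuming that version or newer and that the separator is always exactly ' ;; ' (the variable name out is only illustrative):

```python
import pandas as pd

data = {
    'name': ["Jack ;; Josh ;; John", "Apple ;; Fruit ;; Pear"],
    'grade': [11, 12],
    'color': ['black', 'blue'],
}
df = pd.DataFrame(data)

# Split each string into a list of names, then give every list element
# its own row. The values of the other columns are repeated per new row.
out = (df.assign(name=df['name'].str.split(' ;; '))
         .explode('name')
         .reset_index(drop=True))   # replace the repeated labels 0,0,0,1,1,1
print(out)
```

Unlike the stack approach, this leaves the column order unchanged, so no reindex is needed afterwards.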
Answer 2:
You have 2 challenges:
First, split name on ;; into a list and turn each item of the list into its own column:
df['name'] = df.name.str.split(' ;; ')
df_temp = df.name.apply(pd.Series)
df = pd.concat([df, df_temp], axis=1)
df.drop('name', inplace=True, axis=1)
Result:
   grade  color      0      1     2
0     11  black   Jack   Josh  John
1     12   blue  Apple  Fruit  Pear
Then melt the name columns into rows to get the desired result:
df.melt(id_vars=["grade", "color"], value_name="Name").sort_values('grade').drop('variable', axis=1)
Desired result:
   grade  color   Name
0     11  black   Jack
2     11  black   Josh
4     11  black   John
1     12   blue  Apple
3     12   blue  Fruit
5     12   blue   Pear
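
For reference, the split and melt steps of this answer can be put together in one runnable sketch. It is a variant rather than the exact code above: it splits with expand=True instead of apply(pd.Series), and it carries the original row label through melt in a helper column (index, created by reset_index) so that the final order does not depend on grade happening to be sorted. The names wide and long_df are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ["Jack ;; Josh ;; John", "Apple ;; Fruit ;; Pear"],
    'grade': [11, 12],
    'color': ['black', 'blue'],
})

# Wide step: one column per name (labelled 0, 1, 2) next to the other columns.
wide = pd.concat(
    [df.drop(columns='name'), df['name'].str.split(' ;; ', expand=True)],
    axis=1,
)

# Long step: melt the name columns into rows, then restore the original order.
long_df = (wide.reset_index()                      # original row label -> 'index'
               .melt(id_vars=['index', 'grade', 'color'], value_name='name')
               .dropna(subset=['name'])            # rows with fewer names leave gaps
               .sort_values(['index', 'variable']) # original row, then name position
               .drop(columns=['index', 'variable'])
               .reset_index(drop=True))
print(long_df)
```

The dropna call only matters when rows hold different numbers of names; with the sample data it removes nothing.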
Source: https://stackoverflow.com/questions/56924642/how-to-split-pandas-string-column-into-different-rows