Splitting a column in dataframe using str.split function

你。 posted on 2021-01-27 14:26:04

Question


I am trying to split a column of comma-delimited values into two columns, but the str.split function returns columns of 0's and 1's instead of the split string values.

I have a dataframe with a column 'Full Name', which holds a full name with a comma separating the last name from the first name.

The str.split function works when I execute it for display only. But when I use the same call to add two new columns with the split data to the same dataframe, the first new column is filled with 0's and the second with 1's all the way down.

The code that works to display the split data:

df2015_2019.iloc[:,0].str.split(',', expand=True)

Code that doesn't work to create new columns with split data:

df2015_2019['Lname'],df2015_2019['Fname'] = df2015_2019.iloc[:,0].str.split(',', expand=True)

I get a column 'Lname' with all 0's and a column 'Fname' with all 1's
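The 0's and 1's come from tuple unpacking: iterating over a DataFrame yields its column labels, and the frame returned by expand=True has the integer labels 0 and 1. A minimal sketch of the effect, using made-up sample data (the names and the frames `broken` and `fixed` are only for illustration):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe.
df = pd.DataFrame({"Full Name": ["Smith,Anna", "Jones,Bob"]})

split = df["Full Name"].str.split(",", expand=True)

# Iterating over a DataFrame yields its column labels, not its columns.
labels = list(split)

# Tuple unpacking therefore assigns the scalar labels 0 and 1,
# which pandas broadcasts down each new column.
broken = df.copy()
broken["Lname"], broken["Fname"] = split

# Assigning to a list of column names keeps the split values instead.
fixed = df.copy()
fixed[["Lname", "Fname"]] = split

print(labels)
print(broken)
print(fixed)
```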


Answer 1:


Here is another way to achieve this.

Example dataset:

>>> import pandas as pd
>>> df = pd.DataFrame({'Name': ['Karn,Kumar', 'John,Jimlory']})
>>> df
           Name
0    Karn,Kumar
1  John,Jimlory

Result:

You can assign the new column names while splitting the values, as below.

>>> df[['First Name','Last Name']] = df['Name'].str.split(",", expand=True)
>>> df
           Name First Name Last Name
0    Karn,Kumar       Karn     Kumar
1  John,Jimlory       John   Jimlory

Or, as another answer stated, rename the columns after the split:

>>> df['Name'].str.split(",", expand=True).rename({0: 'First_Name', 1: 'Second_Name'}, axis=1)
  First_Name Second_Name
0       Karn       Kumar
1       John     Jimlory

Or, using rsplit:

>>> df['Name'].str.rsplit(",", expand=True).rename(columns={0: 'First_Name', 1: 'Last_Name'})
  First_Name Last_Name
0      Karn     Kumar
1      John   Jimlory

Note: axis='columns' and axis=1 are equivalent.

Yet another way is Series.str.partition, with a small alteration: partition keeps the separator "," as a column of its own, so we have to drop that column.

>>> df['Name'].str.partition(",", True).rename(columns={0: 'First_Name', 2: 'Last_Name'}).drop(columns=[1])
  First_Name Last_Name
0      Karn     Kumar
1      John   Jimlory
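To see why the drop is needed, it can help to look at what partition returns before any renaming. A small sketch on the same example data (the variable name `parts` is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Karn,Kumar", "John,Jimlory"]})

# partition returns three columns labelled 0, 1 and 2:
# the text before the separator, the separator itself, and the text after it.
parts = df["Name"].str.partition(",")

print(parts)
```

Column 1 holds only the comma, which is why the answer drops it.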

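To keep it compact, we can define the rename mapping as a dict first.

1 - using str.partition:

# Avoid calling the mapping `dict`, which would shadow the built-in.
names = {0: 'First_Name', 2: 'Second_Name'}

# Store the result in a new variable so df keeps its 'Name' column.
result = df['Name'].str.partition(",", True).rename(names, axis=1).drop(columns=[1])
print(result)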

  First_Name Second_Name
0       Karn       Kumar
1       John     Jimlory

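2 - using str.split():

# Avoid calling the mapping `dict`, which would shadow the built-in.
names = {0: 'First_Name', 1: 'Second_Name'}

# Store the result in a new variable so df keeps its 'Name' column.
result = df['Name'].str.split(",", expand=True).rename(names, axis=1)
print(result)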
  First_Name Second_Name
0       Karn       Kumar
1       John     Jimlory



Answer 2:


You can rename the column after the split:

df = pd.DataFrame({'a': ['a,b', 'c,d']})
df['a'].str.split(',', expand=True).rename({0: 'Lname', 1: 'Fname'}, axis='columns')

This prints:

  Lname Fname
0     a     b
1     c     d
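rename returns a new frame and leaves df untouched, so the renamed columns still have to be attached to the original if that is the goal. One possible way, sketched here with DataFrame.join (the variable names `names` and `result` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": ["a,b", "c,d"]})

# Split and rename as in the answer; this produces a separate frame.
names = df["a"].str.split(",", expand=True).rename({0: "Lname", 1: "Fname"}, axis="columns")

# join aligns the two frames on their shared index.
result = df.join(names)

print(result)
```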



Answer 3:


The result of the pandas.Series.str accessor can be unpacked directly into the new columns.

  1. Split first (optionally with n=1, so that at most one split is made).
  2. Apply .str again to the resulting lists, so their elements can be unpacked.

df['Lname'], df['Fname'] = df['Name'].str.split(',').str
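The unpacking above relies on iterating over the .str accessor, which was deprecated and, to my knowledge, no longer works in pandas 2.0 and later. A sketch of the same two steps that avoids the iteration by indexing the lists with .str[i], assuming the example data from Answer 1:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Karn,Kumar", "John,Jimlory"]})

# Step 1: n=1 limits the split to the first comma,
# so each row becomes a list of at most two items.
parts = df["Name"].str.split(",", n=1)

# Step 2: .str[i] picks the i-th element of each list.
df["Lname"] = parts.str[0]
df["Fname"] = parts.str[1]

print(df)
```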


Source: https://stackoverflow.com/questions/57463127/splitting-a-column-in-dataframe-using-str-split-function
