I have a pandas dataframe like the following:
A              B
US,65,AMAZON   2016
US,65,EBAY     2016
My goal is to get it to look like this:
To get the new columns, I would do it as follows:
df['Country'] = df['A'].apply(lambda x: x[0])
df['Code'] = df['A'].apply(lambda x: x[1])
df['Com'] = df['A'].apply(lambda x: x[2])
As for replacing , with . you can use the following:
df['A'] = df['A'].str.replace(',','.')
This will not give the expected output. x[0] indexes into the string itself, so it only returns the first character of each value in df['A'], which is 'U'.
This works for creating the columns from the provided data:
df1 = pd.DataFrame([x.split(',') for x in df['A'].tolist()], columns=['country', 'code', 'com'])
It can also be used instead of the lambda.
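To make the point in this comment concrete, here is a minimal sketch. The sample frame is rebuilt from the question's data, and the variable names (first_char, country, code, com) are only illustrative. It shows that indexing the string returns a single character, while splitting on the comma first returns the whole field:

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]})

# x[0] indexes into the string itself, so it yields one character.
first_char = df["A"].apply(lambda x: x[0])

# Splitting on the comma first makes [0], [1], [2] refer to whole fields.
country = df["A"].apply(lambda x: x.split(",")[0])
code = df["A"].apply(lambda x: x.split(",")[1])
com = df["A"].apply(lambda x: x.split(",")[2])
```

This keeps the lambda style of the answer, at the cost of splitting each string three times.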
You can use str.split with the parameter expand=True and add one more pair of [] on the left side:
df[['country','code','com']] = df.A.str.split(',', expand=True)
Then replace , with . :
df.A = df.A.str.replace(',','.')
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
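For reference, the two steps above can be put together as a self-contained sketch. The sample frame is rebuilt from the question. The astype(int) conversion and the regex=False argument are additions that are not part of the original answer, included on the assumption that a numeric code column and a literal (non-regex) replacement are wanted:

```python
import pandas as pd

df = pd.DataFrame({"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]})

# expand=True returns a DataFrame with one column per split piece,
# which is assigned to the three new columns in order.
df[["country", "code", "com"]] = df["A"].str.split(",", expand=True)

# The pieces are strings; convert 'code' explicitly if numbers are needed
# (an assumption, the original answer leaves it as text).
df["code"] = df["code"].astype(int)

# regex=False treats ',' as a literal string rather than a pattern.
df["A"] = df["A"].str.replace(",", ".", regex=False)
```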
Another solution uses the DataFrame constructor, if there are no NaN values:
df[['country','code','com']] = pd.DataFrame([ x.split(',') for x in df['A'].tolist() ])
df.A = df.A.str.replace(',','.')
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
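The "no NaN values" condition matters because a missing value is a float, and a float has no split method. A small sketch of the difference, using a hypothetical Series s with one missing entry:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
s = pd.Series(["US,65,AMAZON", np.nan])

# A plain Python split fails on the missing value, which is a float.
try:
    [x.split(",") for x in s.tolist()]
    failed = False
except AttributeError:
    failed = True

# The .str accessor skips missing values and propagates them instead.
parts = s.str.split(",", expand=True)
```

If the column may contain missing values, the str.split approach is therefore the safer of the two.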
You can also pass column names to the constructor, but then concat is necessary:
df1 = pd.DataFrame([x.split(',') for x in df['A'].tolist()], columns=['country', 'code', 'com'])
df.A = df.A.str.replace(',','.')
df = pd.concat([df, df1], axis=1)
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
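One caveat with the concat variant: concat with axis=1 aligns rows on the index, and the new frame gets a default 0..n-1 index. The sketch below assumes a hypothetical frame whose index is not the default one (for example after filtering) and passes index=df.index so the rows stay lined up. The name out is only illustrative:

```python
import pandas as pd

# Hypothetical frame whose index is not 0..n-1 (e.g. after filtering).
df = pd.DataFrame(
    {"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]},
    index=[10, 11],
)

# Reusing df.index keeps the new rows lined up with the original ones.
df1 = pd.DataFrame(
    [x.split(",") for x in df["A"].tolist()],
    columns=["country", "code", "com"],
    index=df.index,
)

# axis=1 places the columns side by side, matching rows by index label.
out = pd.concat([df, df1], axis=1)
```

Without index=df.index, the labels 10, 11 and 0, 1 would not match, and the result would contain extra rows filled with NaN.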