For example, we have a CSV file with the following contents:
name age address
john 25 koramangala banglore #@ sales maneger
Pass a regular expression as the sep argument of read_csv (regex separators need engine='python'):
import io
import pandas as pd

t = """name ,age , address
john,25,koramangala banglore +ACMAQA- sales maneger +ACUAJA- india
harshuth rao ,36,belandur banglore +ACMAQA- maneger +ACUAJA- india
vijay kumar,45,ulsoor banglore +ACMAQA- sales maneger +ACUAJA- india
suhas,25,koramangala banglore +ACMAQA-analist +ACUAJA- india
mithun,22,venkatapura banglore +ACMAQA- execitive +ACUAJA- india"""
# Split on either marker or on a comma, swallowing surrounding whitespace.
df = pd.read_csv(io.StringIO(t),
                 sep=r'\s*\+ACMAQA-\s*|\s*\+ACUAJA-\s*|\s*,\s*', engine='python')
# The header names 3 columns but each row splits into 5 fields, so pandas
# moves the extra leading fields into the index; reset_index() restores them.
df = df.reset_index()
df.columns = ["name", "age", "city", "position", "country"]
           name  age                  city       position  country
0          john   25  koramangala banglore  sales maneger    india
1  harshuth rao   36     belandur banglore        maneger    india
2   vijay kumar   45       ulsoor banglore  sales maneger    india
3         suhas   25  koramangala banglore        analist    india
4        mithun   22  venkatapura banglore      execitive    india
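To see what that separator does on its own, here is a minimal sketch that applies the same pattern to a single row with the standard re module. The names SEP, line and fields are only illustrative.

```python
import re

# Same alternation as the `sep` argument above: either marker or a comma,
# each with optional whitespace on both sides.
SEP = r'\s*\+ACMAQA-\s*|\s*\+ACUAJA-\s*|\s*,\s*'

line = "suhas,25,koramangala banglore +ACMAQA-analist +ACUAJA- india"
fields = re.split(SEP, line)
print(fields)
# ['suhas', '25', 'koramangala banglore', 'analist', 'india']
```

Each row yields five fields, which is why five column names are assigned afterwards.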
First, load your data using pd.read_csv:
import pandas as pd
df = pd.read_csv("/home/vipul/Desktop/example.csv", sep=',')
print(df)
name age address
0 john 25 koramangala banglore +ACMAQA- sales maneger +A...
1 harshuth rao 36 belandur banglore +ACMAQA- maneger +ACUAJA- i...
2 vijay kumar 45 ulsoor banglore +ACMAQA- sales maneger +ACUAJA...
3 suhas 25 koramangala banglore +ACMAQA-analist +ACUAJA- ...
4 mithun 22 venkatapura banglore +ACMAQA- execitive +ACUAJ...
Next, use str.split to separate the address column into parts, then pd.concat to join them with the original DataFrame:
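# Split on any "+...-" marker, swallowing the whitespace around it.
v = df.pop('address').str.split(r'\s*\+.*?-\s*', expand=True)
v.columns = ['city', 'position', 'country']
# Pass axis by keyword; the positional form is not accepted by newer pandas.
df = pd.concat([df, v], axis=1)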
print(df)
           name  age                  city       position  country
0          john   25  koramangala banglore  sales maneger    india
1  harshuth rao   36     belandur banglore        maneger    india
2   vijay kumar   45       ulsoor banglore  sales maneger    india
3         suhas   25  koramangala banglore        analist    india
4        mithun   22  venkatapura banglore      execitive    india
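If you want to try the split and concat steps without the file on disk, the following is a rough sketch on a small in-memory DataFrame. The two rows are copied from the sample data, and the names parts and result are only illustrative.

```python
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file (two rows only).
df = pd.DataFrame({
    "name": ["john", "suhas"],
    "age": [25, 25],
    "address": [
        "koramangala banglore +ACMAQA- sales maneger +ACUAJA- india",
        "koramangala banglore +ACMAQA-analist +ACUAJA- india",
    ],
})

# Remove the address column and split it on any "+...-" marker.
parts = df.pop("address").str.split(r"\s*\+.*?-\s*", expand=True)
parts.columns = ["city", "position", "country"]

# Join the new columns back onto the remaining ones, column-wise.
result = pd.concat([df, parts], axis=1)
print(result)
```

Note that the lazy pattern matches any text running from a plus sign to the next hyphen, so it assumes the address itself never contains such a sequence.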
Finally, save to CSV:
df.to_csv("/home/vipul/Desktop/new.csv")
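By default, to_csv also writes the row index as an unnamed first column. If you do not want it in the output, one option is index=False. This is a small sketch that writes to an in-memory buffer instead of the path above; the one-row DataFrame is made up for illustration.

```python
import io
import pandas as pd

# Illustrative one-row frame standing in for the real result.
df = pd.DataFrame({"name": ["john"], "age": [25],
                   "city": ["koramangala banglore"]})

buf = io.StringIO()
df.to_csv(buf, index=False)  # omit the 0, 1, 2, ... index column
text = buf.getvalue()

# Reading the text back gives the same three columns, with no extra one.
back = pd.read_csv(io.StringIO(text))
print(list(back.columns))
```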