I need to write a strict regular expression
to replace certain values in my pandas
dataframe. This is an issue that was raised after solving the questi
Add \b
for word boundaries to each key of dict
:
d = {'UK': 'United Kingdom', 'LA': 'Los Angeles', 'NYC': 'New York City', 'NY' : 'New York'}
data = {'Categories': ['animal','plant','object'],
'Type': ['tree','dog','rock'],
'Comment': ['The NYC tree is very big', 'NY The cat from the UK is small',
'The rock was found in LA.']
}
d = {r'\b' + k + r'\b':v for k, v in d.items()}
df = pd.DataFrame(data)
df['commentTest'] = df['Comment'].replace(d, regex=True)
print (df)
Categories Comment Type \
0 animal The NYC tree is very big tree
1 plant NY The cat from the UK is small dog
2 object The rock was found in LA. rock
commentTest
0 The New York City tree is very big
1 New York The cat from the United Kingdom is small
2 The rock was found in Los Angeles.