I have two dataframes, df1 and df2.
df1 = pd.DataFrame({'Name': ['Adam Smith', 'Anne Kim', 'John Weber', 'Ian Ford'],
I am using fuzzywuzzy here:
from fuzzywuzzy import process
# For each name in df2, take the closest name in df1 and store it as the join key
df2['key'] = df2.Name.apply(lambda x: process.extractOne(x, df1.Name)[0])
df2.merge(df1, left_on='key', right_on='Name')
Out[1238]:
       Name_x gender         key  Age      Name_y
0  adam Smith      M  Adam Smith   43  Adam Smith
1   Annie Kim      F    Anne Kim   21    Anne Kim
2  John Weber      M  John Weber   55  John Weber
3    Ian Ford      M    Ian Ford   24    Ian Ford
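One thing to watch with this approach: taking the single best match always returns something, even for a name that has no real counterpart. Below is a minimal sketch of the same idea with a similarity cutoff. It swaps in the standard library's difflib for fuzzywuzzy so it runs without extra packages. The helper name best_match and the 0.8 cutoff are assumptions for illustration, and the name lists are copied from the output above.

```python
from difflib import get_close_matches

# Names as they appear in the two frames above.
df1_names = ['Adam Smith', 'Anne Kim', 'John Weber', 'Ian Ford']
df2_names = ['adam Smith', 'Annie Kim', 'John Weber', 'Ian Ford']

def best_match(name, choices, cutoff=0.8):
    """Return the closest choice, or None when nothing clears the cutoff.

    Hypothetical helper; the cutoff value is an assumption, tune it for your data.
    """
    # Compare in lower case so capitalization differences do not lower the score.
    lowered = {c.lower(): c for c in choices}
    hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None

keys = [best_match(n, df1_names) for n in df2_names]
print(keys)
```

Names that return None can then be reviewed by hand instead of being silently joined to the wrong row.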
Not sure whether a fuzzy match is what you are looking for. Maybe convert every name to title case instead?
df1.Name = df1.Name.str.title()
df2.Name = df2.Name.str.title()
pd.merge(df1, df2, how='inner', on='Name')
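Title-casing only repairs capitalization, so it is worth checking which names would still fail to join before relying on the inner merge. A quick sketch in plain Python, assuming the name lists shown in the question and in the first answer's output:

```python
# Names as they appear in the two frames (taken from the example data).
df1_names = ['Adam Smith', 'Anne Kim', 'John Weber', 'Ian Ford']
df2_names = ['adam Smith', 'Annie Kim', 'John Weber', 'Ian Ford']

# Normalize df1's names the same way the merge key is normalized.
known = {n.title() for n in df1_names}

# Names in df2 that still have no exact counterpart after title-casing.
unmatched = [n for n in df2_names if n.title() not in known]
print(unmatched)  # ['Annie Kim']
```

With this data the inner merge would return three rows: 'adam Smith' is fixed by title-casing, while 'Annie Kim' differs from 'Anne Kim' in spelling and is dropped.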