Question
Getting started with Pandas.
I have two columns:
A           B
Something   Something Else
Everything  Evythn
Someone     Cat
Everyone    Evr1
I want to calculate the fuzz ratio between the two columns for each row, so the output would look something like this:
A           B               Ratio
Something   Something Else  12
Everything  Evythn          14
Someone     Cat             10
Everyone    Evr1            20
How can I accomplish this? Both columns are in the same df.
Answer 1:
Use a lambda function with DataFrame.apply:
from fuzzywuzzy import fuzz
df['Ratio'] = df.apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
# alternative with a list comprehension
# df['Ratio'] = [fuzz.ratio(a, b) for a, b in zip(df.A, df.B)]
print(df)
            A               B  Ratio
0   Something  Something Else     78
1  Everything          Evythn     75
2     Someone             Cat      0
3    Everyone            Evr1     50
EDIT:
If the columns may contain missing values, the code above fails, so add DataFrame.dropna to skip those rows:
print(df)
            A               B
0   Something  Something Else
1  Everything             NaN
2     Someone             Cat
3    Everyone            Evr1
from fuzzywuzzy import fuzz
df['Ratio'] = df.dropna(subset=['A', 'B']).apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
print(df)
            A               B  Ratio
0   Something  Something Else   78.0
1  Everything             NaN    NaN
2     Someone             Cat    0.0
3    Everyone            Evr1   50.0
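Another way to cope with missing values, instead of dropping rows first, is to wrap the scorer so it returns NaN whenever either input is missing. This also works with the list-comprehension form. The sketch below is only an illustration under a few assumptions: `is_missing` and `safe_ratio` are hypothetical helper names, plain lists stand in for the two DataFrame columns, and `difflib.SequenceMatcher` from the standard library stands in for fuzz.ratio so the snippet runs without fuzzywuzzy installed (scores from fuzzywuzzy can differ slightly, depending on its backend).

```python
import math
from difflib import SequenceMatcher


def is_missing(value):
    """Return True for None or a float NaN (how pandas marks missing text)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def safe_ratio(a, b, scorer=None):
    """Score two strings on a 0-100 scale, or return NaN if either is missing.

    `scorer` is any function taking two strings, e.g. fuzz.ratio.
    The default is a difflib-based stand-in (an assumption, not fuzzywuzzy).
    """
    if is_missing(a) or is_missing(b):
        return float("nan")
    if scorer is None:
        return round(100 * SequenceMatcher(None, a, b).ratio())
    return scorer(a, b)


# Plain lists standing in for df.A and df.B from the example above.
col_a = ["Something", "Everything", "Someone", "Everyone"]
col_b = ["Something Else", float("nan"), "Cat", "Evr1"]

ratios = [safe_ratio(a, b) for a, b in zip(col_a, col_b)]
print(ratios)  # [78, nan, 0, 50]
```

With a DataFrame, the same helper could be applied as df['Ratio'] = [safe_ratio(a, b, fuzz.ratio) for a, b in zip(df.A, df.B)], which leaves NaN in the rows that have a missing value.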
Source: https://stackoverflow.com/questions/59631258/how-do-i-calculate-fuzz-ratio-between-two-columns