How do I calculate fuzz ratio between two columns?

Posted by 不想你离开。 on 2020-06-09 04:14:04

Question


I'm just getting started with Pandas.

I have two columns:
A                     B
Something             Something Else
Everything            Evythn
Someone               Cat
Everyone              Evr1

I want to calculate the fuzz ratio between the two columns for each row, so the output would look something like this:

A                     B                  Ratio
Something             Something Else     12
Everything            Evythn             14
Someone               Cat                10
Everyone              Evr1               20

How can I accomplish this? Both columns are in the same DataFrame (df).


Answer 1:


Use a lambda function with DataFrame.apply:

from fuzzywuzzy import fuzz

df['Ratio'] = df.apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
# alternative with a list comprehension
# df['Ratio'] = [fuzz.ratio(a, b) for a, b in zip(df.A, df.B)]
print(df)
            A               B  Ratio
0   Something  Something Else     78
1  Everything          Evythn     75
2     Someone             Cat      0
3    Everyone            Evr1     50
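
To see where these scores come from, here is a minimal sketch using only the standard library. It assumes the pure-Python code path of fuzzywuzzy (no python-Levenshtein installed), where fuzz.ratio is essentially difflib's SequenceMatcher ratio scaled to 0-100 and rounded. The name simple_ratio is hypothetical and not part of fuzzywuzzy.

```python
from difflib import SequenceMatcher


def simple_ratio(s1, s2):
    """Approximate fuzz.ratio: 100 * 2 * matches / (len(s1) + len(s2)), rounded.

    Hypothetical helper for illustration; assumes the difflib-based code path.
    """
    # An empty string on either side scores 0.
    if not s1 or not s2:
        return 0
    matcher = SequenceMatcher(None, s1, s2)
    return int(round(100 * matcher.ratio()))


pairs = [
    ("Something", "Something Else"),
    ("Everything", "Evythn"),
    ("Someone", "Cat"),
    ("Everyone", "Evr1"),
]
scores = [simple_ratio(a, b) for a, b in pairs]
print(scores)  # [78, 75, 0, 50]
```

The comparison is case-sensitive and counts matching characters in order, which is why "Someone" and "Cat" score 0. With python-Levenshtein installed, individual scores may differ slightly.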

EDIT:

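If the columns can contain missing values, the code above fails, so drop those rows first with DataFrame.dropna. The skipped rows get NaN in the new column, which also makes it a float column: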

print(df)
            A               B
0   Something  Something Else
1  Everything             NaN
2     Someone             Cat
3    Everyone            Evr1

from fuzzywuzzy import fuzz

df['Ratio'] = df.dropna(subset=['A', 'B']).apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
print(df)
            A               B  Ratio
0   Something  Something Else   78.0
1  Everything             NaN    NaN
2     Someone             Cat    0.0
3    Everyone            Evr1   50.0
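
Another way to handle missing values, sketched here as an assumption rather than as part of the original answer, is to wrap the scorer so it returns NaN whenever either value is not a string. The names safe_ratio and stand_in_ratio are hypothetical. stand_in_ratio is a difflib-based placeholder so the sketch runs without fuzzywuzzy, and in practice you would pass fuzz.ratio instead.

```python
import math
from difflib import SequenceMatcher


def stand_in_ratio(s1, s2):
    """Hypothetical placeholder scorer; swap in fuzz.ratio in real code."""
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


def safe_ratio(a, b, scorer=stand_in_ratio):
    """Return scorer(a, b), or NaN when either value is not a string."""
    # NaN (a float) and None both fail the isinstance check.
    if not isinstance(a, str) or not isinstance(b, str):
        return math.nan
    return scorer(a, b)


col_a = ["Something", "Everything", "Someone", "Everyone"]
col_b = ["Something Else", float("nan"), "Cat", "Evr1"]

ratios = [safe_ratio(a, b) for a, b in zip(col_a, col_b)]
print(ratios)  # [78, nan, 0, 50]
```

With a DataFrame this would be used as df['Ratio'] = [safe_ratio(a, b, fuzz.ratio) for a, b in zip(df.A, df.B)]. It keeps every row and avoids the row-wise apply.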


Source: https://stackoverflow.com/questions/59631258/how-do-i-calculate-fuzz-ratio-between-two-columns
