create new column in dataframe using fuzzywuzzy

Posted by 半腔热情 on 2019-12-19 21:17:34

Question


I have a pandas DataFrame, and I am using the fuzzywuzzy package in Python to match the first column against the second column.

I have defined a function that should return the first column, the second column, and a partial ratio score for each row, but it is not working.

Could you please help?

import csv
import sys
import os
import numpy as np
import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

def match(driver):
    driver["score"]=driver.apply(lambda row: fuzz.partial_ratio(row driver[driver.columns[0]], driver[driver.columns[1]]), axis=1)
    print(driver)
    return(driver)

Regards

-Abacus


Answer 1:


With axis=1, the function you give to apply is passed a Series representing the current row. Your code effectively ignores this Series and tries to call partial_ratio with the two whole columns of the DataFrame each time (driver[col]).

A minor change should give you what you want: index the row Series instead of the DataFrame.

d = pd.DataFrame({'one': ['fuzz', 'wuzz'], 'two': ['fizz', 'woo']})

d.apply(lambda s: fuzz.partial_ratio(s['one'], s['two']), axis=1)

0    75
1    33
dtype: int64

(Interestingly, the partial_ratio function will accept a Series as input, but only because it converts it internally into a string. :)
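
As a rough sketch of how the fix could be folded back into a reusable function, the example below adds a score column by indexing each row rather than the whole DataFrame. The names add_score and simple_ratio are hypothetical and not part of the original question or of fuzzywuzzy. So that it runs without fuzzywuzzy installed, it defaults to a stand-in scorer built on the standard library's difflib, which is not the same algorithm as fuzz.partial_ratio and will give different numbers.

```python
from difflib import SequenceMatcher

import pandas as pd


def simple_ratio(a, b):
    """Hypothetical stand-in scorer returning an integer from 0 to 100.

    Built on difflib; NOT the same algorithm as fuzz.partial_ratio.
    """
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def add_score(df, left, right, scorer=simple_ratio, out="score"):
    """Return a copy of df with a new column scoring left vs. right per row."""
    result = df.copy()
    # axis=1 passes each row to the lambda as a Series indexed by column name,
    # so row[left] and row[right] are single cell values, not whole columns.
    # str() guards against non-string cells (a NaN becomes the text 'nan').
    result[out] = result.apply(
        lambda row: scorer(str(row[left]), str(row[right])), axis=1
    )
    return result


d = pd.DataFrame({"one": ["fuzz", "wuzz"], "two": ["fizz", "woo"]})
scored = add_score(d, "one", "two")
print(scored)

# With fuzzywuzzy installed, its scorer can be passed in instead:
# from fuzzywuzzy import fuzz
# scored = add_score(d, "one", "two", scorer=fuzz.partial_ratio)
```

Taking the scorer as a parameter keeps the row-wise logic separate from the choice of similarity measure, and working on a copy leaves the caller's DataFrame unchanged.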



Source: https://stackoverflow.com/questions/36138886/create-new-column-in-dataframe-using-fuzzywuzzy
