I am trying to compare each row with every row in a pandas dataframe using fuzzywuzzy.fuzz.partial_ratio() >= 85
and write the matching rows to a list for each row.
The first step is to find the indices that match the condition for a given name. Since partial_ratio only takes strings, we apply it to the dataframe row by row:
from fuzzywuzzy.fuzz import partial_ratio
name = 'dog'
df.apply(lambda row: (partial_ratio(row['name'], name) >= 85), axis=1)
This returns a boolean Series. We can then use enumerate and a list comprehension to collect the positions of its True values (row positions, which equal the index labels only for a default integer index):
matches = df.apply(lambda row: (partial_ratio(row['name'], name) >= 85), axis=1)
[i for i, x in enumerate(matches) if x]
Let's put all this inside a function:
def func(name):
    matches = df.apply(lambda row: (partial_ratio(row['name'], name) >= 85), axis=1)
    return [i for i, x in enumerate(matches) if x]
We can now apply the function to the entire dataframe:
df.apply(lambda row: func(row['name']), axis=1)
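For reference, the same all-pairs logic can be sketched without pandas, which makes it easy to check on a small list. This is only a sketch under assumptions: all_matches and window_ratio are hypothetical names, the sample names are made up, and window_ratio is a rough difflib-based stand-in that will not always give the same scores as partial_ratio. In real use you would pass fuzzywuzzy's partial_ratio as the scorer.

```python
from difflib import SequenceMatcher


def window_ratio(a, b):
    """Rough stand-in for partial_ratio (assumption, not fuzzywuzzy's code):
    best 0-100 score of the shorter string against every same-length
    window of the longer string."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return 0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        best = max(best, SequenceMatcher(None, short, window).ratio())
    return round(100 * best)


def all_matches(names, scorer=window_ratio, threshold=85):
    """For each name, list the positions of all names scoring >= threshold."""
    return [
        [j for j, other in enumerate(names) if scorer(name, other) >= threshold]
        for name in names
    ]


# Made-up sample data standing in for df['name'].tolist()
names = ["dog", "cat", "mad cat", "good dog", "bad dog", "chicken"]
result = all_matches(names)
print(result[0])  # positions matching "dog": [0, 3, 4]
```

Like the nested apply above, this calls the scorer once per pair of rows, so the cost grows quadratically with the number of rows. Every row also matches itself; filter out j == i inside the comprehension if that is not wanted.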