I have a list of strings and I want to keep only the most unique strings. Here is how I have implemented this (maybe there's an issue with the loop):
def filter
The problem with your logic is that each time you delete an item from the list, the remaining items shift down by one index, so the next increment of j skips the item that moved into the deleted slot. For example:
Assume that this is the list: description = ["A","A","A","B","C"]
Iteration 1:
i=0 -------------0
description[i]="A"
j=i+1 -------------1
description[j]="A"
similarity_ratio>0.6
del description[j]
Now the list is re-indexed to: description = ["A","A","B","C"]. The next step is:
j=j+1 ------------1+1=2
description[2]="B"
You have skipped description[1]="A", which is never compared against description[0].
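To fix this, replace the unconditional
j+=1
with
j=i+1
when an item was deleted (leaving j unchanged also works, since the next item has already shifted into index j). Otherwise, do the normal j=j+1 increment.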
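Since the original function is cut off, here is a minimal sketch of the fix rather than a patch to your exact code. The names filter_similar and threshold are hypothetical, and similarity_ratio is assumed to be something like difflib's SequenceMatcher ratio; swap in whatever measure you actually use.

```python
from difflib import SequenceMatcher


def similarity_ratio(a, b):
    # Assumption: the original similarity measure is not shown in the question;
    # difflib.SequenceMatcher is used here only as a stand-in.
    return SequenceMatcher(None, a, b).ratio()


def filter_similar(description, threshold=0.6):
    # Hypothetical helper: drops every string that is too similar to an earlier one.
    items = list(description)  # work on a copy so the caller's list is untouched
    i = 0
    while i < len(items):
        j = i + 1
        while j < len(items):
            if similarity_ratio(items[i], items[j]) > threshold:
                del items[j]
                # After a deletion the next item has shifted into index j,
                # so do not advance past it; restart the scan from i + 1.
                j = i + 1
            else:
                j += 1  # nothing was deleted, so move on normally
        i += 1
    return items


print(filter_similar(["A", "A", "A", "B", "C"]))  # prints ['A', 'B', 'C']
```

With the example list from the trace, both duplicate "A" entries are removed, because j no longer jumps over the item that slid into the deleted position.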