发表新帖

发表新帖

Find the similarity metric between two strings

前端未结

关注

 11  1811

长情又很酷 2020-11-22 13:24

How do I get the probability of a string being similar to another string in Python?

I want to get a decimal value like 0.9 (meaning 90%) etc. Preferably with standar

11条回答

盖世英雄少女心 (楼主)

2020-11-22 14:06
Solution #1: Python builtin

use SequenceMatcher from difflib

pros: native python library, no need extra package.
cons: too limited, there are so many other good algorithms for string similarity out there.
example :
```
>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
```
Solution #2: jellyfish library

its a very good library with good coverage and few issues. it supports:
- Levenshtein Distance
- Damerau-Levenshtein Distance
- Jaro Distance
- Jaro-Winkler Distance
- Match Rating Approach Comparison
- Hamming Distance

pros: easy to use, gamut of supported algorithms, tested.
cons: not native library.

example:
```
>>> import jellyfish
>>> jellyfish.levenshtein_distance(u'jellyfish', u'smellyfish')
2
>>> jellyfish.jaro_distance(u'jellyfish', u'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance(u'jellyfish', u'jellyfihs')
1
```
0 讨论(0)

查看其它11个回答
发布评论:

提交评论
- 加载中...

热议问题