Alternative to Levenshtein and Trigram

春和景丽 2021-02-07 09:48

Say I have the following two strings in my database:

(1) 'Levi Watkins Learning Center - Alabama State University'
(2) 'ETH Library'

My sof

6 answers

被撕碎了的回忆 (original poster)

2021-02-07 10:37

You can try the normalized Levenshtein distance:

Li Yujian, Liu Bo, "A Normalized Levenshtein Distance Metric," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1091-1095, June 2007, doi:10.1109/TPAMI.2007.1078 http://www.computer.org/csdl/trans/tp/2007/06/i1091-abs.html

They propose normalizing the Levenshtein distance. With normalization, a one-character difference between sequences of length two weighs more than the same difference between sequences of length 10.
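
For illustration, here is a minimal Python sketch of the idea. It assumes unit costs for insertion, deletion, and substitution, in which case the normalization from the paper reduces, as far as I understand it, to 2·d / (|a| + |b| + d), where d is the plain Levenshtein distance. The function names `levenshtein` and `normalized_levenshtein` are my own and do not come from the paper or from any library.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance with unit costs (two-row dynamic programming)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Normalized distance in [0, 1]; assumes unit edit costs (see note above)."""
    if not a and not b:
        return 0.0  # two empty strings are identical
    d = levenshtein(a, b)
    return 2.0 * d / (len(a) + len(b) + d)
```

With this sketch, `normalized_levenshtein("ab", "ac")` comes out larger than `normalized_levenshtein("abcdefghij", "abcdefghiX")`, even though both pairs differ by a single character, which is the length-sensitive behaviour described above.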
