Is there a faster (less precise) algorithm than Levenshtein for string distance?

孤城傲影 2021-01-05 17:56

I want to compute the Levenshtein distance, but much faster, because I'm building a real-time application. The computation can terminate as soon as the distance exceeds 10.

3 Answers
  • 2021-01-05 18:38

    The Levenshtein distance metric allows insertion, deletion, and substitution operations. If you're looking for a simpler but less precise metric, you can use the longest common subsequence distance (allows only insertion and deletion) or even the Hamming distance (allows only substitution, so it is defined only for strings of equal length).
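
As a rough illustration of the cheapest option above, here is a minimal Python sketch of a Hamming distance that stops as soon as a limit is exceeded. The function name `hamming_within` and the `limit` parameter are assumptions made for this example, not part of any library.

```python
from typing import Optional


def hamming_within(a: str, b: str, limit: int) -> Optional[int]:
    """Return the Hamming distance if it is <= limit, otherwise None."""
    if len(a) != len(b):
        # Hamming distance only counts substitutions, so lengths must match.
        raise ValueError("Hamming distance needs equal-length strings")
    distance = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            distance += 1
            if distance > limit:
                return None  # stop early: the limit is already exceeded
    return distance
```

This runs in a single pass over the strings, but it cannot account for a shifted character, which is where the loss of precision comes from.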

    However, I recommend optimizing your Levenshtein implementation instead, since it gives the most accurate results of the three.
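
One common way to optimize Levenshtein for this use case is to exploit the cutoff from the question: if only distances up to 10 matter, the inner loop needs to visit only the cells within 10 columns of the diagonal, and the whole computation can stop once a full row exceeds the limit. The following Python sketch shows the idea under those assumptions; the name `levenshtein_within` and the default `limit=10` are chosen for this example.

```python
from typing import Optional


def levenshtein_within(a: str, b: str, limit: int = 10) -> Optional[int]:
    """Return the Levenshtein distance if it is <= limit, otherwise None."""
    if len(a) > len(b):
        a, b = b, a  # keep `a` as the shorter string
    if len(b) - len(a) > limit:
        return None  # the length gap alone already exceeds the limit

    big = limit + 1  # stands in for "more than limit"
    prev = [j if j <= limit else big for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Only cells within `limit` of the diagonal can hold a value <= limit.
        lo = max(1, i - limit)
        hi = min(len(b), i + limit)
        # Allocating a full row keeps the sketch simple; only the band is computed.
        curr = [big] * (len(b) + 1)
        if i <= limit:
            curr[0] = i
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
                big,                 # cap so values never grow past limit + 1
            )
        if min(curr) > limit:
            return None  # every path through this row is already too expensive
        prev = curr

    return prev[len(b)] if prev[len(b)] <= limit else None
```

For example, `levenshtein_within("kitten", "sitting")` returns 3, while `levenshtein_within("kitten", "sitting", 2)` returns `None` because the true distance is above the limit. The result is exact whenever it is not `None`, so no precision is traded away for the speedup.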

  • 2021-01-05 18:42

    If you want to compare UTF-8 content, use Sift4:

    https://siderite.dev/blog/super-fast-and-accurate-string-distance.html

    I also prepared a jsPerf benchmark that shows the performance difference between these libraries: http://jsperf.com/levenshtein-perf

  • 2021-01-05 18:55

    Judging from comments, people seem to be pretty happy with Sift3.

    http://sift.codeplex.com
