Levenshtein alternative


I have a big set of queries and use Levenshtein distance to catch typos. Now the Levenshtein calculation makes MySQL use the full CPU time. My query is a fulltext search + levenshtein in a UNIO


1 answer

灰色年华

2021-01-12 10:49

If you are tied to MySQL alone, there is no easy solution.

Usually this is solved by using a specialized ngram index to look up and filter candidates quickly, and then calculating Levenshtein distance only on something like 10-50 candidates, which is much faster than calculating it for all pairs.
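To make the two-stage idea concrete, here is a minimal in-memory sketch in Python. It is not tied to any database: the function names (`trigrams`, `build_index`, `suggest`), the padding scheme, and the limits of 50 candidates and distance 2 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def trigrams(word):
    """Return the set of character trigrams of a word, padded at both ends."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def build_index(words):
    """Map each trigram to the set of words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in trigrams(word):
            index[gram].add(word)
    return index


def suggest(query, index, max_candidates=50, max_distance=2):
    """Shortlist words by shared trigrams, then rank the shortlist by edit distance."""
    shared = defaultdict(int)
    for gram in trigrams(query):
        for word in index.get(gram, ()):
            shared[word] += 1
    # Stage 1: keep only the words sharing the most trigrams with the query.
    shortlist = sorted(shared, key=lambda w: (-shared[w], w))[:max_candidates]
    # Stage 2: run the expensive distance only on that shortlist.
    scored = [(levenshtein(query.lower(), w.lower()), w) for w in shortlist]
    return [w for d, w in sorted(scored) if d <= max_distance]


if __name__ == "__main__":
    index = build_index(["levenshtein", "lucene", "postgresql", "mysql", "fulltext"])
    print(suggest("levensthein", index))
```

The expensive `levenshtein` call runs only on words that already share at least one trigram with the query, so its cost depends on the shortlist size rather than on the size of the whole vocabulary.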

Specialized fulltext search engines like Solr/Lucene have this built in.

PostgreSQL has the pg_trgm contrib module (http://www.postgresql.org/docs/9.0/static/pgtrgm.html), which works like a charm.

You can even simulate this in MySQL using fulltext indexing, but you have to collect the words from all your documents, convert them to ngrams, create fulltext indexes on them, and hack it all together for fast lookup. That brings all sorts of trouble with redundancy and syncing, so it is not worth your time.
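For a sense of what that simulation involves, here is a rough Python sketch that turns each word into a space-separated string of trigrams for a fulltext-indexed column. The table `word_ngrams`, its columns `word` and `grams`, and the `%s` placeholder style are hypothetical, not taken from the question. The server's minimum token size settings (`innodb_ft_min_token_size`, `ft_min_word_len`) would also have to admit 3-character tokens.

```python
def ngram_text(word, n=3):
    """Return the word's overlapping character n-grams as one space-separated string."""
    word = word.lower()
    if len(word) <= n:
        return word
    return " ".join(word[i:i + n] for i in range(len(word) - n + 1))


# Hypothetical schema: word_ngrams(word, grams) with a FULLTEXT index on grams.
INSERT_SQL = "INSERT INTO word_ngrams (word, grams) VALUES (%s, %s)"
LOOKUP_SQL = (
    "SELECT word, MATCH(grams) AGAINST (%s) AS score "
    "FROM word_ngrams WHERE MATCH(grams) AGAINST (%s) "
    "ORDER BY score DESC LIMIT 50"
)


def insert_params(word):
    """Parameters for INSERT_SQL: the word itself plus its n-gram string."""
    return (word, ngram_text(word))


def lookup_params(query):
    """Parameters for LOOKUP_SQL: the query's n-gram string, once per placeholder."""
    grams = ngram_text(query)
    return (grams, grams)


if __name__ == "__main__":
    print(insert_params("mysql"))
    print(lookup_params("mysq"))
```

The rows returned by the lookup would then be the 10-50 candidates to run Levenshtein on. Keeping `word_ngrams` in step with the source documents is the redundancy and sync burden mentioned above.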
