Getting the closest string match

后端 未结 13 711
难免孤独
难免孤独 2020-11-22 10:57

I need a way to compare multiple strings to a test string and return the string that closely resembles it:

TEST STRING: THE BROWN FOX JUMPED OVER THE RED COW         


        
相关标签:
13条回答
  • 2020-11-22 11:36

    If you're doing this in the context of a search engine or frontend against a database, you might consider using a tool like Apache Solr, with the ComplexPhraseQueryParser plugin. This combination allows you to search against an index of strings with the results sorted by relevance, as determined by Levenshtein distance.

    We've been using it against a large collection of artists and song titles when the incoming query may have one or more typos, and it's worked pretty well (and remarkably fast considering the collections are in the millions of strings).

    Additionally, with Solr, you can search against the index on demand via JSON, so you won't have to reinvent the solution between the different languages you're looking at.

    0 讨论(0)
提交回复
热议问题