How to correct user input (kind of like Google's “Did you mean?”)

粉色の甜心 2021-01-30 23:50

I have the following requirement:

I have many (say 1 million) values (names). The user will type a search string.

I don't expect the user to spell the names c

8 answers
  • 2021-01-31 00:31

    The Bitap algorithm is designed to find approximate matches in a body of text, where "approximate" is measured by Levenshtein distance. Maybe you could use it to calculate probable matches.
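    To make the Levenshtein distance idea concrete, here is a minimal sketch in Python. It is not Bitap itself, just the plain dynamic-programming distance plus a linear scan over candidate names. The function names `levenshtein` and `suggest` and the `max_distance` threshold are my own assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def suggest(query: str, names: list[str], max_distance: int = 2) -> list[str]:
    """Return the names within max_distance of query, closest first.

    max_distance is a hypothetical tuning knob, not a recommended value.
    """
    scored = [(levenshtein(query.lower(), name.lower()), name) for name in names]
    return [name for distance, name in sorted(scored) if distance <= max_distance]


if __name__ == "__main__":
    print(suggest("Katherin", ["Karen", "Kathryn", "Catherine", "Katherine"]))
```

    A linear scan like this compares the query against every name, so for something like a million names it would probably need an index or a precomputed dictionary in front of it.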

    (Update: having read Ben S's answer, I agree that using an existing solution, possibly Aspell, is the way to go.)


    As others have said, Google does autocorrection by watching users correct themselves. If I search for "someting" (sic) and then immediately for "something", it is very likely that the first query was misspelled. A possible heuristic to detect this would be:

    • If a user has done two searches within a short time window, and
    • the first query did not yield any results (or the user did not click on anything), and
    • the second query did yield useful results, and
    • the two queries are similar (have a small Levenshtein distance),

    then the second query is a possible refinement of the first query which you can store and present to other users.

    Note that you probably need a lot of queries to gather enough data for these suggestions to be useful.
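    The heuristic above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the `Search` record and its fields, the 30-second window, the distance threshold of 2, and treating "useful results" as "returned results and got a click".

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Search:
    """One logged search (hypothetical log format)."""
    user_id: str
    query: str
    timestamp: float   # seconds
    result_count: int
    clicked: bool


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a single rolling row."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        diagonal, row[0] = row[0], i
        for j, cb in enumerate(b, 1):
            diagonal, row[j] = row[j], min(
                row[j] + 1, row[j - 1] + 1, diagonal + (ca != cb))
    return row[-1]


def is_refinement(first: Search, second: Search,
                  max_gap_seconds: float = 30.0, max_distance: int = 2) -> bool:
    """Apply the four conditions from the list above to two searches."""
    close_in_time = 0 <= second.timestamp - first.timestamp <= max_gap_seconds
    first_failed = first.result_count == 0 or not first.clicked
    second_useful = second.result_count > 0 and second.clicked
    similar = 0 < edit_distance(first.query, second.query) <= max_distance
    return (first.user_id == second.user_id and close_in_time
            and first_failed and second_useful and similar)


def collect_corrections(log: list[Search]) -> Counter:
    """Count (misspelling, correction) pairs over consecutive searches per user."""
    counts: Counter = Counter()
    last_by_user: dict[str, Search] = {}
    for search in sorted(log, key=lambda s: s.timestamp):
        previous = last_by_user.get(search.user_id)
        if previous is not None and is_refinement(previous, search):
            counts[(previous.query, search.query)] += 1
        last_by_user[search.user_id] = search
    return counts
```

    A pair would only be worth showing as a suggestion once its count is high enough, which is where the need for a lot of query data comes in.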

  • 2021-01-31 00:33

    I would consider using a pre-existing solution for this.

    Aspell with a custom dictionary of the names might be well suited for this. Generating the dictionary file will pre-compute all the information required to quickly give suggestions.
