Searching names with Apache Solr

抹茶落季 2020-12-12 21:20

I've just ventured into the seemingly simple but extremely complex world of searching. For an application, I am required to build a search mechanism for searching users by

5 answers
  •  时光说笑
    2020-12-12 21:49

    It sounds like you need searches over your corpus to match very loosely.

    If so, you will want to choose your fields and give them different boosts to rank your results.

    So have separate "copied" fields in Solr:

    • one field for the exact full name (with filters)
    • a multivalued field with the filters ASCIIFolding, Lowercase...
    • a multivalued field with SynonymFilterFactory plus ASCIIFolding, Lowercase...
    • a field with PhoneticFilterFactory (with Caverphone or Double-Metaphone)
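
    As a rough sketch of how boosts across such fields might be expressed at query time, the snippet below builds request parameters for Solr's eDisMax query parser, whose `qf` parameter takes `field^boost` pairs. The field names (`name_exact`, `name_folded`, `name_synonym`, `name_phonetic`) and the boost values are assumptions made up for illustration; they are not from the question and would need to match your own schema and tuning.

```python
from urllib.parse import urlencode

# Hypothetical field names and boosts, one per analysis chain listed above.
# Stricter matches get higher boosts so they rank first.
FIELD_BOOSTS = {
    "name_exact": 10.0,
    "name_folded": 5.0,
    "name_synonym": 3.0,
    "name_phonetic": 1.0,
}


def build_name_query(user_input, boosts=FIELD_BOOSTS):
    """Return Solr request parameters that search every field with its boost."""
    # eDisMax expects qf as space-separated "field^boost" entries.
    qf = " ".join(f"{field}^{boost:g}" for field, boost in boosts.items())
    return {"q": user_input, "defType": "edismax", "qf": qf}


params = build_name_query("Carl McDonald")
query_string = urlencode(params)  # ready to append to a /select URL
```

    Keeping the boosts in one mapping makes it easy to adjust them while tuning relevance, without touching the index.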

    See also: more non-English Soundex discussion

    As for synonyms for names, I don't know whether a public synonym database is available.

    I've not found fuzzy searching useful; it is based on Levenshtein distance.
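
    To see why raw edit distance can be a blunt tool for names, here is a minimal sketch of Levenshtein distance in plain Python. It is a textbook dynamic-programming version written for illustration, not the implementation Solr or Lucene uses internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or keep it)
            ))
        previous = current
    return previous[-1]


same_name = levenshtein("carl", "karl")   # spelling variant of one name
unrelated = levenshtein("carl", "cars")   # a different word entirely
```

    Both pairs come out one edit apart, so a fuzzy query that allows a single edit cannot tell the spelling variant from the unrelated word.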

    Other filters and index-time analysis give more relevant search results.

    Unicode characters in names can be handled with the ASCIIFoldingFilterFactory.
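
    For intuition about what folding followed by lowercasing does to a name, here is a small Python approximation. It is only a sketch: it strips combining accents after Unicode decomposition, whereas Solr's filter uses its own mapping table and also covers characters that have no decomposition. The function name `fold_to_ascii` is hypothetical.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Approximate accent folding plus lowercasing for a name."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


folded = [fold_to_ascii(name) for name in ("José", "Zoë", "Müller")]
```

    With this approach a search for "jose" and an indexed "José" end up as the same token. Letters such as "ø" have no Unicode decomposition and pass through this sketch unchanged, which is one reason to rely on the real filter in the index.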

    You are describing solutions up front for expected use cases.

    If you want quality results, plan on tuning your search relevance.

    This tuning will be especially valuable when attempting to match variants such as MacDonald and McDonald or Carl and Karl (each pair is a single edit apart by Levenshtein distance, the same as many unrelated names).
