Training Solr to recognize nicknames or name variants

Submitted by 被刻印的时光 ゝ on 2019-12-05 06:34:49

Question


I'm pretty sure that Solr can be set up to recognize synonyms during searches. I'm wondering if it's possible to do the same with nicknames, so that, for example, a search for "Robert" would pull up records with "Bob" in them.


Answer 1:


Just found a page where someone named Jon Moniaci explains exactly how to do this: http://bitsandpieces.jonmoniaci.com/2010/05/searching-common-nicknames-in-solr/

Basically, create a synonyms file with lines like so:

Bob, Robert, Bobby

(Jon's file is here, derived from the listing of common male and female nicknames on http://usefulenglish.ru/)

Save the file as english_names.txt in your Solr conf directory, next to the schema, and add the following field type to the schema:

<fieldType name="textEnglishName" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: no synonym expansion; names are split, folded to ASCII, and lowercased -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <!-- Query time: expand nicknames from english_names.txt before the remaining filters run -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="english_names.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
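
On newer Solr releases, solr.SynonymFilterFactory is deprecated in favor of the graph-aware solr.SynonymGraphFilterFactory, which takes the same attributes. The fragment below is only a sketch of how the query analyzer might look under that assumption. It leaves out the word-delimiter filter for brevity, which changes how names with internal punctuation or case changes are split, so check it against the reference guide for your Solr version before relying on it.

```xml
<!-- Sketch only: assumes a Solr version that ships SynonymGraphFilterFactory -->
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- Graph-aware replacement for solr.SynonymFilterFactory; same file and attributes -->
  <filter class="solr.SynonymGraphFilterFactory" synonyms="english_names.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```

Because the synonyms are applied only on the query side, the index analyzer does not need a synonym filter in either variant.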

Then declare the name field as a textEnglishName field:

<fields>
  <field name="name" type="textEnglishName" indexed="true" stored="false"/>
</fields>
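
A quick way to check the setup is to index a record that contains only the nickname and then search for the formal name. The sketch below uses Solr's XML update format. The id field and its value are assumptions for illustration and are not part of the schema snippet above.

```xml
<add>
  <!-- Hypothetical test document: assumes the schema also defines an "id" field -->
  <doc>
    <field name="id">person-1</field>
    <field name="name">Bob Smith</field>
  </doc>
</add>
```

After a commit, a query such as name:Robert should match this record: with the line "Bob, Robert, Bobby" in english_names.txt, the query analyzer expands Robert to Bob, Robert, and Bobby and lowercases them, and the index holds the token bob. Since the synonyms are applied only at query time, later edits to english_names.txt should take effect after a core reload, without reindexing.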


Source: https://stackoverflow.com/questions/17550787/training-solr-to-recognize-nicknames-or-name-variants
