Solr: How to search ñ and Ñ with normal char N and vice versa

Submitted by …衆ロ難τιáo~ on 2019-12-13 12:17:06

Question


How can we map non-ASCII characters to their ASCII equivalents?

Example: our Solr index contains words with the characters ñ and Ñ [LATIN CAPITAL LETTER N WITH TILDE] as well as the plain n and N. Which filter or tokenizer should we use so that a search with a plain N or with Ñ matches both forms?


Answer 1:


Merging the answers to the question "Solr, Special Chars, and Latin to Cyrilic char conversion":

  1. Take a look at Solr's Analyzers, Tokenizers, and Token Filters documentation, which gives a good introduction to the kind of manipulation you're looking for.
  2. The ASCIIFoldingFilterFactory probably does exactly what you want.
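
As a rough sketch of the second point, a field type using the ASCIIFoldingFilterFactory could look like the one below. The field type name text_folded is made up for illustration, and the LowerCaseFilterFactory is my own addition so that N and n also match each other. Adjust both to your schema.

```xml
<!-- Hypothetical field type name; not taken from the question. -->
<fieldType name="text_folded" class="solr.TextField">
    <!-- No "type" attribute: the same chain runs at index and query time -->
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <!-- Assumption: lowercase tokens so that N and n are treated alike -->
        <filter class="solr.LowerCaseFilterFactory" />
        <!-- Fold non-ASCII Latin characters to ASCII, e.g. ñ becomes n -->
        <filter class="solr.ASCIIFoldingFilterFactory" />
    </analyzer>
</fieldType>
```

With this chain, both the indexed text and the query text should end up with a plain n, so searching with n, N, ñ or Ñ would hit the same terms.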

When you change an analyzer to remove the accents, keep in mind that you need to reindex. Otherwise the accented terms stay in the index as they are, while every query is folded to unaccented characters, so no user input can ever match them.

Update

I tried the ICUFoldingFilterFactory, and it works fine with those accents. If it turns out to be tricky to set up, have a look at the SO question "Can not use ICUTokenizerFactory in Solr".

This analyzer

<fieldType name="spanish" class="solr.TextField">
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.ICUFoldingFilterFactory" />
    </analyzer>
</fieldType>

gave me the analysis results shown in this screenshot from the Solr admin UI:

[Screenshot: Solr admin analysis results]

Source: https://stackoverflow.com/questions/22714285/solr-how-to-search-%c3%b1-and-%c3%91-with-normal-char-n-and-vice-verse
