How to implement wildcard search with Sunspot

Submitted by 老子叫甜甜 on 2019-12-14 01:55:19

Question


I am using Sunspot with Solr, but I have not found a good way to perform a wildcard search with Sunspot. Any help is welcome.

If I search for 8088***

it should return all numbers that start with 8088, but not 228088560.


Answer 1:


Look for the following lines of code in /solr/conf/schema.xml:

<fieldType name="text" class="solr.TextField" omitNorms="false">
    ...
</fieldType>

and replace them with this:

<fieldType name="text" class="solr.TextField" omitNorms="false">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front" />
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

Remember to restart the Solr server and reindex after these changes:

rake sunspot:solr:stop
rake sunspot:solr:start
rake sunspot:reindex



Answer 2:


Sunspot gives you wildcard-style matching for free with NGramTokenizer (although NGramTokenizer sometimes has issues with subsets that are too small, along with other quirks), which means that exclusion is actually the tricky part. If you know the number of digits in the number (say 6), a crude but effective way to handle this is to use without(:field).greater_than(808900) and without(:field).less_than(808799). I don't remember whether .greater_than and .less_than are actually >= and <=; if they are just > and <, you may want to use 808899 and 808800 instead, but you get the idea.
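To make the arithmetic behind those bounds concrete, here is a minimal sketch in plain Ruby. The helper name prefix_range is hypothetical and not part of Sunspot; it only computes the smallest and largest numbers of a fixed length that start with a given prefix. It assumes the field holds integers with a known digit count.

```ruby
# Hypothetical helper (not a Sunspot API): for a known total number of digits,
# return the inclusive range of integers that start with the given prefix.
def prefix_range(prefix, total_digits)
  pad = total_digits - prefix.length
  raise ArgumentError, "prefix is longer than total_digits" if pad.negative?

  low  = (prefix + "0" * pad).to_i  # pad with zeros for the lower bound
  high = (prefix + "9" * pad).to_i  # pad with nines for the upper bound
  low..high
end

range = prefix_range("8088", 6)
puts range.first  # 808800
puts range.last   # 808899
```

Everything outside this range is what the without restrictions need to exclude. Sunspot's README also shows range restrictions of the form with(:field, 808800..808899), which may be a simpler way to express the same thing, assuming the field is indexed as an integer.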

Correction: there is a solution for this. You can change the NGramFilterFactory in your solr/conf/schema.xml to an EdgeNGramFilterFactory (assuming you had an NGramFilterFactory in the first place to get partial-word searching). This makes the index break up words only from the beginning of each string. After this, restart your server and reindex.

All credit to Zach Moazeni at Collective Idea for this.
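The difference between the two filters can be illustrated with a small plain-Ruby simulation. This is only a sketch of the idea, not Solr's actual implementation: the helpers ngrams and edge_ngrams are hypothetical, and the minimum and maximum gram sizes of 1 and 20 are assumed values.

```ruby
# Hypothetical helper: every substring of length min_gram..max_gram,
# roughly what an n-gram filter would index for one token.
def ngrams(token, min_gram: 1, max_gram: 20)
  (min_gram..[token.length, max_gram].min).flat_map do |n|
    (0..token.length - n).map { |start| token[start, n] }
  end
end

# Hypothetical helper: only the substrings anchored at the start of the token,
# roughly what a front-side edge n-gram filter would index.
def edge_ngrams(token, min_gram: 1, max_gram: 20)
  (min_gram..[token.length, max_gram].min).map { |n| token[0, n] }
end

numbers = %w[8088560 8088123 228088560]
query   = "8088"

# The query term is not split into grams; it matches when it equals an indexed gram.
ngram_hits = numbers.select { |n| ngrams(n).include?(query) }
edge_hits  = numbers.select { |n| edge_ngrams(n).include?(query) }

p ngram_hits  # => ["8088560", "8088123", "228088560"]
p edge_hits   # => ["8088560", "8088123"]
```

With plain n-grams, 228088560 matches because "8088" occurs in the middle of the token. With edge n-grams, only tokens that begin with "8088" match, which is the behavior the question asks for.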



Source: https://stackoverflow.com/questions/8354062/how-to-implement-wildcard-search-with-sunspot
