I'm using the sunspot_rails gem and everything is working perfectly so far, with one exception: I'm not getting any search results for words that contain a hyphen.
Example: The string "tron" returns a lot of results (the word that appears in all the articles is "e-tron").
The string "e-tron" returns 0 results even though this is the correct word mentioned in all my articles.
My current schema.xml config:
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
What I want: The behaviour for the search string tron is okay of course, but I also want to have the correct matches for the search string e-tron.
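The problem is that solr.StandardTokenizerFactory splits words at hyphens, so "e-tron" generates the tokens "e" and "tron". Presumably "e" is then lost at index time, because the solr.EdgeNGramFilterFactory in the index analyzer is configured with minGramSize="2" and so produces nothing for a one-character token. The query analyzer still emits "e", which has no matching term in the index.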
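To make the token loss concrete, here is a rough Python simulation of the two analyzer chains from the question. It is only a sketch: the regex is a crude stand-in for StandardTokenizerFactory, the function names are made up for illustration, and none of this is Solr's actual implementation.

```python
import re

def standard_tokenize(text):
    # Crude stand-in for solr.StandardTokenizerFactory (assumption):
    # break on anything that is not a word character, including hyphens.
    return re.findall(r"\w+", text)

def edge_ngrams(token, min_gram=2, max_gram=15):
    # Front-edge n-grams. A token shorter than min_gram yields no grams at all.
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def index_analyze(text):
    # Index chain: tokenize, lowercase, then edge n-grams.
    grams = []
    for token in standard_tokenize(text):
        grams.extend(edge_ngrams(token.lower()))
    return grams

def query_analyze(text):
    # Query chain: tokenize and lowercase only.
    return [token.lower() for token in standard_tokenize(text)]

indexed = set(index_analyze("e-tron"))
print(sorted(indexed))
# ['tr', 'tro', 'tron']
print([(term, term in indexed) for term in query_analyze("e-tron")])
# [('e', False), ('tron', True)]
```

In this simulation the query term "e" never finds an indexed counterpart. Whether that leads to zero results depends on whether the query parser requires every term to match, which is an assumption about your setup.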
This is one example configuration that should solve your specific problem.
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
solr.WhitespaceTokenizerFactory splits on whitespace only, so the hyphenated word stays in one piece: ["e-tron"]
solr.WordDelimiterFilterFactory then splits on the hyphen but, with preserveOriginal="1", also keeps the original word: ["e", "tron", "e-tron"]
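The effect of this chain can be sketched with a small Python simulation. Treat it as an approximation under stated assumptions: `word_delimiter` is a hypothetical helper that only splits on non-alphanumeric characters, whereas the real WordDelimiterFilterFactory has further default behaviour (for example splitting on case changes) that is ignored here.

```python
import re

def whitespace_tokenize(text):
    # Stand-in for solr.WhitespaceTokenizerFactory: split on whitespace only.
    return text.split()

def word_delimiter(token):
    # Simplified stand-in for WordDelimiterFilterFactory with preserveOriginal="1":
    # emit the parts between non-alphanumeric characters, plus the original token.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    return parts if parts == [token] else parts + [token]

def edge_ngrams(token, min_gram=2, max_gram=15):
    # Front-edge n-grams. A token shorter than min_gram yields no grams at all.
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def index_analyze(text):
    # Index chain: whitespace tokens, word-delimiter parts, lowercase, edge n-grams.
    terms = []
    for token in whitespace_tokenize(text):
        for part in word_delimiter(token):
            terms.extend(edge_ngrams(part.lower()))
    return terms

def query_analyze(text):
    # Query chain: same as the index chain but without the n-gram step.
    return [part.lower()
            for token in whitespace_tokenize(text)
            for part in word_delimiter(token)]

indexed = set(index_analyze("The new e-tron"))
for query in ("tron", "e-tron"):
    print(query, [(term, term in indexed) for term in query_analyze(query)])
# tron [('tron', True)]
# e-tron [('e', False), ('tron', True), ('e-tron', True)]
```

The preserved original "e-tron" is now present in the index, so the hyphenated query has a term to match, and "tron" keeps matching as before. The lone "e" is still not indexed because of minGramSize="2"; how Solr combines it with the other query terms depends on the query parser settings.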
Source: https://stackoverflow.com/questions/17225344/rails-sunspot-solr-words-with-hyphen