I'm developing a set of synonyms that includes some multi-word expressions, such as:
black berry => blackberry
At the analysis stage, using the /admin/analysis.jsp tool, I can see that the results are correct.
A query such as "quiero una black berry" returns the following sequence:
The org.apache.solr.analysis.StandardTokenizerFactory {luceneMatchVersion=LUCENE_36}:
position     1           2           3           4
term text    quiero      una         black       berry
startOffset  0           7           11          17
endOffset    6           10          16          22
type         <ALPHANUM>  <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
The org.apache.solr.analysis.SynonymFilterFactory {synonyms=lang/synonyms_es.txt, expand=false, ignoreCase=true, luceneMatchVersion=LUCENE_36}:
position     1           2           3
term text    quiero      una         blackberry
type         <ALPHANUM>  <ALPHANUM>  SYNONYM
startOffset  0           7           11
endOffset    6           10          22
However, when I run this sentence as a "real" query through the request handler (an evolution of the edismax handler), the tokens "black" and "berry" are not replaced by "blackberry".
I've seen here that you can solve this situation by modifying the FieldQParser plugin.
Anyway, since that post was made almost 3 years ago, I'd like to know whether there is some way of solving this problem inside Solr, without having to extend a plugin.
Thanks.
Based on this link, you should search for "black berry" with quotes, because without quotes the query is parsed as an OR query, i.e. black OR berry.
In Solr 6.5.0 you can enable query-time multi-term synonyms by setting the parameter below.
From the documentation:
The sow Parameter
Split on whitespace: if set to false, whitespace-separated term sequences will be provided to text analysis in one shot, enabling proper function of analysis filters that operate over term sequences, e.g. multi-word synonyms and shingles. Defaults to true: text analysis is invoked separately for each individual whitespace-separated term.
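The difference the sow parameter makes can be illustrated with a toy model. The sketch below is not Solr's implementation: `apply_synonyms`, `analyze`, and `parse_query` are hypothetical helpers that only imitate the behavior described in the documentation, assuming a single rule equivalent to "black berry => blackberry".

```python
# Toy model only: this imitates the documented behavior, it is not Solr code.
# Assumed rule, mirroring "black berry => blackberry".
SYNONYMS = {("black", "berry"): ["blackberry"]}


def apply_synonyms(tokens, synonyms=SYNONYMS):
    """Replace any token sequence that matches a rule, scanning left to right."""
    out = []
    i = 0
    while i < len(tokens):
        for source, target in synonyms.items():
            if tuple(tokens[i:i + len(source)]) == source:
                out.extend(target)
                i += len(source)
                break
        else:
            # No rule matched at this position: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    """Stand-in for the analysis chain: lowercase, split on whitespace, apply synonyms."""
    return apply_synonyms(text.lower().split())


def parse_query(query, sow):
    """sow=True analyzes each whitespace-separated term on its own;
    sow=False hands the whole string to analysis in one shot."""
    if sow:
        terms = []
        for chunk in query.split():
            terms.extend(analyze(chunk))
        return terms
    return analyze(query)


print(parse_query("quiero una black berry", sow=True))
# ['quiero', 'una', 'black', 'berry']
print(parse_query("quiero una black berry", sow=False))
# ['quiero', 'una', 'blackberry']
```

With sow=True the synonym step never sees "black" and "berry" side by side, so the two-word rule cannot match. That is consistent with the analysis page showing the replacement while the real query does not.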
[synonym.txt]
black berry => blackberry
[Example]
q=black berry
&sow=false
&debug=query
[Debug-Response]
<lst name="debug">
<str name="rawquerystring">black berry</str>
<str name="querystring">black berry</str>
<str name="parsedquery">_text_:blackberry</str>
<str name="parsedquery_toString">_text_:blackberry</str>
<str name="QParser">LuceneQParser</str>
</lst>
As the debug response shows, I searched for black berry, but the synonym filter factory mapped it to the word I specified in synonym.txt.
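For reference, the request above can be assembled programmatically. This is a minimal sketch using only Python's standard library. The host, port, and core name (`localhost:8983`, `mycore`) and the helper `build_select_url` are assumptions for illustration, not taken from the original setup.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, sow=False, debug=True):
    """Build a select URL carrying the q, sow and (optionally) debug parameters."""
    params = [("q", query), ("sow", "true" if sow else "false")]
    if debug:
        params.append(("debug", "query"))
    # urlencode escapes the space in the query as "+".
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint: adjust host, port and core name to your installation.
url = build_select_url("http://localhost:8983/solr/mycore/select", "black berry")
print(url)
# http://localhost:8983/solr/mycore/select?q=black+berry&sow=false&debug=query
```

Keeping sow as an explicit argument makes it easy to issue the same query with both settings and compare the parsedquery entries in the debug output.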
Source: https://stackoverflow.com/questions/11544216/solr-multi-word-synonyms