For example I have synonyms laptop,netbook,notebook in index_synonyms.txt
When user search for netbook I want to boost original text more then expanded by synonyms? Is
As far as I know, there is no way to do this with the existing SynonymFilterFactory. But following is a trick you can use to get this behavior.
Let's say your field is called title
. Create another field which is a copy of this, say title_synonyms
. Now ensure that SynonymFilterFactory is used as an analyzer only for title_synonyms
(you can do this by using different field types for the two fields — say text
and text_synonyms
). Search in both these fields but give higher boost to title
than title_synonyms
.
Here are sample field type definitions:
<fieldType name="text" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="text_synonyms" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms_query.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
And here are sample field definitions:
<field name="title" type="text" stored="false"
required="true" multiValued="true"/>
<field name="title_synonyms" type="text_synonyms" stored="false"
required="true" multiValued="true"/>
Copy title
field to title_synonyms
:
<copyField source="title" dest="title_synonyms"/>
If you are using dismax
, you can give different boosts to these fields like so:
<str name="qf">title^10 title_synonyms^1</str>