Is there way to boost original term more while using Solr synonyms?

烂漫一生 提交于 2019-12-03 12:16:48

问题


For example I have synonyms laptop,netbook,notebook in index_synonyms.txt

When user search for netbook I want to boost original text more then expanded by synonyms? Is there way to specify this in SynonymFilterFactory? For example use original term twice so his TF will be bigger


回答1:


As far as I know, there is no way to do this with the existing SynonymFilterFactory. But following is a trick you can use to get this behavior.

Let's say your field is called title. Create another field which is a copy of this, say title_synonyms. Now ensure that SynonymFilterFactory is used as an analyzer only for title_synonyms (you can do this by using different field types for the two fields — say text and text_synonyms). Search in both these fields but give higher boost to title than title_synonyms.

Here are sample field type definitions:

    <fieldType name="text" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
    </fieldType>

    <fieldType name="text_synonyms" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms_query.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
    </fieldType>

And here are sample field definitions:

    <field name="title" type="text" stored="false"
           required="true" multiValued="true"/>
    <field name="title_synonyms" type="text_synonyms" stored="false"
           required="true" multiValued="true"/>

Copy title field to title_synonyms:

<copyField source="title" dest="title_synonyms"/>

If you are using dismax, you can give different boosts to these fields like so:

    <str name="qf">title^10 title_synonyms^1</str>


来源:https://stackoverflow.com/questions/10660632/is-there-way-to-boost-original-term-more-while-using-solr-synonyms

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!