Elasticsearch not returning singular/plural matches

名媛妹妹 2021-02-02 03:01

I am using a PHP library for Elasticsearch to index and search documents on my website. This is the code for creating the index:

curl -XPUT \'http://localhost:9200/         


        
3 Answers
  • 2021-02-02 03:19

    The 'porter_stem' filter stems aggressively, so the 'minimal_english' stemmer (a language option of the 'stemmer' token filter) may suit you better. 'porter_stem' reduces many related words to the same token:

    a search for 'Test' also matches 'Tests', 'Testing', and other derived forms.

    'minimal_english' only strips plurals, so 'Test' matches just 'Test' and 'Tests'.
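
    A minimal sketch of how that could be configured, assuming a custom analyzer built on the 'stemmer' token filter. The names 'index_name', 'plural_stemmer', and 'plural_stem' are placeholders, not taken from the question. The Content-Type header is required by newer Elasticsearch releases and can be dropped on older ones.

    ```shell
    # Create an index whose custom analyzer only folds plurals into singulars.
    # "plural_stemmer" and "plural_stem" are hypothetical names chosen for this sketch.
    curl -XPUT 'http://localhost:9200/index_name' -H 'Content-Type: application/json' -d '{
        "settings": {
            "analysis": {
                "filter": {
                    "plural_stemmer": {
                        "type": "stemmer",
                        "language": "minimal_english"
                    }
                },
                "analyzer": {
                    "plural_stem": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "plural_stemmer"]
                    }
                }
            }
        }
    }'
    ```

    The analyzer then has to be referenced from the field mapping (for example "analyzer": "plural_stem") in the same way the built-in analyzers are.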

  • 2021-02-02 03:21

    The default Elasticsearch analyzer doesn't do stemming, which is what you need to match plural and singular forms. Try the Snowball analyzer on your text fields to see whether it works better for your use case:

    curl -XPUT 'http://localhost:9200/test' -d '{
        "settings" : {
            "index" : {
                "number_of_shards" : 1,
                "number_of_replicas" : 1
            }
        },
        "mappings" : {
            "page" : {
                "properties" : {
                    "mytextfield": { "type": "string",  "analyzer": "snowball", "store": "yes"}
                }
            }
        }
    }'
    
  • 2021-02-02 03:33

    Somehow snowball is not working for me; I am getting the errors I mentioned in my comment on @imotov's answer. I used the Porter stem filter instead and it worked perfectly. This is the config I used:

    curl -XPUT localhost:9200/index_name -d '
    {
        "settings" : {
            "analysis" : {
                "analyzer" : {
                    "stem" : {
                        "tokenizer" : "standard",
                        "filter" : ["standard", "lowercase", "stop", "porter_stem"]
                    }
                }
            }
        },
        "mappings" : {
            "index_type_1" : {
                "dynamic" : true,
                "properties" : {
                    "field1" : {
                        "type" : "string",
                        "analyzer" : "stem"
                    },
                    "field2" : {
                        "type" : "string",
                        "analyzer" : "stem"
                    }
                }
            }
        }
    }'
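
    To check that the 'stem' analyzer behaves as expected, the _analyze API can be called against the index. This is only a sketch: the sample text is made up, and the JSON request body shown here is the form newer Elasticsearch releases accept, while older releases took 'analyzer' and 'text' as query-string parameters instead.

    ```shell
    # Ask Elasticsearch which tokens the "stem" analyzer produces for some sample text.
    # The sample text is an arbitrary example, not from the original question.
    curl -XGET 'http://localhost:9200/index_name/_analyze' -H 'Content-Type: application/json' -d '{
        "analyzer": "stem",
        "text": "Tests testing"
    }'
    ```

    If the analyzer is applied correctly, both words should come back as the same token, which is why singular and plural queries then match the same documents.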
    