ElasticSearch - fuzzyQuery Java API responses are almost the same as matchQuery

Submitted by 断了今生、忘了曾经 on 2019-12-20 07:15:03

Question


I am trying to fetch documents from Elasticsearch using matchQuery and fuzzyQuery, but I get almost the same hit count from both APIs.

For example:

Scenario 1 (with matchQuery)

When I search for valve using matchQuery, I get a count of 36 with the matchQuery API call below:

QueryBuilder qb = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("catalog_value", "valve"))
        .filter(QueryBuilders.termQuery("locale", "en_US"));

If I search for valves instead, I get a count of only 14.

Scenario 2 (with fuzzyQuery)

When I search for valve using fuzzyQuery, I get a count of 34 with the fuzzyQuery API call below:

QueryBuilder qb = QueryBuilders.boolQuery()
        .must(QueryBuilders.fuzzyQuery("catalog_value", "valve"))
        .filter(QueryBuilders.termQuery("locale", "en_US"));

If I search for valves here as well, I get a count of only 14. I would expect valve and valves to return the same count when using fuzzyQuery.

Does anyone have any insight into how the fuzzyQuery API behaves here?

I am using Elasticsearch 6.2.3.

A sample document that matches valves:

    {
      "_index": "catalog",
      "_type": "doc",
      "_id": "517yxmQB1-MO2Tblt7C3",
      "_score": 1.0,
      "_source": {
        "catalog_name": "family451",
        "catalog_value": "These control valves range with actuation choices featuring.",
        "catalog_id": 41065,
        "@version": "1",
        "locale": "en_US",
        "@timestamp": "2018-07-23T11:42:29.751Z"
      }
    }

Here are my mapping details:

    PUT catalog
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "value_analyzer": {
              "type": "custom",
              "tokenizer": "whitespace",
              "char_filter": [
                "html_strip"
              ],
              "filter": ["lowercase", "asciifolding"]
            }
          }
        }
      },
      "mappings": {
        "doc": {
          "properties": {
            "catalog_value": {
              "type": "text",
              "analyzer": "value_analyzer"
            },
            "catalog_id": {
              "type": "long"
            },
            "catalog_name": {
              "type": "keyword"
            },
            "locale": {
              "type": "keyword"
            }
          }
        }
      }
    }
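
The value_analyzer in this mapping uses the whitespace tokenizer, which splits on whitespace only, so punctuation stays attached to the neighbouring word. A rough stand-in in plain Java may make the indexed terms easier to see. This is only a sketch: the class and method names are hypothetical, it is not Elasticsearch code, and html_strip and asciifolding are left out on the assumption that the sample text contains no HTML or accented characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WhitespaceAnalyzerSketch {

    // Approximates "whitespace" tokenizer + "lowercase" filter:
    // split on runs of whitespace, then lowercase each piece.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // catalog_value of the sample document from the question
        String value = "These control valves range with actuation choices featuring.";
        System.out.println(analyze(value));
    }
}
```

In this sketch the last term comes out as `featuring.` with the period attached. By the same rule, a value ending in `valves.` or containing `valves,` would be indexed with the punctuation included, which matters for term-level queries such as fuzzyQuery that compare against the indexed terms.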

Update

I tested the following search terms and got the counts below.

valve    --> 17
valves   --> 7
valvess  --> 7
valvesss --> 8

val      --> 17
valv     --> 17

Update 2

With the modified query below, I get the following result counts.

QueryBuilder qb1 = QueryBuilders.boolQuery()
            .must(QueryBuilders.fuzzyQuery("catalog_attr_value", "valve").boost(1.0f).prefixLength(0).fuzziness(Fuzziness.ONE).transpositions(true))
            .filter(QueryBuilders.termQuery("locale", "en_US"));


valve    --> 17
valves   --> 7
valvess  --> 8
valvesss --> 0

val      --> 17
valv     --> 17 
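
Since the query above sets Fuzziness.ONE, it can help to see how many single-character edits separate each search term from the terms valve and valves. The sketch below computes plain Levenshtein distance in Java. The class and method names are hypothetical, it ignores the transposition rule that transpositions(true) enables (which makes no difference for these particular words), and it is an illustration rather than how Elasticsearch evaluates the query internally.

```java
public class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance
    // (insert, delete, substitute; no transpositions).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String[] queries = {"valve", "valves", "valvess", "valvesss"};
        for (String q : queries) {
            System.out.println(q + " -> valve: " + levenshtein(q, "valve")
                    + ", valves: " + levenshtein(q, "valves"));
        }
    }
}
```

Under this measure valvess is one edit away from valves but two away from valve, and valvesss is at least two edits away from both, which would be consistent with valvesss returning 0 hits at Fuzziness.ONE.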

UPDATE 3

With the fuzzyQuery below, I am not getting the proper result counts for valve and Valves.

QueryBuilder qb1 = QueryBuilders.boolQuery()
                    .must(QueryBuilders.fuzzyQuery("product_attr_value", keyword).boost(1.0f).prefixLength(0).fuzziness(Fuzziness.AUTO).transpositions(true));

For the valve and valves searches I got 10 results each (I am limiting the result count to 10), and none of the documents appear in both result sets.

For example, below are the IDs I got for the valve and valves searches.

valve :

17194
219575
219574
280638
282288
298177
295626
4112
219069
219381

Sample response for valve search

"hits":[  
     {  
        "_index":"product_offering",
        "_type":"doc",
        "_id":"kE89_2QBfp1CuLwJxs-W",
        "_score":3.0630755,
        "_source":{  
           "product_status":"ACT",
           "@timestamp":"2018-08-03T10:03:12.194Z",
           "product_code":"M9000-560",
           "std_delivery_time":0,
           "product_attr_type":null,
           "label":"long_description",
           "product_attr_value":"Ball Valve Linkage Kit for applying M9203 and M9208 Series Actuators to VG1000 Series Valves",
           "is_product_visible":1,
           "product_name":"product17167",
           "product_group":"single",
           "product_id":17194,
           "@version":"1",
           "catalog_id":264,
           "locale":"en_US",
           "expiration_date":null,
           "product_type":"accessory",
           "min_order_quantity":1,
           "product_attr_name":"product17167_long_description"
        }
     },

Valves :

15680
15572
15599
15615
15674
15650
15526
15543
6869
6868

Sample response for the valves search (this document is not returned when I search for valve)

"hits":[  
     {  
        "_index":"product_offering",
        "_type":"doc",
        "_id":"A089_2QBfp1CuLwJ-d9C",
        "_score":3.8922772,
        "_source":{  
           "product_status":"DSC",
           "@timestamp":"2018-08-03T10:03:26.343Z",
           "product_code":"M9116-GDC-1N2",
           "std_delivery_time":0,
           "product_attr_type":null,
           "label":"long_description",
           "product_attr_value":"These electric actuators have been specially designed for the motorised operation of various types of water valves and fittings such as mixing valves, butterfly valves and ball valves.",
           "is_product_visible":0,
           "product_name":"product4459",
           "product_group":"single",
           "product_id":15680,
           "@version":"1",
           "catalog_id":319,
           "locale":"en_US",
           "expiration_date":null,
           "product_type":"legacy",
           "min_order_quantity":1,
           "product_attr_name":"product4459_long_description"
        }
     },
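
The Fuzziness.AUTO setting used in the Update 3 query picks the allowed edit distance from the length of the search term. As described in the Elasticsearch documentation, terms of 0 to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, and longer terms allow two. A minimal sketch of that rule in Java follows; the class and method names are hypothetical and this is not Elasticsearch's own code.

```java
public class AutoFuzzinessSketch {

    // Maximum edits allowed by AUTO fuzziness for a term of this length:
    // 0..2 chars -> 0, 3..5 chars -> 1, more than 5 chars -> 2.
    static int autoMaxEdits(String term) {
        int length = term.codePointCount(0, term.length());
        if (length <= 2) {
            return 0;
        }
        if (length <= 5) {
            return 1;
        }
        return 2;
    }

    public static void main(String[] args) {
        for (String term : new String[] {"val", "valve", "valves"}) {
            System.out.println(term + " -> up to " + autoMaxEdits(term) + " edit(s)");
        }
    }
}
```

By this rule valve (5 characters) is searched with one allowed edit while valves (6 characters) is searched with two, so the two searches expand to different sets of candidate terms. That may be part of why the top 10 results differ, although scoring and the result limit also play a role.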

Source: https://stackoverflow.com/questions/51633904/elasticsearch-fuzzyquery-java-api-response-are-almost-same-as-matchquery
