Wildcard search or partial matching in Elasticsearch

广开言路 2021-01-17 01:22

I am trying to provide end users with type-as-they-go search, much like in SQL Server. I was able to implement an ES query for the given SQL scenario:



        
2 Answers
  • 2021-01-17 01:41

    The most efficient solution involves leveraging an ngram tokenizer in order to tokenize portions of your name field. For instance, if you have a name like peter tomson, the ngram tokenizer will tokenize and index it like this:

    • pe
    • pet
    • pete
    • peter
    • peter t
    • peter to
    • peter tom
    • peter toms
    • peter tomso
    • eter tomson
    • ter tomson
    • er tomson
    • r tomson
    • tomson
    • omson
    • mson
    • son
    • on

    So, when this has been indexed, searching for any of those tokens will retrieve your document with peter tomson in it.

    Let's create the index:

    PUT likequery
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_ngram_analyzer": {
              "tokenizer": "my_ngram_tokenizer"
            }
          },
          "tokenizer": {
            "my_ngram_tokenizer": {
              "type": "nGram",
              "min_gram": "2",
              "max_gram": "15"
            }
          }
        }
      },
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "string",
              "fields": {
                "search": {
                  "type": "string",
                  "analyzer": "my_ngram_analyzer"
                }
              }
            },
            "type": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
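
    To double-check what the analyzer produces, you can run the text through the _analyze API (a quick sanity check; the request-body form below works on recent versions, while older versions pass analyzer and text as URL parameters instead):

    POST likequery/_analyze
    {
      "analyzer": "my_ngram_analyzer",
      "text": "peter tomson"
    }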
    

    You'll then be able to search like this with a simple and very efficient term query:

    POST likequery/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "name.search": "peter tom"
              }
            }
          ],
          "must_not": [
            {
              "match": {
                "type": "xyz"
              }
            },
            {
              "match": {
                "type": "abc"
              }
            }
          ]
        }
      }
    }
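
    For the term query above to return a hit, a document with a matching name must of course be indexed first. A hypothetical example (the "employee" type value is just a placeholder, not something from the original question):

    # hypothetical sample document; "employee" is a placeholder type value
    PUT likequery/typename/1
    {
      "name": "peter tomson",
      "type": "employee"
    }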
    
  • 2021-01-17 01:53

    Well, my solution is not perfect and I am not sure about performance, so you should try it at your own risk :)

    This is the ES 5 version:

    PUT likequery
    {
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "text",
              "fields": {
                "raw": {
                  "type": "keyword"
                }
              }
            },
            "type": {
              "type": "text"
            }
          }
        }
      }
    }
    

    In ES 2.1, change "type": "keyword" to "type": "string" with "index": "not_analyzed", as shown in the sketch below.
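
    Applied to the mapping above, an ES 2.1 variant would look roughly like this (a sketch; the parent fields go back to "string" since "text" does not exist before 5.x):

    PUT likequery
    {
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "string",
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            },
            "type": {
              "type": "string"
            }
          }
        }
      }
    }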

    PUT likequery/typename/1
    {
      "name": "peter tomson"
    }
    
    PUT likequery/typename/2
    {
      "name": "igor tkachenko"
    }
    
    PUT likequery/typename/3
    {
      "name": "taras shevchenko"
    }
    

    The query is case sensitive:

    POST likequery/_search
    {
      "query": {
        "regexp": {
          "name.raw": ".*taras shev.*"
        }
      }
    }
    

    Response

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
          {
            "_index": "likequery",
            "_type": "typename",
            "_id": "3",
            "_score": 1,
            "fields": {
              "raw": [
                "taras shevchenko"
              ]
            }
          }
        ]
      }
    }
    

    PS: Once again, I am not sure about the performance of this query, since it will scan the terms rather than use the index.
