Wildcard search or partial matching in Elasticsearch

广开言路 2021-01-17 01:22

I am trying to provide end users with type-as-they-go search, much like in SQL Server. I was able to implement an ES query for the given SQL scenario:



        
2 Answers
  • 2021-01-17 01:41

    The most efficient solution involves leveraging an ngram tokenizer in order to tokenize portions of your name field. For instance, if you have a name like peter tomson, the ngram tokenizer will tokenize and index it like this:

    • pe
    • pet
    • pete
    • peter
    • peter t
    • peter to
    • peter tom
    • peter toms
    • peter tomso
    • eter tomson
    • ter tomson
    • er tomson
    • r tomson
    • tomson
    • omson
    • mson
    • son
    • on

    So, when this has been indexed, searching for any of those tokens will retrieve your document with peter tomson in it.

    Let's create the index:

    PUT likequery
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_ngram_analyzer": {
              "tokenizer": "my_ngram_tokenizer"
            }
          },
          "tokenizer": {
            "my_ngram_tokenizer": {
              "type": "nGram",
              "min_gram": "2",
              "max_gram": "15"
            }
          }
        }
      },
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "string",
              "fields": {
                "search": {
                  "type": "string",
                  "analyzer": "my_ngram_analyzer"
                }
              }
            },
            "type": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
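
    To double-check what the analyzer produces, you can run the text through the _analyze API (a quick sanity check; the request-body form below works on recent versions, while older versions pass analyzer and text as URL parameters instead):

    POST likequery/_analyze
    {
      "analyzer": "my_ngram_analyzer",
      "text": "peter tomson"
    }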
    

    You'll then be able to search like this with a simple and very efficient term query:

    POST likequery/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "name.search": "peter tom"
              }
            }
          ],
          "must_not": [
            {
              "match": {
                "type": "xyz"
              }
            },
            {
              "match": {
                "type": "abc"
              }
            }
          ]
        }
      }
    }
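
    For the term query above to return a hit, a document with a matching name must of course be indexed first. A hypothetical example (the "employee" type value is just a placeholder, not something from the original question):

    # hypothetical sample document; "employee" is a placeholder type value
    PUT likequery/typename/1
    {
      "name": "peter tomson",
      "type": "employee"
    }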
    
  • 2021-01-17 01:53

    Well, my solution is not perfect and I am not sure about performance, so you should try it at your own risk :)

    This is the ES 5 version:

    PUT likequery
    {
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "text",
              "fields": {
                "raw": {
                  "type": "keyword"
                }
              }
            },
            "type": {
              "type": "text"
            }
          }
        }
      }
    }
    

    In ES 2.1, change "type": "keyword" to "type": "string" with "index": "not_analyzed", as shown in the sketch below.
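
    Applied to the mapping above, an ES 2.1 variant would look roughly like this (a sketch; the parent fields go back to "string" since "text" does not exist before 5.x):

    PUT likequery
    {
      "mappings": {
        "typename": {
          "properties": {
            "name": {
              "type": "string",
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            },
            "type": {
              "type": "string"
            }
          }
        }
      }
    }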

    PUT likequery/typename/1
    {
      "name": "peter tomson"
    }
    
    PUT likequery/typename/2
    {
      "name": "igor tkachenko"
    }
    
    PUT likequery/typename/3
    {
      "name": "taras shevchenko"
    }
    

    The query is case sensitive:

    POST likequery/_search
    {
      "query": {
        "regexp": {
          "name.raw": ".*taras shev.*"
        }
      }
    }
    

    Response

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
          {
            "_index": "likequery",
            "_type": "typename",
            "_id": "3",
            "_score": 1,
            "fields": {
              "raw": [
                "taras shevchenko"
              ]
            }
          }
        ]
      }
    }
    

    PS: Once again, I am not sure about the performance of this query, since it will scan the terms rather than use the index.
