How to do a wildcard or regex match on _id in elasticsearch?

孤城傲影 2021-01-01 23:25

From the sample Elasticsearch data below, I want to apply a wildcard, say *.000ANT.*, on _id so as to fetch all docs whose _id contains 0

5 Answers
  • 2021-01-02 00:00

    You have two options here. The first is partial matching, which is most easily done by wrapping the search term in wildcards, as in the other answers. This works on not_analyzed fields and is case-sensitive.

    POST /my_index/my_type/_search
    {
      "query": {
        "wildcard": {
          "_id": {
            "value": "*000ANT*"
          }
        }
      }
    }
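
To make the "case sensitive" remark concrete, here is a minimal Python sketch of how a wildcard pattern behaves against an unanalyzed term. It does not call Elasticsearch; it only models the semantics (`*` for any run of characters, `?` for exactly one, whole-term matching). The helper names and the sample `_id` values are made up for illustration.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a case-sensitive regex.

    '*' matches any run of characters (including none), '?' matches exactly one.
    Every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    # The pattern has to cover the whole term, not just part of it.
    return wildcard_to_regex(pattern).fullmatch(term) is not None

# Hypothetical _id values, invented for this example.
ids = ["AA.000ANT.BB", "aa.000ant.bb", "AA.111ANT.BB"]
matches = [i for i in ids if wildcard_match("*000ANT*", i)]
print(matches)
```

Only the upper-case id is returned, which is why the wildcards on both sides and the exact casing of the term both matter.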
    

    The second option is to use Elasticsearch analyzers and a proper mapping to describe the functionality you are looking for; you can read about those here.

    The basic premise is that you introduce an analyzer in your mapping. Its tokenizer breaks strings down into smaller tokens, which can then be matched directly. A simple query for "000ANT" on the tokenized _id field will then return all results containing that string.
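
The tokenizing idea can be sketched in plain Python, assuming an n-gram style tokenizer. This is only an illustration of the principle: the gram size of 6, the helper name `ngrams` and the sample ids are assumptions, and whether `_id` itself can carry such an analyzer depends on the Elasticsearch version.

```python
def ngrams(text, min_gram, max_gram):
    """Return every substring of text whose length lies in [min_gram, max_gram]."""
    tokens = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            tokens.add(text[start:start + size])
    return tokens

# Hypothetical ids, invented for this example.
ids = ["AA.000ANT.BB", "CC.000ANT.DD", "EE.111ANT.FF"]

# Build a tiny inverted index: token -> set of ids that contain it.
index = {}
for doc_id in ids:
    for token in ngrams(doc_id, 6, 6):
        index.setdefault(token, set()).add(doc_id)

# Searching is now an exact token lookup, with no wildcard involved.
hits = sorted(index.get("000ANT", set()))
print(hits)
```

The cost moves from query time to index time: more tokens are stored per document, but the lookup itself is a plain term match.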

  • 2021-01-02 00:01

    You can use a wildcard query like this, though it's worth noting that it is not advised to start a wildcard term with * as performance will suffer.

    {
      "query": {
        "wildcard": {
          "_uid": "*000ANT*"
        }
      }
    }
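
The warning about a leading `*` can be illustrated with a simplified model of a sorted term list. This is a sketch only, not how Lucene stores terms internally; the function names and sample terms are hypothetical. A known prefix lets the search jump to a starting point, while a leading wildcard leaves nothing to jump to.

```python
import bisect

def prefix_matches(sorted_terms, prefix):
    """Find terms starting with prefix; binary search picks the starting point."""
    start = bisect.bisect_left(sorted_terms, prefix)
    hits, examined = [], 0
    for term in sorted_terms[start:]:
        examined += 1
        if not term.startswith(prefix):
            break  # sorted order: no later term can match either
        hits.append(term)
    return hits, examined

def contains_matches(sorted_terms, needle):
    """A pattern like *needle* has no usable prefix, so every term is examined."""
    hits = [term for term in sorted_terms if needle in term]
    return hits, len(sorted_terms)

# Hypothetical terms, invented for this example.
terms = sorted(["AA.000ANT.BB", "AB.000ANT.CC", "ZZ.111.YY"])
print(prefix_matches(terms, "AB"))
print(contains_matches(terms, "000ANT"))
```

With three terms the difference is negligible; with millions of ids, the full scan in the second function is what makes a leading wildcard slow.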
    

    Also note that if the wildcard term you're searching for also matches the type name of your documents, using _uid will not work as intended, because _uid is simply the concatenation of the type and the id: type#id
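
A small Python sketch of that pitfall, assuming only the `type#id` layout described above. The type name `agents`, the id and the helper names are made up for illustration.

```python
def make_uid(doc_type, doc_id):
    # _uid as described above: the type and the id joined by '#'.
    return f"{doc_type}#{doc_id}"

def split_uid(uid):
    # Split on the first '#': everything before it is the type.
    doc_type, _, doc_id = uid.partition("#")
    return doc_type, doc_id

# Hypothetical document, invented for this example.
uid = make_uid("agents", "AA.123.BB")
needle = "agent"

matches_uid = needle in uid               # substring test on the whole _uid
matches_id = needle in split_uid(uid)[1]  # substring test on the id part only
print(matches_uid, matches_id)
```

The search term hits the `_uid` through the type name even though the id does not contain it, so every document of that type would be returned.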

  • 2021-01-02 00:14

    This is just an extension of Andrei Stefan's answer:

    {
      "query": {
        "script": {
          "script": "doc['_id'][0].indexOf('000ANT') > -1"
        }
      }
    }
    

    Note: I do not know the performance impact of such a query, but it is most probably a bad idea, since the script has to be evaluated against each document. Use with caution and avoid if possible.

  • 2021-01-02 00:19

    Configure your mapping so that the _id is indexed:

    {
      "mappings": {
        "agents": {
          "_id": {
            "index": "not_analyzed"
          }
        }
      }
    }
    

    And use a query_string to search for it:

    {
      "query": {
        "query_string": {
          "query": "_id:(*000ANT*)",
          "lowercase_expanded_terms": false
        }
      }
    }
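
Why `lowercase_expanded_terms` is set to false can be shown with a small Python model. It assumes that the option lowercases wildcard terms when left enabled, which is what makes an upper-case id unreachable on a not_analyzed field. The function names and the sample id are hypothetical, and the matching is reduced to a substring test.

```python
def expand_wildcard_term(term, lowercase_expanded_terms=True):
    """Model of the flag: when enabled, the wildcard term is lowercased first."""
    return term.lower() if lowercase_expanded_terms else term

def matches(term, doc_id):
    # Simplification: drop the surrounding '*' and test case-sensitively.
    return term.strip("*") in doc_id

# Hypothetical id stored as-is (not analyzed, so its case is preserved).
doc_id = "AA.000ANT.BB"

with_lowercasing = matches(expand_wildcard_term("*000ANT*", True), doc_id)
without_lowercasing = matches(expand_wildcard_term("*000ANT*", False), doc_id)
print(with_lowercasing, without_lowercasing)
```

With lowercasing the term becomes `*000ant*` and misses the stored id; with the flag off the term is compared unchanged and matches.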
    

    Or like this (with scripts and still querying only the _id):

    {
      "query": {
        "filtered": {
          "filter": {
            "script": {
              "script": "org.elasticsearch.index.mapper.Uid.splitUidIntoTypeAndId(new org.apache.lucene.util.BytesRef(doc['_uid'].value))[1].utf8ToString().contains('000ANT')"
            }
          }
        }
      }
    }
    
  • 2021-01-02 00:22

    Try a regexp query on _uid:

    {
       "filter": {
          "bool": {
             "must": [
                {
                   "regexp": {
                      "_uid": {
                         "value": ".*000ANT.*"
                      }
                   }
                }
             ]
          }
       }
    }
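
The `.*` on both sides of the pattern is needed because regexp queries match against the whole term rather than searching inside it. The Python sketch below models only that anchoring with `re.fullmatch`; Python's regex syntax is not the same as the one Elasticsearch uses, and the helper name and sample `_uid` are invented for illustration.

```python
import re

def anchored_regexp_match(pattern, term):
    """Anchored matching: the pattern must cover the entire term."""
    return re.fullmatch(pattern, term) is not None

# Hypothetical _uid value in the type#id layout.
uid = "agents#AA.000ANT.BB"

bare = anchored_regexp_match("000ANT", uid)         # pattern covers only a part
wrapped = anchored_regexp_match(".*000ANT.*", uid)  # pattern covers everything
print(bare, wrapped)
```

The bare pattern fails because it does not account for the characters before and after `000ANT`, while the wrapped one succeeds.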
    