ElasticSearch not returning results for terms query against string property

盖世英雄少女心 2020-12-23 17:27

I have the following indexed document:

{
    "visitor": {
        "id": 
    }
}

The mapping for the document

2 Answers
  • 2020-12-23 17:50

    Unless you map the visitor.id field as NOT analyzed, every string field is analyzed by default.

    This means that "ABC" will be indexed as "abc" (lower case).

    A term query or term filter does not analyze its input, so you have to pass it the string in LOWER CASE.

    I hope the query below will work. ^^

    {
        "query": {
            "filtered": {
                "query": {
                    "match_all": {}
                },
                "filter": {
                    "term": { "visitor.id": "abc" }
                }
            }
        }
    }
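
If the id arrives from application code in mixed case, the lowercasing can be done while building the request body. The following is a minimal sketch, assuming the field was indexed with the default analyzer and the id is a single alphanumeric token; `build_term_filter_query` is a hypothetical helper name, and the `filtered` syntax is simply the one used in this answer.

```python
import json


def build_term_filter_query(field, value):
    """Build a filtered query body whose term filter uses the lowercased value."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                # term filters are not analyzed, so lowercase the value ourselves
                "filter": {"term": {field: value.lower()}},
            }
        }
    }


body = build_term_filter_query("visitor.id", "ABC")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the body of a search request; only the lowercasing step is the point here.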
    
  • 2020-12-23 17:57

    You need to understand how Elasticsearch's analyzers work. An analyzer first tokenizes the input (splits it into a bunch of tokens, for example on whitespace) and then applies a set of token filters (which drop tokens you don't want, like stop words, or modify tokens, like the lowercase token filter, which converts everything to lower case).

    Analysis is performed at two very specific times: during indexing (when you put stuff into Elasticsearch) and, depending on your query, during searching (on the string you're searching for).

    That said, the default analyzer is the standard analyzer which consists of a standard tokenizer, standard token filter (to clean up tokens from the standard tokenizer), lowercase token filter, and stop words token filter.

    As an example, when you save the string "I love Vincent's pie!" into Elasticsearch, and you're using the default standard analyzer, you're actually storing "i", "love", "vincent", "s", "pie". Then, when you attempt to search for "Vincent's" with a term query (which is not analyzed), you will not find anything because "Vincent's" is not one of those tokens! However, if you search for "Vincent's" using a match query (which is analyzed), you will find "I love Vincent's pie!" because "vincent" and "s" both find matches.
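
The difference between the two query types can be imitated in a few lines of Python. This is only a toy model, not Elasticsearch's standard analyzer: `toy_analyze`, `term_search` and `match_search` are hypothetical names, and the real tokenizer may treat punctuation such as apostrophes differently. It is written to reproduce the token list given above.

```python
import re


def toy_analyze(text):
    """Rough stand-in for an analyzer: split on non-alphanumerics, then lowercase."""
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", text) if tok]


# "Indexing": the document is stored as its set of analyzed tokens.
indexed_tokens = set(toy_analyze("I love Vincent's pie!"))


def term_search(term):
    """Like a term query: the input is compared to the stored tokens as-is."""
    return term in indexed_tokens


def match_search(text):
    """Like a match query: the input is analyzed first, then any token may hit."""
    return any(tok in indexed_tokens for tok in toy_analyze(text))


print(sorted(indexed_tokens))
print(term_search("Vincent's"), match_search("Vincent's"))
```

In this model the term lookup for "Vincent's" misses while the match lookup hits, which is the same asymmetry described above.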

    The bottom line is to do one of the following:

    1. Use an analyzed query, such as match, when searching natural language strings.
    2. Set up the analyzers to match your needs. You could set up a custom analyzer that uses a whitespace tokenizer, a letter tokenizer, or (if you want to get complicated) a pattern tokenizer, together with whatever filters your heart desires. It depends on your use case, but if you're dealing with natural language sentences I don't recommend this, because the standard tokenizer was built for natural language searching.
    3. Set the field up to not use an analyzer at all, with the following mapping, which should suit your needs:

      "visitor": {
          "properties": {
              "id": {
                  "type": "string",
                  "index": "not_analyzed"
              }
          }
      }
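
What option 3 changes can be pictured with another toy model. This is a sketch under the assumption that a not_analyzed field stores the value verbatim as one token, while the analyzed case is reduced here to plain lowercasing of a single-word id; `stored_tokens` and `term_hit` are hypothetical names.

```python
def stored_tokens(value, analyzed):
    """Toy model of what ends up in the index for a single-word id."""
    if analyzed:
        # assumption: default analysis of a one-word id amounts to lowercasing
        return {value.lower()}
    # not_analyzed: the exact string is stored as a single token
    return {value}


def term_hit(term, value, analyzed):
    """A term lookup compares the query term to the stored tokens unchanged."""
    return term in stored_tokens(value, analyzed)


print(term_hit("ABC", "ABC", analyzed=True))    # analyzed field, original case
print(term_hit("ABC", "ABC", analyzed=False))   # not_analyzed field, original case
print(term_hit("abc", "ABC", analyzed=False))   # not_analyzed field, lower case
```

With the not_analyzed mapping the term lookup becomes an exact, case-sensitive comparison, so the id has to be queried exactly as it was indexed.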
      

    See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html for further reading.
