How to not-analyze in ElasticSearch?

孤人 提交于 2019-11-27 03:59:09

问题


I've got a field in an ElasticSearch field which I do not want to have analyzed, i. e. it should be stored and compared verbatim. The values will contain letters, numbers, whitespace, dashes, slashes and maybe other characters.

If I do not give an analyzer in my mapping for this field, the default still uses a tokenizer which hacks my verbatim string into chunks of words. I don't want that.

Is there a super simple analyzer which, basically, does not analyze? Or is there a different way of denoting that this field shall not be analyzed?

I only create the index, I don't do anything else. I can use analyzers like "english" for other fields which seems to be built-in names for pre-configured analyzers. Is there a list of other names? Maybe there's one fitting my needs (namely doing nothing with the input).

This is my mapping currently:

{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string" }
    }
  }
}

my_field1 is language-dependent; this seems to work. my_field2 shall be verbatim. I'd like to give an analyzer there which simply does not do anything.

A sample value for my_field2 would be "B45c 14/04".


回答1:


"my_field2": {
    "properties": {
        "title": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}

Check you here, https://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-core-types.html, for further info.




回答2:


This is no longer true due to the removal of the string (replaced by keyword and ``) type as described here. Instead you should use "index": true | false.

For Example OLD:

{
  "foo": {
    "type" "string",
    "index": "not_analyzed"
  }
}

becomes NEW:

{
  "foo": {
    "type" "keyword",
    "index": true
  }
}

This means the field is indexed but as it is typed as keyword not analyzed implicitly.




回答3:


keyword analyser can be also used.

// don't actually use this, use "index": "not_analyzed" instead
{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string", "analyzer": "keyword" }
    }
  }
}

As noted here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-analyzer.html, it makes more sense to mark those fields as not_analyzed.

But keyword analyzer can be useful when it is set by default for whole index.

UPDATE: As it said in comments, string is no longer supported in 5.X



来源:https://stackoverflow.com/questions/18235996/how-to-not-analyze-in-elasticsearch

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!