Elasticsearch dynamic fields in hierarchy of JSON

Submitted by 萝らか妹 on 2021-02-08 08:24:42

Question


I'm going to have a JSON document that contains nesting. For example:

{
    "userid_int" : <integer>,
    "shoes" : {
         "size_int" : <integer>,
         "brand_str" : <string>,
         "addeddate_dt" : <date>
    },
    "shirt" : {
         "size_int" : <integer>,
         "brand_str" : <string>,
         "addeddate_dt" : <date>,
         "color_str" : <string>
    },
    ...
}

There is no limit on what the nested fields could be. For example, I may want a new key "pyjamas" for a particular document, but this is not known up front when the index is created.

All I want to know is whether dynamic field mapping applies across the whole JSON document, including inside nested objects at any level.

Would this mapping work for all _int/_str/etc. fields inside the nested objects?

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "_int_as_integers": {
          "match":   "*_int",
          "mapping": {
            "type": "integer"
          }
        },
        "_str_as_strings": {
          "match":   "*_str",
          "mapping": {
            "type": "string"
          }
        },
        ...
      }
    ]
  }
}

Answer 1:


It depends on what you mean by nested. There is what I call trivial nested-ness, where you simply have objects within objects, and there is syntactic nested-ness, which uses the dedicated nested field type. The most important property of that type is that the nested sub-objects are treated as completely separate entities. The caveat is that queries against them then need to use a dedicated syntax (the nested query).

Two remarks:

  • your dynamic template was almost correct; you just need to make sure the individual templates are separate object definitions inside the dynamic_templates array (see below)
  • the string type is no longer available; use text instead (or keyword for exact-value fields)

  1. Trivial nested-ness
PUT trivial_nestedness
{
  "mappings": {
    "dynamic_templates": [
      {
        "int_as_integers": {
          "path_match": "*.*_int",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "str_as_strings": {
          "path_match": "*.*_str",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}

POST trivial_nestedness/_doc
{
    "userid_int" : 123,
    "shoes" : {
         "size_int" : 456,
         "brand_str" : "str",
         "addeddate_dt" : "2020/04/25 00:00:00"
    },
    "shirt" : {
         "size_int" : 123939,
         "brand_str" : "str",
         "addeddate_dt" : "2020/04/25 00:00:00",
         "color_str" : "red"
    }
}

then

GET trivial_nestedness/_mapping

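yielding the mapping below. Note that userid_int is mapped as long rather than integer: the path_match pattern *.*_int requires a dot in the field path, so top-level fields fall through to the default dynamic mapping.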

{
  "trivial_nestedness" : {
    "mappings" : {
      "dynamic_templates" : [
       ...
      ],
      "properties" : {
        "shirt" : {
          "properties" : {
            "addeddate_dt" : {
              "type" : "date",
              "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
            },
            "brand_str" : {
              "type" : "text"
            },
            "color_str" : {
              "type" : "text"
            },
            "size_int" : {
              "type" : "integer"
            }
          }
        },
        "shoes" : {
          "properties" : {
            "addeddate_dt" : {
              "type" : "date",
              "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
            },
            "brand_str" : {
              "type" : "text"
            },
            "size_int" : {
              "type" : "integer"
            }
          }
        },
        "userid_int" : {
          "type" : "long"
        }
      }
    }
  }
}
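
With plain objects like these, sub-fields are addressed by their dotted path in ordinary queries, with no special query type needed. A minimal sketch against the document indexed above (the choice of a bool/filter query is only an illustration, not part of the original answer):

```
GET trivial_nestedness/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "shirt.size_int": 123939 } },
        { "match": { "shirt.color_str": "red" } }
      ]
    }
  }
}
```

This works well as long as each of shoes, shirt, ... is a single object. If they were arrays of objects, the sub-fields would be flattened and conditions could match across different array items, which is the situation the nested type addresses.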

  2. Syntactic nested-ness
PUT syntactic_nestedness
{
  "mappings": {
    "dynamic_templates": [
      {
        "possibly_nested_obj": {
          "match": "*_nested",
          "mapping": {
            "type": "nested"
          }
        }
      },
      {
        "int_as_integers": {
          "path_match": "*_nested.*_int",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "str_as_strings": {
          "path_match": "*_nested.*_str",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}

In this case, each of the shoes, shirt, ... keys carries a _nested suffix to clearly indicate its type and, more importantly, holds an array of potentially many items, each of which is treated as a separate entity.

POST syntactic_nestedness/_doc
{
  "userid_int": 123,
  "shoes_nested": [
    {
      "size_int": 456,
      "brand_str": "str",
      "addeddate_dt": "2020/04/25 00:00:00"
    }
  ],
  "shirt_nested": [
    {
      "size_int": 123939,
      "brand_str": "str",
      "addeddate_dt": "2020/04/25 00:00:00",
      "color_str": "red"
    }
  ]
}

Then

GET syntactic_nestedness/_mapping

validating that we've got truly nested objects

{
  "syntactic_nestedness" : {
    "mappings" : {
      "dynamic_templates" : [
        ...
      ],
      "properties" : {
        "shirt_nested" : {
          "type" : "nested",
          "properties" : {
            "addeddate_dt" : {
              "type" : "date",
              "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
            },
            "brand_str" : {
              "type" : "text"
            },
            "color_str" : {
              "type" : "text"
            },
            "size_int" : {
              "type" : "integer"
            }
          }
        },
        "shoes_nested" : {
          "type" : "nested",
          "properties" : {
            "addeddate_dt" : {
              "type" : "date",
              "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
            },
            "brand_str" : {
              "type" : "text"
            },
            "size_int" : {
              "type" : "integer"
            }
          }
        },
        "userid_int" : {
          "type" : "long"
        }
      }
    }
  }
}
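
Because these fields are of type nested, searching them requires the nested query, with sub-fields referenced by their full path. A minimal sketch against the document indexed above (the specific conditions are illustrative assumptions, not part of the original answer):

```
GET syntactic_nestedness/_search
{
  "query": {
    "nested": {
      "path": "shirt_nested",
      "query": {
        "bool": {
          "must": [
            { "match": { "shirt_nested.color_str": "red" } },
            { "range": { "shirt_nested.size_int": { "gte": 100000 } } }
          ]
        }
      }
    }
  }
}
```

Both conditions must hold within the same array item of shirt_nested for the document to match, which is the main reason to pick the nested type over plain objects.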

Lastly, I chose the _nested suffix mostly for clarity, but it also keeps the nested template from being applied to top-level scalar keys like userid_int. You can achieve the same effect by choosing suitable match patterns for your own keys.
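
As for the original question of whether templates apply at any depth: match is tested against the field name alone, whereas path_match is tested against the full dotted path. So templates based on match should cover _int/_str fields at every level, including top-level ones such as userid_int. A sketch under that assumption (the index name any_depth_templates is hypothetical):

```
PUT any_depth_templates
{
  "mappings": {
    "dynamic_templates": [
      {
        "int_as_integers": {
          "match": "*_int",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "str_as_strings": {
          "match": "*_str",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}
```

After indexing a document into this index, GET any_depth_templates/_mapping can be used to check which types were actually assigned.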



Source: https://stackoverflow.com/questions/61436941/elastic-search-dynamic-fields-in-hierarchy-of-json
