Question
I'm going to have a JSON document that contains nested objects. For example:
{
"userid_int" : <integer>,
"shoes" : {
"size_int" : <integer>,
"brand_str" : <string>,
"addeddate_dt" : <date>
},
"shirt" : {
"size_int" : <integer>,
"brand_str" : <string>,
"addeddate_dt" : <date>,
"color_str" : <string>
},
...
}
There is no limit on what the nested fields could be. For example, I may want a new key "pyjamas" for a particular document, but this is not known upfront when the index is created.
All I want to know is whether dynamic field mapping applies across the whole JSON document, including inside the nesting at any level, or not.
Would this mapping work for all _int/_str/etc. fields inside the nested fields?
PUT my_index
{
"mappings": {
"dynamic_templates": [
{
"_int_as_integers": {
"match": "*_int",
"mapping": {
"type": "integer"
}
},
"_str_as_strings": {
"match": "*_str",
"mapping": {
"type": "string"
}
},
...
}
]
}
}
Answer 1:
It depends on what you mean by nested. There is what I call trivial nested-ness, whereby you have objects within objects, and there is syntactic nested-ness, which has a dedicated type (nested) and certain inherent properties, the most important of which is the ability to treat the nested sub-objects as completely separate entities. The caveat is that queries then need to adhere to a certain syntax.
Two remarks:
- your dynamic template was almost correct; you just have to make sure the individual templates are separate object definitions (see below)
- the type string has been deprecated for a while in favor of text (or keyword for exact-match values)
- Trivial nested-ness
PUT trivial_nestedness
{
"mappings": {
"dynamic_templates": [
{
"int_as_integers": {
"path_match": "*.*_int",
"mapping": {
"type": "integer"
}
}
},
{
"str_as_strings": {
"path_match": "*.*_str",
"mapping": {
"type": "text"
}
}
}
]
}
}
POST trivial_nestedness/_doc
{
"userid_int" : 123,
"shoes" : {
"size_int" : 456,
"brand_str" : "str",
"addeddate_dt" : "2020/04/25 00:00:00"
},
"shirt" : {
"size_int" : 123939,
"brand_str" : "str",
"addeddate_dt" : "2020/04/25 00:00:00",
"color_str" : "red"
}
}
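then
GET trivial_nestedness/_mapping
yielding the following (note that userid_int stays a long, because the path_match pattern *.*_int requires a dot and so does not match top-level keys)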
{
"trivial_nestedness" : {
"mappings" : {
"dynamic_templates" : [
...
],
"properties" : {
"shirt" : {
"properties" : {
"addeddate_dt" : {
"type" : "date",
"format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
},
"brand_str" : {
"type" : "text"
},
"color_str" : {
"type" : "text"
},
"size_int" : {
"type" : "integer"
}
}
},
"shoes" : {
"properties" : {
"addeddate_dt" : {
"type" : "date",
"format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
},
"brand_str" : {
"type" : "text"
},
"size_int" : {
"type" : "integer"
}
}
},
"userid_int" : {
"type" : "long"
}
}
}
}
}
- Syntactic nested-ness
PUT syntactic_nestedness
{
"mappings": {
"dynamic_templates": [
{
"possibly_nested_obj": {
"match": "*_nested",
"mapping": {
"type": "nested"
}
}
},
{
"int_as_integers": {
"path_match": "*_nested.*_int",
"mapping": {
"type": "integer"
}
}
},
{
"str_as_strings": {
"path_match": "*_nested.*_str",
"mapping": {
"type": "text"
}
}
}
]
}
}
In this case, each shoes, shirt, ... object has a _nested suffix to clearly indicate its type and, more importantly, is an array of potentially many items which are treated as separate entities.
POST syntactic_nestedness/_doc
{
"userid_int": 123,
"shoes_nested": [
{
"size_int": 456,
"brand_str": "str",
"addeddate_dt": "2020/04/25 00:00:00"
}
],
"shirt_nested": [
{
"size_int": 123939,
"brand_str": "str",
"addeddate_dt": "2020/04/25 00:00:00",
"color_str": "red"
}
]
}
Then
GET syntactic_nestedness/_mapping
validating that we've got truly nested objects
{
"syntactic_nestedness" : {
"mappings" : {
"dynamic_templates" : [
...
],
"properties" : {
"shirt_nested" : {
"type" : "nested",
"properties" : {
"addeddate_dt" : {
"type" : "date",
"format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
},
"brand_str" : {
"type" : "text"
},
"color_str" : {
"type" : "text"
},
"size_int" : {
"type" : "integer"
}
}
},
"shoes_nested" : {
"type" : "nested",
"properties" : {
"addeddate_dt" : {
"type" : "date",
"format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
},
"brand_str" : {
"type" : "text"
},
"size_int" : {
"type" : "integer"
}
}
},
"userid_int" : {
"type" : "long"
}
}
}
}
}
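As mentioned at the start, the price of the nested type is a dedicated query syntax. The request below is a minimal sketch of that, assuming the syntactic_nestedness index and the sample document from above; the choice of fields and values is only illustrative. Both conditions have to hold within the same shirt_nested item for the document to match, and the fields are referenced by their full path.

```
GET syntactic_nestedness/_search
{
  "query": {
    "nested": {
      "path": "shirt_nested",
      "query": {
        "bool": {
          "must": [
            { "match": { "shirt_nested.color_str": "red" } },
            { "term": { "shirt_nested.size_int": 123939 } }
          ]
        }
      }
    }
  }
}
```

In the trivial (plain object) variant no wrapper is needed; a field such as shoes.size_int can be queried directly using dot notation.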
Lastly, while I chose the _nested suffix mostly for clarity, I also wanted to avoid top-level keys like userid_int. This, though, can be elegantly solved by your keys' match patterns.
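One way to read that last point, as a sketch rather than the only option: match is tested against the field name alone, whereas path_match is tested against the full dotted path. A template using match should therefore also cover top-level keys such as userid_int, as well as the same suffixes at deeper levels. The index name any_level_match below is hypothetical.

```
PUT any_level_match
{
  "mappings": {
    "dynamic_templates": [
      {
        "int_as_integers": {
          "match": "*_int",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "str_as_strings": {
          "match": "*_str",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}
```

After indexing the sample document from above, GET any_level_match/_mapping can be used to check which type each field ended up with.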
Source: https://stackoverflow.com/questions/61436941/elastic-search-dynamic-fields-in-hierarchy-of-json