Question
I have an index with fields:
- room_name
- start_date (start time room is used)
- end_date (end time room is used)
I am writing a curl command to find the hours during which each room was in use.
Is this possible?
Here is my current curl command:
curl -XGET "https://localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs": {
"room_bucket":{
"terms": {
"field": "room_name.keyword"
},
"aggs":{
"hour_bucket": {
"terms": {
"script": {
"inline": "def l = doc[\"start_date\"].value;\nif ( l <= 20 && l >= 9 ) {\n return l;\n}",
"lang": "painless"
},
"order": {
"_key": "asc"
},
"value_type": "long"
}
}
}
}
}
}'
Here is the result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 10,
"end_date" : 11,
"room_name" : "room_Y"
}
},
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 11,
"end_date" : 13,
"room_name" : "room_V"
}
},
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 10,
"end_date" : 12,
"room_name" : "room_Y"
}
}
]
},
"aggregations" : {
"room_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "room_V",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 11,
"doc_count" : 1
}
]
}
},
{
"key" : "room_Y",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 10,
"doc_count" : 1
}
]
}
}
]
}
}
}
But my expected result in the "aggregations" is the following:
"aggregations" : {
"room_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "room_V",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 11,
"doc_count" : 1
},
{
"key" : 12,
"doc_count" : 1
},
{
"key" : 13,
"doc_count" : 1
}
]
}
},
{
"key" : "room_Y",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 10,
"doc_count" : 2
},
{
"key" : 11,
"doc_count" : 2
},
{
"key" : 12,
"doc_count" : 1
}
]
}
}
]
}
}
The current result only reflects start_date.
In the expected output, however, room_V should have the keys 11, 12, and 13 (each with a doc_count of 1) because, according to start_date and end_date, the room was in use from 11 to 13.
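Answer 1:
You can achieve this by using LongStream in the script to build an array of every hour in the interval from start_date to end_date (both inclusive). When a terms aggregation script returns an array, the document is counted once in the bucket for each value, like this: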
curl -XGET "https://localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs": {
"room_bucket": {
"terms": {
"field": "room_name.keyword"
},
"aggs": {
"hour_bucket": {
"terms": {
"script": {
"inline": "return LongStream.rangeClosed(doc.start_date.value, doc.end_date.value).toArray();",
"lang": "painless"
},
"order": {
"_key": "asc"
},
"value_type": "long"
}
}
}
}
}
}'
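To sanity-check what this query should return, the same expansion can be reproduced locally. The following is only a minimal Python sketch, not part of the original answer: it assumes the three sample documents from the question's hits, and the helper name hour_buckets is hypothetical. It mimics LongStream.rangeClosed by treating both start_date and end_date as inclusive.

```python
from collections import Counter, defaultdict

# The three sample documents shown in the question's search hits.
docs = [
    {"room_name": "room_Y", "start_date": 10, "end_date": 11},
    {"room_name": "room_V", "start_date": 11, "end_date": 13},
    {"room_name": "room_Y", "start_date": 10, "end_date": 12},
]


def hour_buckets(documents):
    """Count, per room, how many documents cover each hour.

    Hypothetical helper: it imitates the nested terms aggregations
    (room_bucket -> hour_bucket) with a plain dictionary of counters.
    """
    buckets = defaultdict(Counter)
    for doc in documents:
        # Like LongStream.rangeClosed(start, end): both ends are inclusive.
        for hour in range(doc["start_date"], doc["end_date"] + 1):
            buckets[doc["room_name"]][hour] += 1
    # Sort the hour keys ascending, like "order": {"_key": "asc"}.
    return {room: dict(sorted(counts.items())) for room, counts in buckets.items()}


result = hour_buckets(docs)
for room, hours in sorted(result.items()):
    print(room, hours)
```

For the sample data, the per-hour counts agree with the hour_bucket entries in the expected output from the question: room_V is counted once for each of 11, 12, and 13, and room_Y twice for 10 and 11 and once for 12.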
Source: https://stackoverflow.com/questions/54195303/how-to-derive-a-field-from-two-fields-in-an-elasticsearch-index