I want to aggregate by unique (city, state) pairs. According to the Elasticsearch documentation, "the terms aggregation does not support collecting terms from multiple fields in the same document", so I created a nested aggregation like this:
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "address.city",
        "size": 12
      },
      "aggs": {
        "states": {
          "terms": {
            "field": "address.stateOrProvince"
          },
          "aggs": {
            "topCity": {
              "top_hits": {
                "size": 1,
                "sort": [
                  { "price.value": { "order": "desc" } }
                ]
              }
            }
          }
        }
      }
    }
  }
}
This aggregation returns a response like this:
{
  "aggregations": {
    "cities": {
      "buckets": [
        {
          "key": "las vegas",
          "doc_count": 5927,
          "states": {
            "buckets": [
              { "key": "nv", "doc_count": 5840 },
              { "key": "nm", "doc_count": 85 }
            ]
          }
        },
        {
          "key": "jacksonville",
          "doc_count": 5689,
          "states": {
            "buckets": [
              { "key": "fl", "doc_count": 2986 },
              { "key": "nc", "doc_count": 1962 },
              { "key": "ar", "doc_count": 290 }
            ]
          }
        }
      ]
    }
  }
}
How can I get the results ordered by the innermost doc_count?
The expected ordering is:
- las vegas, nv (5840)
- jacksonville, fl (2986)
- jacksonville, nc (1962)
- jacksonville, ar (290)
- las vegas, nm (85)
I don't believe there is a way to sort on the inner doc_count across multiple buckets. In ES 2.0 (still in beta) you'll be able to take action on aggregations, but that's not possible in ES 1.x.
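If sorting on the server is not an option, one workaround is to flatten the nested buckets on the client and sort them there. Below is a minimal Python sketch of that idea, assuming the response has the shape shown in the question. The function name `flatten_city_state_buckets` is made up for illustration, and the `response` dict is just the sample data from the question, not the output of a real client call.

```python
def flatten_city_state_buckets(response):
    """Flatten cities -> states buckets into (city, state, doc_count) rows,
    sorted by the inner doc_count in descending order."""
    rows = []
    for city in response["aggregations"]["cities"]["buckets"]:
        for state in city["states"]["buckets"]:
            rows.append((city["key"], state["key"], state["doc_count"]))
    # Sort by the innermost doc_count, largest first.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Sample response copied from the question (top_hits results omitted).
response = {
    "aggregations": {
        "cities": {
            "buckets": [
                {
                    "key": "las vegas",
                    "doc_count": 5927,
                    "states": {
                        "buckets": [
                            {"key": "nv", "doc_count": 5840},
                            {"key": "nm", "doc_count": 85},
                        ]
                    },
                },
                {
                    "key": "jacksonville",
                    "doc_count": 5689,
                    "states": {
                        "buckets": [
                            {"key": "fl", "doc_count": 2986},
                            {"key": "nc", "doc_count": 1962},
                            {"key": "ar", "doc_count": 290},
                        ]
                    },
                },
            ]
        }
    }
}

for city, state, count in flatten_city_state_buckets(response):
    print(f"{city}, {state} ({count})")
```

Note that this only reorders the buckets Elasticsearch actually returned. With "size": 12 on the outer terms aggregation, a (city, state) pair belonging to a city outside the top 12 would never reach the client, so the result is an approximation of a true global ordering.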
I managed to solve the problem by applying a transform:
"transform": {
  "script": "ctx._source['address']['cityState'] = ctx._source['address']['city'] + ', ' + ctx._source['address']['state']"
}
and then aggregating on the newly added field. It works as expected!
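For illustration, the aggregation on the newly added field could look roughly like the request body built below. This is a sketch under assumptions: the field path `address.cityState` is taken from the transform script, while the aggregation name `cityStates` and the helper `city_state_terms_request` are hypothetical. It also assumes the combined field is mapped so that the whole "city, state" string is indexed as a single term; otherwise the terms aggregation would bucket on individual tokens.

```python
import json


def city_state_terms_request(field="address.cityState", size=12):
    """Build a request body with a single terms aggregation on the combined field.

    A terms aggregation orders its buckets by doc_count (descending) by default,
    so no explicit "order" clause is needed for the ordering asked about here.
    """
    return {
        "size": 0,  # return only aggregation results, no search hits
        "aggs": {
            # "cityStates" is an arbitrary name chosen for this example.
            "cityStates": {
                "terms": {"field": field, "size": size}
            }
        },
    }


print(json.dumps(city_state_terms_request(), indent=2))
```

Because each bucket key is then a single "city, state" string, the buckets come back already in the desired order and no client-side merging is required.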
Source: https://stackoverflow.com/questions/32908712/elasticsearch-aggregation-order-by-nested-bucket-doc-count