What I want to achieve is aggregation by unique (city, state) pairs. As per the Elasticsearch documentation, the terms aggregation does not support collecting terms from multiple fields in the same document.
I've managed to solve the problem by applying a transform:
"transform": {
  "script": "ctx._source['address']['cityState'] = ctx._source['address']['city'] + ', ' + ctx._source['address']['state']"
}
and then aggregating on the newly added field. Works as expected!
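For reference, the aggregation on the combined field might look roughly like the sketch below. The aggregation name by_city_state is just a placeholder, and this assumes address.cityState is mapped as a not_analyzed string; otherwise the terms aggregation would probably bucket on individual tokens rather than on the whole "city, STATE" value.

```json
{
  "size": 0,
  "aggs": {
    "by_city_state": {
      "terms": {
        "field": "address.cityState"
      }
    }
  }
}
```

Each bucket key should then be one unique "city, STATE" string, with doc_count giving the number of matching documents.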
I don't believe there is a way to sort on the inner doc_count across multiple buckets. In ES 2.0 (still in beta) you'll be able to take action on aggregations, but that's not possible in ES 1.x.