Elasticsearch exclude top hit on field value

遇见更好的自我 2021-01-17 19:23
{'country': 'France', 'collected': '2018-03-12', 'active': true}
{'country': 'France', 'collected': '2018-03-13', 'active': true}
{\'country\': \         


        
4 Answers
  •  悲哀的现实
    2021-01-17 20:26

    In general, you can nest aggregations as deeply as you need. In this case, adding a filter bucket aggregation between the terms and top_hits aggregations should achieve the desired outcome.

    {
      "size": 0,
      "aggs": {
        "group": {
          "terms": { "field": "country" },
          "aggs": {
            "active_in_group": {
              "filter" : { "term": { "active": true } },
              "aggs": {
                "group_docs": {
                  "top_hits": {
                    "size": 1,
                    "sort": [
                      { "collected": { "order": "desc" } }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
    

    Here you have:

    Agg level 1 - terms bucket; what is the count of each country in your result set (active or inactive)

    Agg level 2 - filter bucket; what is the count of active items within each country bucket

    Agg level 3 - top hits; what is the top result (most recently collected, according to your sort) of the active items within each country bucket

    As you can see, a nested aggregation only ever operates on the documents in the bucket it is nested within.
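
To make the three levels concrete, here is a minimal pure-Python sketch that mimics the same terms, filter, top_hits nesting on a handful of documents. It does not talk to Elasticsearch; the sample documents and the function name `latest_active_per_country` are made up for illustration, modelled on the fields in the question.

```python
from collections import defaultdict

# Hypothetical sample documents, modelled on the fields in the question.
docs = [
    {"country": "France", "collected": "2018-03-12", "active": True},
    {"country": "France", "collected": "2018-03-13", "active": True},
    {"country": "France", "collected": "2018-03-14", "active": False},
    {"country": "Canada", "collected": "2018-02-01", "active": False},
    {"country": "Canada", "collected": "2018-01-15", "active": True},
]


def latest_active_per_country(documents):
    # Level 1 (terms): bucket every document by country, active or not.
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc["country"]].append(doc)

    result = {}
    for country, members in buckets.items():
        # Level 2 (filter): keep only the active documents in this bucket.
        active = [d for d in members if d["active"]]
        # Level 3 (top_hits, size 1, collected desc): newest active document.
        # ISO-8601 date strings compare correctly as plain strings.
        top = max(active, key=lambda d: d["collected"]) if active else None
        result[country] = {
            "doc_count": len(members),
            "active_doc_count": len(active),
            "top_hit": top,
        }
    return result


summary = latest_active_per_country(docs)
for country, info in summary.items():
    print(country, info["doc_count"], info["active_doc_count"], info["top_hit"])
```

Note that the newest French document overall (2018-03-14) is inactive, so the top hit for France is the 2018-03-13 document: the sort only ever sees what the filter let through.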

    One thing I'm unclear on is whether you want the count within each country bucket to reflect only the active items, both the active and the inactive items, or whether you don't care about the counts at all and are only using the terms buckets to get the top hit within each country.

    If you want the counts to reflect only the active items, swap the terms and filter aggregations so that the filter is outermost. If you want the counts to include both active and inactive items, keep this order. If you don't care about the counts, the order doesn't matter.
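
As a rough sketch of the two orderings, the helper below builds either request body as a Python dict and prints it as JSON. The function names `build_query` and `top_hits_agg` and the aggregation name `active_only` are invented here; the field names and the other aggregation names come from the query above.

```python
import json


def top_hits_agg():
    # Innermost level: the single most recently collected document.
    return {"top_hits": {"size": 1, "sort": [{"collected": {"order": "desc"}}]}}


def build_query(count_active_only):
    terms = {"terms": {"field": "country"}}
    active = {"filter": {"term": {"active": True}}}
    if count_active_only:
        # Filter outermost: country bucket counts cover active documents only.
        terms["aggs"] = {"group_docs": top_hits_agg()}
        active["aggs"] = {"group": terms}
        return {"size": 0, "aggs": {"active_only": active}}
    # Terms outermost (the order used above): country counts cover everything,
    # and the nested filter bucket carries the active-only count.
    active["aggs"] = {"group_docs": top_hits_agg()}
    terms["aggs"] = {"active_in_group": active}
    return {"size": 0, "aggs": {"group": terms}}


print(json.dumps(build_query(count_active_only=True), indent=2))
```

One side effect to be aware of: with the filter outermost and the terms aggregation's default settings, a country that has no active documents should not get a bucket at all, whereas with the original order it still appears with an empty top hit.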

    This will of course add a level of aggregation to your results (the count of active items within each country), but that should be easy enough to overcome / ignore when parsing the results.
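
For the parsing step, something along these lines should be enough. The response below is hand-written to follow the shape Elasticsearch returns for the query above (trimmed of fields such as `total` and `max_score`), with illustrative values; `latest_active_docs` is a hypothetical helper name.

```python
# Hand-written, trimmed response for illustration; not output from a real cluster.
response = {
    "aggregations": {
        "group": {
            "buckets": [
                {
                    "key": "France",
                    "doc_count": 3,
                    "active_in_group": {
                        "doc_count": 2,
                        "group_docs": {
                            "hits": {
                                "hits": [
                                    {
                                        "_source": {
                                            "country": "France",
                                            "collected": "2018-03-13",
                                            "active": True,
                                        }
                                    }
                                ]
                            }
                        },
                    },
                },
                {
                    "key": "Canada",
                    "doc_count": 1,
                    "active_in_group": {
                        "doc_count": 0,
                        "group_docs": {"hits": {"hits": []}},
                    },
                },
            ]
        }
    }
}


def latest_active_docs(resp):
    """Map each country key to the _source of its newest active document."""
    latest = {}
    for bucket in resp["aggregations"]["group"]["buckets"]:
        # Step through the extra filter level to reach the top_hits result.
        hits = bucket["active_in_group"]["group_docs"]["hits"]["hits"]
        if hits:  # countries with no active documents have an empty list
            latest[bucket["key"]] = hits[0]["_source"]
    return latest


latest = latest_active_docs(response)
print(latest)
```

The extra level only costs one more key lookup (`active_in_group`) per bucket.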

    This solution has been verified to work in Elasticsearch 6.x, but it looks like you are still on Elasticsearch 1.x, since you're using search_type=count, which was deprecated in Elasticsearch 2.x. The solution should still work there, because these specific aggregations haven't changed for some time, but I can't rule out a 1.x bug that has since been patched; 1.x is very out of date. For future reference, Elasticsearch changes a lot from version to version, so include your version in any question about it and check the version on any answer. In any event, I'd recommend upgrading if you can.
