Elasticsearch: Run aggregation on field & filter out specific values using a regexp not matching values

限于喜欢 submitted on 2019-12-24 06:40:42

Question


I'm trying to run an aggregation on a field and ignore specific values. I have a field, path, that holds many different URL paths.

{
   "size": 0,
   "aggs": {
      "paths": {
            "terms":{
               "field": "path" // Count the no unique path ~> values
            }

      }
   },
   "filter": {
      "bool": {
         "must_not": [
            {
               "regexp": {
                  // path MUST NOT CONTAIN media | cache
                  "path": {
                    "value": "(\/media\b|\bcache\b)"
                  }
               }
            }
         ]
      }
   }
}

When I run this, documents whose path contains cache or media are not filtered out.

The results are the same whether or not the filter is included.


Answer 1:


You could try excluding those values inside the terms aggregation, like this:

{
  "size": 0,
  "aggs": {
    "path": {
      "terms": {
        "field": "path",
        "exclude": ".*(media|cache).*"
      }
    }
  }
}

Caution, from the documentation:

Note: The performance of a regexp query heavily depends on the regular expression chosen. Matching everything like .* is very slow as well as using lookaround regular expressions. If possible, you should try to use a long prefix before your regular expression starts.

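Another approach would be to get rid of those documents at the query stage: move your filter into the query and then aggregate on the remaining results. A top-level filter is applied only to the returned hits, after the aggregations have been computed (later Elasticsearch versions call it post_filter), which is why it has no effect on the aggregation in your request.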
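As a rough sketch of that query-stage approach, the request below moves the exclusion into a bool query. It assumes path is indexed as a single un-analyzed term (not_analyzed or keyword), since a regexp query is matched against whole indexed terms; that is also why the pattern needs the leading and trailing .* to behave like "contains". Treat it as an illustration, not a tested query for your mapping.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": [
        { "regexp": { "path": ".*(media|cache).*" } }
      ]
    }
  },
  "aggs": {
    "paths": {
      "terms": { "field": "path" }
    }
  }
}
```

Note the difference from exclude: this drops whole documents before aggregating, whereas exclude only hides the matching buckets. For a single-valued path field the resulting buckets should be the same.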

EDIT: With date filter

You could add a date filter to the query so that only the past day's results are aggregated. Something like this would work:

{
  "query": {
    "range": {
      "name_of_date_field": {
        "gte": "now-1d"
      }
    }
  },
  "size": 0,
  "aggs": {
    "path": {
      "terms": {
        "field": "path",
        "exclude": ".*(media|cache).*"
      }
    }
  }
}
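
If you would rather exclude the documents at the query stage than use exclude, the date range and the exclusion can probably be combined in a single bool query. This is a sketch only: name_of_date_field is the same placeholder as above, and path is assumed to be indexed as a single un-analyzed term.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "range": { "name_of_date_field": { "gte": "now-1d" } } }
      ],
      "must_not": [
        { "regexp": { "path": ".*(media|cache).*" } }
      ]
    }
  },
  "aggs": {
    "path": {
      "terms": { "field": "path" }
    }
  }
}
```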


Source: https://stackoverflow.com/questions/39737104/elasticsearch-run-aggregation-on-field-filter-out-specific-values-using-a-reg
