Is there a better way in Elasticsearch (other than issuing a match_all query and manually averaging the lengths of all returned documents) to get the average document length?
In Elasticsearch 6.2 you can just use the following (no need to wrap the field in 'terms'):
"aggs" : {
  "avg_size" : {
    "avg" : { "field" : "_size" }
  }
}
See details here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html
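As a rough sketch of how this aggregation might be wrapped in a full request and its result read back, the Python below builds the search body and extracts the average from a response. The helper names `build_avg_size_query` and `extract_average` are hypothetical, the sample response is made up for illustration (it is not real cluster output), and no client library or network call is involved. Setting `"size": 0` is an addition here, assuming you only want the aggregation and not the hits.

```python
import json


def build_avg_size_query(agg_name="avg_size"):
    """Return a search body that averages the _size field over all documents."""
    return {
        "size": 0,  # assumption: skip returning hits, only the aggregation is needed
        "query": {"match_all": {}},
        "aggs": {agg_name: {"avg": {"field": "_size"}}},
    }


def extract_average(response, agg_name="avg_size"):
    """Read the avg value from a search response (None if nothing was averaged)."""
    return response["aggregations"][agg_name]["value"]


body = build_avg_size_query()
print(json.dumps(body, indent=2))

# Made-up response fragment, for illustration only.
sample_response = {"aggregations": {"avg_size": {"value": 1532.75}}}
print(extract_average(sample_response))
```

The printed body can be sent to the index's `_search` endpoint with whatever HTTP client you already use.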
Shot in the dark, but facets or aggregations combined with a script might do it.
{
  ...,
  "aggs" : {
    "avg_length" : { "avg" : { "script" : "doc['_all'].length" } }
  }
}
I have used this query (with _source enabled):
{
  "query" : { "match_all" : {} },
  "aggs" : {
    "avg_length" : { "avg" : { "script" : "_source.toString().length()" } }
  }
}
Well, that counts characters. If the strings are UTF-8 and you want the length in bytes instead:
{
  "query" : { "match_all" : {} },
  "aggs" : {
    "avg_length" : { "avg" : { "script" : "_source.toString().getBytes(\"UTF-8\").length" } }
  }
}
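To illustrate why the two scripts above can give different averages, here is a small Python sketch comparing character count with UTF-8 byte count. The function name `char_and_byte_length` and the sample strings are made up for this example. Note that Python's `len` counts code points, whereas Java's `String.length()` counts UTF-16 code units, so the two can disagree for characters outside the Basic Multilingual Plane.

```python
def char_and_byte_length(text):
    """Return (number of characters, number of UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))


# Sample strings, invented for illustration; the second contains two
# accented letters, each of which takes two bytes in UTF-8.
docs = ["plain ascii", "na\u00efve caf\u00e9"]

lengths = [char_and_byte_length(d) for d in docs]
avg_chars = sum(chars for chars, _ in lengths) / len(lengths)
avg_bytes = sum(nbytes for _, nbytes in lengths) / len(lengths)

print(lengths)
print(avg_chars, avg_bytes)
```

For pure ASCII content the two averages coincide; they drift apart as the share of multi-byte characters grows.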
The _size mapping field, if enabled, should give you the size of each document for free. Combining this with the avg
aggregation should get you what you want. Something like:
{
"query" : {"match_all" : {}},
"aggs" : {"avg_size" : {"avg" : {"terms" : {"field" : "_size"}}}}
}
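For the "if enabled" part, a minimal sketch of the mapping that turns on `_size` might look like the following. It assumes the mapper-size plugin is installed on the cluster. The helper `build_size_mapping` is hypothetical. The typeless form is what newer versions expect, while on 6.x the `_size` block sits under the mapping type name, so check the form against your version.

```python
import json


def build_size_mapping(type_name=None):
    """Build an index-creation body that enables the _size field.

    type_name=None gives the typeless form; passing a name (e.g. "_doc")
    nests the setting under that mapping type, as 6.x expects.
    """
    size_block = {"_size": {"enabled": True}}
    if type_name is None:
        mappings = size_block
    else:
        mappings = {type_name: size_block}
    return {"mappings": mappings}


print(json.dumps(build_size_mapping(), indent=2))
print(json.dumps(build_size_mapping("_doc"), indent=2))
```

As far as I know, `_size` is only recorded at index time, so documents indexed before the field was enabled would need to be reindexed to be counted in the average.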