Question
According to the Prometheus documentation, to get the 95th percentile from a histogram metric I can use the following query:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
Source: https://prometheus.io/docs/practices/histograms/#quantiles
Since each bucket of a histogram is a counter, we can calculate the rate of each bucket, which the documentation defines as:
per-second average rate of increase of the time series in the range vector.
See: https://prometheus.io/docs/prometheus/latest/querying/functions/#rate
So, for instance, if bucket value[t-5m] = 100 and bucket value[t] = 200, then bucket rate[t] = (200-100)/(5*60) = 0.333
And finally, the most confusing part: how can the histogram_quantile function find the 95th percentile for a given metric knowing only the bucket rates?
Is there any code or algorithm I can look at to better understand it?
Answer 1:
I believe this is the code for it in Prometheus
The general idea is that you use the data in the buckets to approximate the quantiles: find the bucket that the target rank falls into, then interpolate linearly between that bucket's lower and upper bounds.
Elasticsearch also does something similar (though different and much simpler) in its rollup capabilities
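To make the interpolation idea concrete, here is a minimal Python sketch. It is not the Prometheus source code; the function name mirrors the PromQL function only for readability, and the bucket values are made up. It assumes non-negative bucket bounds, a lower bound of 0 for the first bucket, and a final +Inf bucket.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_value) pairs, one per `le`
    label. The values may be raw counts or per-second rates.
    Simplified sketch: assumes 0 <= q <= 1 and non-negative bounds.
    """
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan  # need at least one finite bucket plus +Inf
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations in the window

    rank = q * total
    # First bucket whose cumulative value reaches the target rank.
    idx = next(i for i, (_, value) in enumerate(buckets) if value >= rank)
    upper, value = buckets[idx]

    if math.isinf(upper):
        # The rank lies above every finite bound, so the best available
        # answer is the highest finite bound.
        return buckets[-2][0]

    lower, prev = (0.0, 0.0) if idx == 0 else buckets[idx - 1]
    if value == prev:
        return upper  # degenerate case (e.g. q == 0 with an empty bucket)
    # Linear interpolation inside the bucket.
    return lower + (upper - lower) * (rank - prev) / (value - prev)


# Hypothetical cumulative buckets: le=0.1, le=0.5, le=1, le=+Inf
example = [(0.1, 50.0), (0.5, 90.0), (1.0, 98.0), (math.inf, 100.0)]
p95 = histogram_quantile(0.95, example)
print(p95)  # roughly 0.8125: the rank 95 lies 5/8 of the way through (0.5, 1.0]
```

The estimate can only be as precise as the bucket layout: observations are assumed to be spread evenly inside each bucket, so wide buckets give coarse quantiles.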
Answer 2:
You have to use rate because counters can be reset. rate automatically accounts for resets and gives you the correct per-second rate of increase. Remember to always apply rate to counters before using them in further calculations.
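As a rough illustration of what "accounts for resets" means, here is a small Python sketch. The helper names are hypothetical, and it deliberately leaves out the extrapolation to the window boundaries that the real rate function also performs; it only shows the reset handling.

```python
def increase_with_resets(samples):
    """Total increase of a counter across consecutive samples.

    Any drop in value is treated as a reset to zero, so the value seen
    after the drop counts entirely as new increase.
    """
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            total += cur          # counter restarted from 0
        else:
            total += cur - prev   # normal monotonic growth
        prev = cur
    return total


def simple_rate(samples, window_seconds):
    """Per-second average increase over the window (no extrapolation)."""
    return increase_with_resets(samples) / window_seconds


# No reset: 100 -> 200 over 5 minutes.
print(simple_rate([100, 200], 300))
# Reset between the 2nd and 3rd sample: 50 + 20 + 50 = 120 over 5 minutes.
print(simple_rate([100, 150, 20, 70], 300))
```

Taking the plain difference between the last and first sample in the second case would give a negative number, which is why raw counter values should not be used directly.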
Answer 3:
You can refer to my reply here
Actually, the rate() function is just used to specify the time window; the denominator has no effect on the computed percentile value, because dividing every bucket by the same number of seconds leaves the ratios between buckets unchanged.
Source: https://stackoverflow.com/questions/55162093/understanding-histogram-quantile-based-on-rate-in-prometheus
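A quick way to convince yourself of this is to locate the target rank once with the raw per-bucket increases and once with the same values divided by the window length. The Python sketch below uses a hypothetical helper and made-up numbers; it returns which bucket the rank falls into and how far into that bucket it lies, which is all the interpolation depends on.

```python
def rank_position(q, cumulative):
    """Return (bucket index, fraction inside that bucket) for quantile q.

    cumulative: cumulative bucket values in ascending `le` order,
    ending with the +Inf bucket.
    """
    rank = q * cumulative[-1]
    idx = next(i for i, value in enumerate(cumulative) if value >= rank)
    prev = cumulative[idx - 1] if idx > 0 else 0.0
    return idx, (rank - prev) / (cumulative[idx] - prev)


window_seconds = 300
increases = [150.0, 270.0, 294.0, 300.0]           # made-up 5m increases
rates = [v / window_seconds for v in increases]    # same data as rates

idx_inc, frac_inc = rank_position(0.95, increases)
idx_rate, frac_rate = rank_position(0.95, rates)
print(idx_inc, frac_inc)
print(idx_rate, frac_rate)
```

Both calls pick the same bucket and (up to floating-point rounding) the same fraction, so the interpolated quantile is identical. What does change the result is the window itself, since [5m] and [1h] select different sets of observations.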