Question
I am using the Spring Boot Actuator dependency to get insights into my application, together with Spring Boot Admin. The client-server configuration is working fine. I need to measure the count, total time, and max for the endpoints that are executed.
uri:/user/asset/getAllAssets
TOTAL_TIME: 831ms
MAX: 0ms
uri:/user/getEmployee/{employeeId}
TOTAL_TIME: 98ms
MAX: 0ms
Why is MAX (time) 0 while TOTAL_TIME is X ms?
When I query the general form
localhost:8889/actuator/metrics/http.server.requests
I get a MAX of about 3.00.
I have also read production-ready-features, but I was not able to find any description of how MAX is calculated or what it represents.
Notes: as the number of requests increases, COUNT and TOTAL_TIME also increase, but MAX sometimes decreases (see Request 1 and Request 2 for details).
Request 1: http.server.requests
{
  "name": "http.server.requests",
  "description": null,
  "baseUnit": "seconds",
  "measurements": [
    { "statistic": "COUNT", "value": 597 },
    { "statistic": "TOTAL_TIME", "value": 144.9057076 },
    { "statistic": "MAX", "value": 3.0002913 }
  ],
  "availableTags": [
    { "tag": "exception", "values": ["None"] },
    { "tag": "method", "values": ["GET"] },
    {
      "tag": "uri",
      "values": [
        "/actuator/metrics/{requiredMetricName}",
        "/**/favicon.ico",
        "/actuator",
        "/user/getEmployee/{employeeId}",
        "/user/asset/getAllAssets",
        "/actuator/health",
        "/actuator/info",
        "/actuator/env/{toMatch}",
        "/actuator/metrics",
        "/**"
      ]
    },
    { "tag": "outcome", "values": ["CLIENT_ERROR", "SUCCESS"] },
    { "tag": "status", "values": ["404", "200"] }
  ]
}
UPDATE
localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/2
Response 404 (I have executed /user/getEmployee/2 before making a request for actuator)
localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/{employeeId}
Response 400
localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/asset/getAllAssets
{
  "name": "http.server.requests",
  "description": null,
  "baseUnit": "seconds",
  "measurements": [
    { "statistic": "COUNT", "value": 1 },
    { "statistic": "TOTAL_TIME", "value": 0.8311609 },
    { "statistic": "MAX", "value": 0 }
  ],
  "availableTags": [
    { "tag": "exception", "values": ["None"] },
    { "tag": "method", "values": ["GET"] },
    { "tag": "outcome", "values": ["SUCCESS"] },
    { "tag": "status", "values": ["200"] }
  ]
}
Request 2: http.server.requests
localhost:8889/actuator/metrics/http.server.requests
{
  "name": "http.server.requests",
  "description": null,
  "baseUnit": "seconds",
  "measurements": [
    { "statistic": "COUNT", "value": 3346 },
    { "statistic": "TOTAL_TIME", "value": 559.7992767999998 },
    { "statistic": "MAX", "value": 2.3612968 }
  ],
Answer 1:
You can see the metrics for an individual endpoint by appending ?tag=uri:{endpoint_tag}, using one of the uri values listed in the response of the root /actuator/metrics/http.server.requests call. The measurement values are:
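- COUNT: Documented as the rate per second for calls. In the responses shown in the question it behaves as a running count of recorded calls (1 after a single request to /user/asset/getAllAssets).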
- TOTAL_TIME: The sum of the times recorded. Reported in the monitoring system's base unit of time
- MAX: The maximum amount recorded. When this represents a time, it is reported in the monitoring system's base unit of time.
As given here, also here.
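The question reports a 400 response when the uri tag value contains {employeeId}. One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the braces were sent unencoded in the query string. A minimal sketch of building the request URL with the tag expression percent-encoded follows; the class and method names are hypothetical, and only java.net.URLEncoder from the JDK is used.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spring Boot or Micrometer.
class MetricUrlSketch {

    // Builds <baseUrl>/actuator/metrics/<metricName>?tag=<key:value>,
    // percent-encoding the whole tag expression (':' '/' '{' '}' included).
    static String metricUrl(String baseUrl, String metricName, String tagKey, String tagValue) {
        String tagExpression = tagKey + ":" + tagValue;
        String encoded = URLEncoder.encode(tagExpression, StandardCharsets.UTF_8);
        return baseUrl + "/actuator/metrics/" + metricName + "?tag=" + encoded;
    }

    public static void main(String[] args) {
        // The tag value is copied from the "uri" entry of availableTags in the question.
        System.out.println(metricUrl(
                "http://localhost:8889",
                "http.server.requests",
                "uri",
                "/user/getEmployee/{employeeId}"));
    }
}
```

Note that the tag value is the URI template listed under availableTags (/user/getEmployee/{employeeId}), not a concrete path such as /user/getEmployee/2, which matches the 404 the question reports for the concrete path.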
The discrepancy you are seeing is due to a timer: after some time, the current MAX value for any tagged metric can be reset to 0. Can you make some new calls to /user/asset/getAllAssets and then immediately call /actuator/metrics/http.server.requests to see a non-zero MAX value for that tag?
This follows from the idea of reporting MAX for each smaller period. When you collect these metrics over time, you get a series of MAX values, one per period, rather than a single value covering a long period of time.
You can see this in action in the Micrometer source code. There is a rotate() method that resets the MAX value to produce the behaviour described above. It is called on every poll() call, which is triggered periodically for metric gathering.
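The rotate-and-poll behaviour described above can be illustrated with a small standalone model. This is a simplified sketch written for this thread, not Micrometer's actual code: the class name, the explicit nowMillis parameter, and the choice of 3 buckets rotated every 40 seconds are all assumptions made for illustration.

```java
// Simplified model of a max that decays through a ring buffer of buckets.
// Hypothetical class; time is passed in explicitly so the run is deterministic.
class RollingMaxSketch {
    private final long[] ring;             // one max per bucket
    private final long rotateEveryMillis;  // interval between rotations
    private int current = 0;               // bucket that poll() reads
    private long lastRotateMillis = 0;     // time of the last applied rotation

    RollingMaxSketch(int bufferLength, long rotateEveryMillis) {
        this.ring = new long[bufferLength];
        this.rotateEveryMillis = rotateEveryMillis;
    }

    // Records a sample into every bucket, so it survives until each bucket is reset.
    void record(long sample, long nowMillis) {
        rotate(nowMillis);
        for (int i = 0; i < ring.length; i++) {
            ring[i] = Math.max(ring[i], sample);
        }
    }

    // Returns the max held by the current bucket after applying due rotations.
    long poll(long nowMillis) {
        rotate(nowMillis);
        return ring[current];
    }

    // For each elapsed interval, resets the current bucket and moves to the next one.
    private void rotate(long nowMillis) {
        while (nowMillis - lastRotateMillis >= rotateEveryMillis) {
            ring[current] = 0;
            current = (current + 1) % ring.length;
            lastRotateMillis += rotateEveryMillis;
        }
    }

    public static void main(String[] args) {
        RollingMaxSketch max = new RollingMaxSketch(3, 40_000);
        max.record(831, 0);                       // one slow request at t = 0
        System.out.println(max.poll(1_000));      // still visible shortly after
        System.out.println(max.poll(100_000));    // visible after two rotations
        System.out.println(max.poll(120_000));    // reset after the third rotation
    }
}
```

In this model a sample stays visible for between two and three rotation intervals and then drops to 0 if nothing new is recorded, which is the same shape as the MAX: 0 reading in the question alongside a non-zero TOTAL_TIME.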
Answer 2:
The MAX metric is a rolling max: it represents the maximum measurement in a rolling window.
For example, if you were to scrape your metrics every minute:
           Total (ms)  Count  Max (ms)
Minute 1   100         1      100
Minute 2   500         101    90
Minute 3   4500        1000   10
Minute 4   4500        1000   0
In minute 1 you had 1 request and a total of 100ms, so the average duration was 100ms, and the slowest (the max) was 100ms.
In minute 2 the total has increased by 400 (since total is cumulative) and the count has increased by 100, so the average is 4ms. However, since the max is 90ms, you know that while most of your requests in that minute were fast, there were still some that were slower.
In minute 3 you had 899 more requests (count) and 4000ms added to the total (4000/899 = ~4.4ms), so your average measurement was 4.4ms and the max was 10ms.
So the purpose of MAX is to measure the worst outlier, so you know how consistently the code is performing.
Looking at minute 4, the total and count haven't increased because there were no requests. Since there were no requests, there couldn't be a 'slowest' request for MAX, and that is why MAX is 0.
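The per-minute averages in this walkthrough come from differencing the cumulative Total and Count between two scrapes. A minimal sketch of that arithmetic, using the numbers from the table above; the class and method names are hypothetical, and returning NaN for an interval without requests is a choice made here, not something the answer specifies.

```java
// Hypothetical helper that derives a per-interval average from cumulative values.
class IntervalAverageSketch {

    // Average duration of the requests made between two scrapes.
    // Returns NaN when no requests were made, since no average exists then.
    static double intervalAverage(double prevTotal, long prevCount,
                                  double currTotal, long currCount) {
        long deltaCount = currCount - prevCount;
        if (deltaCount == 0) {
            return Double.NaN;
        }
        return (currTotal - prevTotal) / deltaCount;
    }

    public static void main(String[] args) {
        // Cumulative values scraped at minutes 1 to 4 (Total in ms).
        double[] total = {100, 500, 4500, 4500};
        long[] count = {1, 101, 1000, 1000};

        for (int i = 1; i < total.length; i++) {
            double avg = intervalAverage(total[i - 1], count[i - 1], total[i], count[i]);
            System.out.println("Minute " + (i + 1) + " average (ms): " + avg);
        }
    }
}
```

MAX cannot be derived this way from Total and Count, which is why it is reported as its own statistic.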
Answer 3:
- What does MAX represent?
MAX represents the maximum time taken to execute the endpoint.
Analysis for /user/asset/getAllAssets
COUNT  TOTAL_TIME  MAX
5      115         17
6      122         17  (execution time = 122 - 115 = 7)
7      131         17  (execution time = 131 - 122 = 9)
8      187         56  (execution time = 187 - 131 = 56)
9      204         56  (execution time = 204 - 187 = 17; from now on MAX will be 56)
- Will MAX be 0 if we have a small number of requests (or 1 request) to a particular endpoint?
No, the number of requests to a particular endpoint does not affect MAX.
- When will MAX be 0?
There is a timer that resets the value to 0. When the endpoint has not been called or executed for some time, the timer sets MAX to 0. Here the approximate timer value is 2 minutes 30 seconds (150 seconds).
- How did I determine the timer value?
I took 6 samples (executed the same endpoint 6 times). For each one, I measured the difference between the time the endpoint was called and the time MAX was set back to zero.
DistributionStatisticConfig has .expiry(Duration.ofMinutes(2)).bufferLength(3), which sets some measurements to 0 if no request has been made within the expiry or rotation time.
The MAX property belongs to the enum Statistic, which is used by Measurement (in Measurement we get COUNT, TOTAL_TIME, MAX):
public static final Statistic MAX
The maximum amount recorded. When this represents a time, it is reported in the monitoring system's base unit of time.
Notes:
These are the cases for the metric of a particular endpoint (here /actuator/metrics/http.server.requests?tag=uri:/user/asset/getAllAssets).
For the general metric at actuator/metrics/http.server.requests:
As you can see from Request 1 and Request 2 (in the question), MAX has decreased (from 3.0002913 to 2.3612968). That may be because the MAX for some endpoint was set back to 0 by the timer. In my view, MAX for /http.server.requests behaves the same as for a particular endpoint (but I am not sure about that and am still investigating).
Source: https://stackoverflow.com/questions/57247185/spring-boot-actuator-max-property