Question
I currently have some alerts set up to report when subscription/pull_request_count is 0. However, in a similar question about that metric, I found that metrics and alerting break once there is no data, which I believe happens when there are no subscriptions.
My intent is to figure out if my servers have stopped pulling messages. There are 2 scenarios I have in mind where the details are important.
- Even if there are no messages being published, I want to know if I'm no longer pulling from a subscription to make sure things are working properly.
- In the event that a ton of unacknowledged messages are queued up just because I pulled them but didn't ack them (e.g. a partner API was down), I don't want this alert to be triggered
Besides using subscription/pull_request_count as a condition, which won't work when no data is coming in (at least after a while), how can I set up an alert that notifies me that there are no clients pulling from a Pub/Sub subscription?
Answer 1:
As you want to be alerted when there are no pull message operations, you'll have to use the subscription/pull_request_count metric. If, after some time, the metric is dropped instead of reporting 0 pulls, you can use two conditions: is absent for 3 minutes OR is below 1 for 1 minute.
However, the problem here is that the UI filters out all resources and metrics that have been unused for the past 6 weeks. While this makes it much easier to set up alerts and browse metrics for running operations, it requires a different approach to create new alerts before a system is in production. The easiest solution is to make a dummy subscription and pull messages so that the metric appears.
But you can still use the Stackdriver Monitoring API to set them up (I actually tested this with a Spanner metric in a workspace with no instances for the last few months). Keep in mind that the alerting policies API is in Beta so it's subject to non-backwards-compatible changes.
I'd recommend starting by inspecting an already existing policy with projects.alertPolicies/list to see how the AlertPolicy body is constructed.
Then you can set some initial variables:
TOKEN="$(gcloud auth print-access-token)"
PROJECT="$(gcloud config get-value project 2>/dev/null)"
SUBSCRIPTION=PUBSUB_SUBSCRIPTION_ID
CHANNEL=NOTIFICATION_CHANNEL_ID
In my case I am monitoring only a specific Pub/Sub subscription throughout the example, and I already had a notification channel (for my email). As you also have an existing policy, you can get the notification channel ID from it.
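For reference, both lookups mentioned above (an existing policy body and the notification channel ID) can be done with plain GET requests against the same API. This is only a minimal sketch, assuming the TOKEN and PROJECT variables are set as shown; monitoring_list is a hypothetical helper name, not part of any Google tooling.

```shell
# Hypothetical helper: GET a Monitoring API v3 collection for the current project.
# $1 is the collection name, e.g. alertPolicies or notificationChannels.
# Assumes TOKEN and PROJECT are already set as shown above.
monitoring_list() {
  curl -s \
    -H "Authorization: Bearer ${TOKEN}" \
    "https://monitoring.googleapis.com/v3/projects/${PROJECT}/$1"
}

# Example calls (uncomment to run against your project):
# monitoring_list alertPolicies          # inspect how existing AlertPolicy bodies look
# monitoring_list notificationChannels   # find the channel to notify
```

Each notification channel in the response has a name of the form projects/PROJECT_ID/notificationChannels/CHANNEL_ID; the last segment is the value to use for CHANNEL.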
With projects.alertPolicies/create you can create the new alert policy:
curl -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
"https://monitoring.googleapis.com/v3/projects/$PROJECT/alertPolicies" \
-d @alert.json
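where alert.json is the following (replace the variables as needed; curl sends the file as-is, so the shell will not expand $PROJECT, $SUBSCRIPTION or $CHANNEL inside it):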
{
  "displayName": "no-pull-alert",
  "combiner": "OR",
  "conditions": [
    {
      "conditionAbsent": {
        "filter": "metric.type=\"pubsub.googleapis.com/subscription/pull_request_count\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"$PROJECT\" resource.label.\"subscription_id\"=\"$SUBSCRIPTION\"",
        "duration": "180s",
        "trigger": {
          "count": 1
        },
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "perSeriesAligner": "ALIGN_RATE"
          }
        ]
      },
      "displayName": "Pull requests absent for $PROJECT, $SUBSCRIPTION"
    },
    {
      "conditionThreshold": {
        "filter": "metric.type=\"pubsub.googleapis.com/subscription/pull_request_count\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"$PROJECT\" resource.label.\"subscription_id\"=\"$SUBSCRIPTION\"",
        "comparison": "COMPARISON_LT",
        "thresholdValue": 1,
        "duration": "60s",
        "trigger": {
          "count": 1
        },
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "perSeriesAligner": "ALIGN_RATE"
          }
        ]
      },
      "displayName": "Pull requests are 0 for $PROJECT, $SUBSCRIPTION"
    }
  ],
  "documentation": {
    "content": "**ALERT**\n\nNo pull message operations",
    "mimeType": "text/markdown"
  },
  "notificationChannels": [
    "projects/$PROJECT/notificationChannels/$CHANNEL"
  ],
  "enabled": true
}
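Because curl -d @alert.json uploads the file as-is, the $PROJECT, $SUBSCRIPTION and $CHANNEL placeholders have to be filled in before posting. One possible way is sketched below, under the assumption that the JSON above is saved as alert.template.json (a hypothetical file name) and that the three shell variables are set; render_policy is likewise just an illustrative helper.

```shell
# Hypothetical helper: read a policy template on stdin and write it to stdout
# with the literal placeholders $PROJECT, $SUBSCRIPTION and $CHANNEL replaced
# by the values of the shell variables of the same names.
# Assumes the values contain none of the characters |, & or \ (special to sed).
render_policy() {
  : "${PROJECT:?PROJECT is not set}"
  : "${SUBSCRIPTION:?SUBSCRIPTION is not set}"
  : "${CHANNEL:?CHANNEL is not set}"
  sed -e "s|[\$]PROJECT|${PROJECT}|g" \
      -e "s|[\$]SUBSCRIPTION|${SUBSCRIPTION}|g" \
      -e "s|[\$]CHANNEL|${CHANNEL}|g"
}

# Example (alert.template.json is an assumed file name):
# render_policy < alert.template.json > alert.json
```

The rendered alert.json can then be posted with the curl command shown earlier. Editing the values into the file by hand works just as well for a one-off policy.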
Briefly, you don't need to pass policy or condition IDs, as those will be populated by the API. Use OR as the combiner (the policy is violated when ANY condition is met) to trigger the alert when the metric is either absent (conditionAbsent) or below 1 (conditionThreshold). Note that ALIGN_RATE reports a per-second rate, so the threshold is compared against pull requests per second; adjust it if your clients pull infrequently. And, of course, you can modify the parameters, display names, descriptions, etc. to better suit your use case.
Source: https://stackoverflow.com/questions/57505585/how-can-i-get-a-reliable-alert-via-stackdriver-when-there-are-no-clients-pulling