Question
I have 2 services A and B which I want to monitor. I also have 2 different notification channels X and Y in the form of receivers
in the AlertManager config file.
I want to notify X if service A goes down and notify Y if service B goes down. How can I achieve this with my configuration?
My AlertManager YAML file is:
route:
  receiver: X
receivers:
- name: X
  email_configs:
- name: Y
  email_configs:
And my alert.rule file is:
groups:
- name: A
  rules:
  - alert: A_down
    expr: expression
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "A is down"
- name: B
  rules:
  - alert: B_down
    expr: expression
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "B is down"
Answer 1:
The config should roughly look like this (not tested):
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
  receiver: 'default-receiver'
  routes:
  - match:
      alertname: A_down
    receiver: X
  - match:
      alertname: B_down
    receiver: Y
The idea is that each route field can have a routes field, where you can put a different configuration that takes effect if the alert's labels match the condition given in match.
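For completeness, the route tree above still needs a matching receivers section: every receiver referenced by a route (including the top-level default-receiver) must be defined there, otherwise Alertmanager rejects the configuration. A minimal sketch, assuming X and Y are email receivers; the addresses and SMTP settings below are placeholders, not part of the original question:

global:
  smtp_smarthost: 'localhost:25'            # placeholder SMTP relay
  smtp_from: 'alertmanager@example.com'     # placeholder sender

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'oncall@example.com'                # placeholder catch-all address
- name: X
  email_configs:
  - to: 'team-x@example.com'                # placeholder; notified when A_down fires
- name: Y
  email_configs:
  - to: 'team-y@example.com'                # placeholder; notified when B_down fires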
Answer 2:
To clarify, the general flow of alert handling in Prometheus (Prometheus and Alertmanager integration) is:
An error matches a configured rule (Rule) -> the alert is routed to a destination (Route) -> an event is triggered (Receiver) -> you get a message in Slack/PagerDuty/Mail/etc.
For example:
if my AWS machine cluster production-a1 is down, I want to trigger an event that notifies my team via PagerDuty and Slack with the relevant error.
There are three important files for configuring alerts on your Prometheus system:
- alertmanager.yml - configures your routes (where the triggered alerts are sent) and receivers (how those alerts are handled)
- rules.yml - contains all the thresholds and rules you define in your system
- prometheus.yml - global configuration that wires your rule files and Alertmanager (the two files above) together
To demonstrate the idea, I'm attaching a dummy example in which I watch for overload on my machine (using a node exporter installed on it). On /var/data/prometheus-stack/alertmanager/alertmanager.yml:
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'JohnDoe@gmail.com'

route:
  receiver: defaultTrigger
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 6h
  routes:
  - match_re:
      service: service_overload
      owner: ATeam
    receiver: pagerDutyTrigger

receivers:
- name: 'pagerDutyTrigger'
  pagerduty_configs:
  - send_resolved: true
    routing_key: <myPagerDutyToken>
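The example above mentions notifying both PagerDuty and Slack; one way to do that is a single receiver that carries both a pagerduty_configs and a slack_configs entry. This is a rough sketch only; the receiver name, webhook URL, and channel are placeholders:

receivers:
- name: 'pagerDutyAndSlackTrigger'
  pagerduty_configs:
  - send_resolved: true
    routing_key: <myPagerDutyToken>
  slack_configs:
  - send_resolved: true
    api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'   # placeholder incoming-webhook URL
    channel: '#alerts'                                         # placeholder channel

The route's receiver field would then point at pagerDutyAndSlackTrigger instead of pagerDutyTrigger.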
Add a rule in /var/data/prometheus-stack/prometheus/yourRuleFile.yml:
groups:
- name: alerts
  rules:
  - alert: service_overload_more_than_5000
    expr: (node_network_receive_bytes_total{job="someJobOrService"} / 1000) >= 5000
    for: 10m
    labels:
      service: service_overload
      severity: pager
      dev_team: myteam
    annotations:
      dev_team: myteam
      priority: Blocker
      identifier: '{{ $labels.name }}'
      description: 'service overflow'
      value: '{{ humanize $value }}%'
On /var/data/prometheus-stack/prometheus/prometheus.yml, add this snippet to integrate Alertmanager:
global:
  ...
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "alertmanager:9093"
rule_files:
  - "yourRuleFile.yml"
...
Note that the key point of this example is the service_overload label, which binds the rule to the right receiver.
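In other words, the label attached by the rule is exactly what the route matches on. Putting the two fragments from the files above side by side:

# yourRuleFile.yml: the firing alert carries this label
labels:
  service: service_overload

# alertmanager.yml: the route selects alerts by that label
- match_re:
    service: service_overload
  receiver: pagerDutyTrigger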
Reload the configuration (restart the service, or stop and start your Docker containers) and test it. If everything is configured correctly, you can view the alerts at http://your-prometheus-url:9090/alerts
Source: https://stackoverflow.com/questions/51485580/prometheus-alertmanager-send-alerts-to-different-clients-based-on-routes