Prometheus (2): AlertManager Alerting

Submitted by 喜你入骨 on 2019-11-27 13:08:52

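Alerting: Prometheus sends the abnormal events it detects to AlertManager. It does not mean sending email notifications.
Notification: AlertManager sends out notifications about those events (email, webhook, etc.). This includes silencing and inhibition; after grouping the alerts, AlertManager delivers the messages via email, PagerDuty, HipChat, Slack, and similar channels.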
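Inhibition is configured in the AlertManager configuration file, while silences are created at runtime (for example in the web UI). The fragment below is a minimal sketch of an inhibition rule, not part of the original setup: it assumes alerts carry `severity` labels with the values `critical` and `warning`, which this article's rules do not define (they use `severity: page`).

```yaml
# Sketch only: suppress "warning" notifications while a matching
# "critical" alert is firing for the same alertname and instance.
# The severity values here are assumptions for illustration.
inhibit_rules:
- source_match:
    severity: 'critical'      # the alert that does the muting
  target_match:
    severity: 'warning'       # the alert that gets muted
  equal: ['alertname', 'instance']   # both alerts must share these labels
```

Newer AlertManager releases also accept `source_matchers` and `target_matchers` in place of `source_match` and `target_match`.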

Configure AlertManager: set up the notification method

#alert-cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: alertmanager-config
  namespace: kube-system
data:
  config.yml: |-
    global:
      smtp_smarthost: 'smtp.163.com:25'    #SMTP server; this one is for 163 mail
      smtp_from: 'username@163.com'
      smtp_auth_username: 'username@163.com'
      smtp_auth_password: "password"     #mailbox password or client authorization code
      smtp_require_tls: false
    route:
      group_by: [alertname]
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 10m
      receiver: default-receiver
    receivers:
    - name: 'default-receiver'
      email_configs:
      - to: '*************'
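In the `route` section, `group_wait` is how long AlertManager waits before sending the first notification for a new group, `group_interval` is the wait before notifying about new alerts added to an existing group, and `repeat_interval` is the wait before re-sending a notification that is still firing. As a hedged sketch of how the same file could send some alerts elsewhere, the fragment below adds a child route and a webhook receiver. The receiver name `webhook-receiver` and its URL are hypothetical and not part of the original setup.

```yaml
# Sketch of an extended route/receivers section for config.yml.
route:
  group_by: [alertname]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 10m
  receiver: default-receiver          # fallback when no child route matches
  routes:
  - match:
      severity: page                  # label set by the alerting rule
    receiver: webhook-receiver
receivers:
- name: 'default-receiver'
  email_configs:
  - to: '*************'
- name: 'webhook-receiver'            # hypothetical receiver
  webhook_configs:
  - url: 'http://example-webhook.kube-system.svc:8080/alert'   # hypothetical endpoint
```

Newer AlertManager releases prefer `matchers` over `match` in routes; `match` is shown here because it also works on older versions.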

Install AlertManager

#alert-de.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: alertmanager-deployment
  name: alertmanager
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: prom/alertmanager
        imagePullPolicy: IfNotPresent
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        args:
        - "--config.file=/etc/alertmanager/config.yml"  #path to the alertmanager configuration file
        - "--storage.path=/alertmanager/data"   #path where data is stored
        - "--cluster.listen-address=$(POD_IP):6783"
        ports:
        - containerPort: 9093
          name: http
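        volumeMounts:
        - mountPath: "/etc/alertmanager"
          name: alertcfg
        - mountPath: "/alertmanager"      #mount the "data" volume declared below so --storage.path lives on it
          name: data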
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 100m
            memory: 256Mi
      serviceAccountName: prometheus               #reuses the prometheus service account (see the Prometheus installation document)
      volumes:
      - name: alertcfg
        configMap:
          name: alertmanager-config
      - name: data
        emptyDir: {}
#alert-svc.yaml
#Service that exposes the port
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: alertmanager
  name: alertmanager
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 9093
    targetPort: 9093
    nodePort: 31000
  selector:
    app: alertmanager
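The Deployment above does not define health checks. AlertManager serves `/-/healthy` and `/-/ready` on its HTTP port, so probes could be added to the `alertmanager` container roughly as follows. This is an optional sketch; the delay and period values are arbitrary assumptions.

```yaml
# Fragment to place under the "alertmanager" container in alert-de.yaml.
        readinessProbe:
          httpGet:
            path: /-/ready        # ready to serve traffic
            port: 9093
          initialDelaySeconds: 5  # assumed value
          periodSeconds: 10       # assumed value
        livenessProbe:
          httpGet:
            path: /-/healthy      # process is alive
            port: 9093
          initialDelaySeconds: 10 # assumed value
          periodSeconds: 10       # assumed value
```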

Configure Prometheus to communicate with AlertManager (add to prome-cm.yaml from the Prometheus setup)

    rule_files:
    - /etc/prometheus/rules.yml
    alerting:
      alertmanagers:
      - static_configs:
        - targets: ["SVC_IP:9093"]    #cluster IP of the Service uses port 9093; NodePort 31000 applies to node IPs only
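Since Prometheus runs inside the same cluster, it could also reach AlertManager through the Service's DNS name instead of a hard-coded IP. The sketch below assumes the Service `alertmanager` in namespace `kube-system` defined earlier and a working cluster DNS.

```yaml
# Alternative alerting section using the in-cluster Service name.
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["alertmanager.kube-system.svc:9093"]   # Service port, not the NodePort
```

This keeps the configuration valid if the Service is recreated and receives a different cluster IP.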

Create alerting rules in Prometheus (add to prome-cm.yaml from the Prometheus setup)

  rules.yml: |
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
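In the rule above, `for: 5m` means the alert stays pending until `up == 0` has held continuously for five minutes, and only then fires and is sent to AlertManager. As a further illustration, a job-level rule could be written as below. The group name, alert name, threshold, and duration are made up for this sketch and are not part of the original configuration.

```yaml
# Illustrative extra rule group: fires when fewer than half of a job's
# targets have been up for 10 minutes.
groups:
- name: example-extra            # hypothetical group name
  rules:
  - alert: JobMostlyDown         # hypothetical alert name
    expr: avg by (job) (up) < 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: "Job {{ $labels.job }} has fewer than half of its instances up"
```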

Create

kubectl create -f   alert-cm.yaml
kubectl create -f   alert-de.yaml
kubectl create -f   alert-svc.yaml
#prometheus 
kubectl apply -f prome-cm.yaml
#delete the Prometheus pod so that it is recreated with the updated configuration

Access the web page: http://node_IP:31000
[Screenshot: AlertManager web page]
The email alert looks like this:
[Screenshot: alert email]
