Monitoring and alerting on pod status or restart with Google Container Engine (GKE) and Stackdriver

伪装坚强ぢ 2021-02-05 09:53

Is there a way to monitor the pod status and restart count of pods running in a GKE cluster with Stackdriver?

While I can see CPU, memory and disk usage metrics for all

4 Answers
  • 2021-02-05 10:14

    In my cluster (a bare-metal k8s cluster), I use kube-state-metrics (https://github.com/kubernetes/kube-state-metrics) to do what you want. The project lives under the kubernetes organization on GitHub and is quite easy to use. Once it is deployed, you can use the kube_pod_container_status_restarts metric (exposed as kube_pod_container_status_restarts_total in newer releases) to tell whether a container has restarted.
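    As a rough illustration of reading that metric, the sketch below parses Prometheus-style text output and lists containers whose restart count has reached a threshold. The sample text, the label names, and the helper name restarted_containers are assumptions made for this example, not output captured from a real cluster.

```python
import re

# Hypothetical sample of what a kube-state-metrics scrape might contain.
SAMPLE = """\
# HELP kube_pod_container_status_restarts Number of container restarts.
# TYPE kube_pod_container_status_restarts counter
kube_pod_container_status_restarts{namespace="default",pod="web-1",container="app"} 3
kube_pod_container_status_restarts{namespace="default",pod="web-2",container="app"} 0
kube_pod_container_status_restarts{namespace="kube-system",pod="dns-1",container="dns"} 1
"""

# Matches `name{labels} value`; lines carrying a trailing timestamp are not handled.
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def restarted_containers(text, metric="kube_pod_container_status_restarts", threshold=1):
    """Return (namespace, pod, container, restarts) for counts >= threshold."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        count = int(float(match.group("value")))
        if count >= threshold:
            hits.append(
                (labels.get("namespace"), labels.get("pod"), labels.get("container"), count)
            )
    return hits


if __name__ == "__main__":
    for namespace, pod, container, count in restarted_containers(SAMPLE):
        print(f"{namespace}/{pod} [{container}] restarted {count} time(s)")
```

    In practice you would more likely scrape the endpoint with Prometheus and alert on the counter increasing, but the parsing above shows what the metric looks like on the wire.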

  • 2021-02-05 10:28

    Remember that you can always raise a feature request if the available options are not enough.

  • 2021-02-05 10:29

    You can achieve this manually with the following:

    1. In Logs Viewer, create the following filter:

      resource.labels.project_id="<PROJECT_ID>"
      resource.labels.cluster_name="<CLUSTER_NAME>"
      resource.labels.namespace_name="<NAMESPACE, or default>"
      jsonPayload.message:"failed liveness probe"
      
    2. Create a metric by clicking the Create Metric button above the filter input and filling in the details.

    3. You may now track this metric in Stackdriver.

    I would be happy to hear of a built-in metric that could replace this.
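    If you need the same filter for several clusters or namespaces, it can be generated rather than typed by hand. The following is a minimal sketch; the function name liveness_failure_filter and the example argument values are hypothetical, and it only builds the filter text shown in step 1 (one comparison per line, which the logging filter language treats as an implicit AND).

```python
def liveness_failure_filter(project_id, cluster_name, namespace="default"):
    """Build the log filter from step 1 for a given project, cluster and namespace."""
    for value in (project_id, cluster_name, namespace):
        if '"' in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError(f"unexpected double quote in {value!r}")
    clauses = [
        f'resource.labels.project_id="{project_id}"',
        f'resource.labels.cluster_name="{cluster_name}"',
        f'resource.labels.namespace_name="{namespace}"',
        'jsonPayload.message:"failed liveness probe"',
    ]
    return "\n".join(clauses)


if __name__ == "__main__":
    # "my-project" and "my-cluster" are placeholder names.
    print(liveness_failure_filter("my-project", "my-cluster"))
```

    The printed text can be pasted into the filter input before creating the metric in step 2.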

  • 2021-02-05 10:29

    There is a built-in metric now, so it is easy to build a dashboard and/or alert on it without setting up custom metrics:

    Metric: kubernetes.io/container/restart_count
    Resource type: k8s_container
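
    To make this concrete, here is a small sketch that composes a monitoring filter string for this metric and resource type, and turns a series of restart counts into per-interval increases. The helper names are hypothetical, the optional cluster_name and namespace_name labels are assumed to be labels of the k8s_container resource, and treating the metric as a cumulative counter is an assumption of this example.

```python
METRIC_TYPE = "kubernetes.io/container/restart_count"
RESOURCE_TYPE = "k8s_container"


def restart_count_filter(cluster_name=None, namespace_name=None):
    """Compose a monitoring filter selecting the built-in restart metric."""
    parts = [
        f'metric.type = "{METRIC_TYPE}"',
        f'resource.type = "{RESOURCE_TYPE}"',
    ]
    # Optional narrowing by resource labels (assumed label names).
    if cluster_name:
        parts.append(f'resource.labels.cluster_name = "{cluster_name}"')
    if namespace_name:
        parts.append(f'resource.labels.namespace_name = "{namespace_name}"')
    return " AND ".join(parts)


def new_restarts(samples):
    """Given cumulative counts ordered oldest to newest, return the increase per interval.

    A drop in the count (for example after the container is recreated) is reported as 0.
    """
    return [max(later - earlier, 0) for earlier, later in zip(samples, samples[1:])]


if __name__ == "__main__":
    print(restart_count_filter(cluster_name="my-cluster"))  # placeholder cluster name
    print(new_restarts([0, 0, 1, 1, 3]))
```

    An alert condition would typically fire when one of those per-interval increases is greater than zero.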
    