Is there a way to monitor the pod status and restart count of pods running in a GKE cluster with Stackdriver?
While I can see CPU, memory and disk usage metrics for all
In my cluster (a bare-metal k8s cluster),I use kube-state-metrics https://github.com/kubernetes/kube-state-metrics to do what you want. This project belongs to kubernetes repo and it is quite easy to use. Once deployed u can use kube_pod_container_status_restarts this metrics to know if a container restarts
Remember that, you can always raise feature request if the options available are not enough.
You can achieve this manually with the following:
In Logs Viewer, creating the following filter:
resource.labels.project_id="<PROJECT_ID>"
resource.labels.cluster_name="<CLUSTER_NAME>"
resource.labels.namespace_name="<NAMESPACE, or default>"
jsonPayload.message:"failed liveness probe"
Create a metric by clicking on the Create Metric button above the filter input and filling up the details.
You may now track this metric in Stackdriver.
Would be happy to be informed of a built-in metric instead of this.
There is a built in metric now, so it's easy to dashboard and/or alert on it without setting up custom metrics
Metric: kubernetes.io/container/restart_count
Resource type: k8s_container