I have a Google Kubernetes Engine (GKE) cluster with preemptible instances, TPU support, and one container per node.
About twice per container per day I get