stackdriver-metadata-agent-cluster-level gets OOMKilled

被撕碎了的回忆 2021-02-04 09:50

I updated a GKE cluster from 1.13 to 1.15.9-gke.12. In the process I switched from legacy logging to Stackdriver Kubernetes Engine Monitoring. Now I have the problem that the

2 Answers
  • 2021-02-04 10:10

    I was about to open a support ticket with GCP, but they have this notice:

    Description We are experiencing issue with Fluentd crashlooping in Google Kubernetes Engine where master version is 1.14 or 1.15, when gVisor is enabled. The fix is targeted for a release aiming to begin on 17 April 2020. We will provide more updates as the date gets closer. We will provide an update by Thursday, 2020-04-09 14:30 US/Pacific with current details. We apologize to all who are affected by the disruption.

    Start time April 2, 2020 at 10:58:24 AM GMT-7

    End time

    Steps to reproduce Fluentd crashloops in GKE clusters could lead to missing logs.

    Workaround Upgrade Google Kubernetes Engine cluster masters to version 1.16+.

    Affected products Other

  • 2021-02-04 10:28

    The pod is being OOM-killed because the memory limit set on the metadata-agent deployment is too low: the pod needs more memory than the limit allows in order to work properly.

    There is a workaround for this issue until it is fixed.


    You can overwrite the base resources in the configmap of the metadata-agent with:

    kubectl edit cm -n kube-system metadata-agent-config

    Setting baseMemory: 50Mi should be enough. If it doesn't work, use a higher value such as 100Mi or 200Mi.

    So metadata-agent-config configmap should look something like this:

    apiVersion: v1
    data:
      NannyConfiguration: |-
        apiVersion: nannyconfig/v1alpha1
        kind: NannyConfiguration
        baseMemory: 50Mi
    kind: ConfigMap
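
As an alternative to editing the ConfigMap interactively, the manifest can be rendered from a script. The following is only a sketch: the `render_configmap` function and the `BASE_MEMORY` variable are hypothetical names introduced here, and the `metadata` block is an assumption based on the name and namespace used in the `kubectl edit` command above.

```shell
#!/bin/sh
# Render the metadata-agent-config ConfigMap with a chosen baseMemory value.
# render_configmap and BASE_MEMORY are hypothetical names used for this sketch.
render_configmap() {
  base_memory="${1:-50Mi}"   # default taken from the answer above
  printf '%s\n' \
    'apiVersion: v1' \
    'kind: ConfigMap' \
    'metadata:' \
    '  name: metadata-agent-config' \
    '  namespace: kube-system' \
    'data:' \
    '  NannyConfiguration: |-' \
    '    apiVersion: nannyconfig/v1alpha1' \
    '    kind: NannyConfiguration' \
    "    baseMemory: ${base_memory}"
}

# Print the manifest so it can be reviewed before it touches the cluster.
render_configmap "${BASE_MEMORY:-50Mi}"
```

Once the output looks right, it could be piped to the cluster with `render_configmap 100Mi | kubectl apply -f -`. Compare it with the existing ConfigMap first, since this sketch does not reproduce any labels or annotations that the cluster's own copy may carry.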
    

    Note also that you need to restart the deployment, as the config map doesn't get picked up automatically:

    kubectl delete deployment -n kube-system stackdriver-metadata-agent-cluster-level
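
To check whether the new value took effect, the memory limits of the deployment's containers can be read back. This is a minimal sketch that assumes the cluster recreates the deployment after it is deleted; `show_agent_memory_limits` is a hypothetical helper name, while the deployment name and namespace come from the command above.

```shell
#!/bin/sh
# Print each container of the metadata agent deployment with its memory limit.
# show_agent_memory_limits is a hypothetical helper introduced for this sketch.
show_agent_memory_limits() {
  kubectl get deployment stackdriver-metadata-agent-cluster-level \
    -n kube-system \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}
```

Call `show_agent_memory_limits` once the deployment exists again. If the limit is unchanged, the ConfigMap edit was probably not picked up yet.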

    For more details, see the addon-resizer documentation.
