Container killed by YARN for exceeding memory limits


Question


I am creating a cluster in google dataproc with the following characteristics:

Master          Standard (1 master, N workers)
  Machine type       n1-highmem-2 (2 vCPU, 13.0 GB memory)
  Primary disk size  250 GB

Worker nodes    2
  Machine type       n1-highmem-2 (2 vCPU, 13.0 GB memory)
  Primary disk size  250 GB

I am also adding the .sh file from this repository as an initialization action in order to use Zeppelin.

The code that I use works fine with a small amount of data, but when I use a bigger amount I get the following error:

Container killed by YARN for exceeding memory limits. 4.0 GB of 4 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

I have seen posts such as this one: Container killed by YARN for exceeding memory... where it is recommended to change yarn.nodemanager.vmem-check-enabled to false.

I am a bit confused, though. Do all these configurations get applied when I initialize the cluster, or not?

Also, where exactly is yarn-site.xml located? I am unable to find it on the master (I can't find it in /usr/lib/zeppelin/conf/, /usr/lib/spark/conf, or /usr/lib/hadoop-yar/) in order to change it. And if I change it, what do I need to 'restart'?


Answer 1:


Igor is correct: the easiest thing to do is to create the cluster with any additional properties specified up front, so they are set before the services start.

However, it's a little scary to entirely disable YARN checking that containers stay within their bounds. Either way, your VM will eventually run out of memory.

The error message is correct -- you should try bumping up spark.yarn.executor.memoryOverhead. It defaults to max(384m, 0.1 * spark.executor.memory). On an n1-highmem-2 that ends up being 384m, since 10% of spark.executor.memory=3712m is only about 371m. You can set this value when creating a cluster by using --properties spark:spark.yarn.executor.memoryOverhead=512m.
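
For example, a cluster-creation command along the following lines would bake the override in at startup. This is only a rough sketch: the cluster name, the bucket path for the Zeppelin initialization action, and the 512m value are placeholders, not something taken from the original question.

  # placeholders: my-cluster, gs://my-bucket/zeppelin.sh; pick an overhead that fits your job
  gcloud dataproc clusters create my-cluster \
      --master-machine-type n1-highmem-2 \
      --worker-machine-type n1-highmem-2 \
      --num-workers 2 \
      --initialization-actions gs://my-bucket/zeppelin.sh \
      --properties spark:spark.yarn.executor.memoryOverhead=512m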

If I understand correctly, the JVM and Spark try to be intelligent about keeping memory usage within spark.executor.memory - memoryOverhead. However, the python interpreter (where your pyspark code actually runs) is outside their accounting, and instead falls under memoryOverhead. If you are using a lot of memory in the python process, you will need to increase memoryOverhead.
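
If you'd rather not change the cluster-wide default, the same property can also be raised for an individual pyspark job at submission time. A sketch, with the job file, cluster name, and 1024m value made up purely for illustration:

  # hypothetical job and cluster names; adjust the overhead to your python memory usage
  gcloud dataproc jobs submit pyspark my_job.py \
      --cluster my-cluster \
      --properties spark.yarn.executor.memoryOverhead=1024m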

Here are some resources on pyspark and Spark's memory management:

  • How does Spark running on YARN account for Python memory usage?
  • https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application.html
  • http://spark.apache.org/docs/latest/tuning.html#memory-management-overview


Source: https://stackoverflow.com/questions/50587413/container-killed-by-yarn-for-exceeding-memory-limits
