How is container failure handled for a YARN MapReduce job?

Submitted by 穿精又带淫゛_ on 2019-12-13 13:50:24

Question


How are software/hardware failures handled in YARN? Specifically, what happens in case of container(s) failure/crash?


Answer 1:


  • Container and task failures are detected by the node-manager and handled by the application-master. When a container fails or dies, the node-manager reports the failure to the resource-manager, which passes it on to the application-master; the application-master then requests a new container and restarts the task there.
  • In the event of application-master failure, the resource-manager detects the failure and starts a new instance of the application-master in a new container.
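
Failure detection of this kind is usually driven by periodic heartbeats: a component that stays silent for longer than an expiry interval is treated as dead. The sketch below is a toy model in Python, not YARN code. The names `LivenessMonitor` and `restart_application_master` are hypothetical, and the values 600000 ms and 2 are assumed to mirror the defaults of `yarn.am.liveness-monitor.expiry-interval-ms` and `yarn.resourcemanager.am.max-attempts`; check them against your cluster's configuration.

```python
from dataclasses import dataclass, field


@dataclass
class LivenessMonitor:
    """Toy model of heartbeat-based failure detection (hypothetical, not YARN code)."""

    # Assumed to mirror the 10-minute default expiry interval.
    expiry_interval_ms: int = 600_000
    last_heartbeat_ms: dict = field(default_factory=dict)

    def heartbeat(self, app_id: str, now_ms: int) -> None:
        # Record the time of the most recent heartbeat from this application-master.
        self.last_heartbeat_ms[app_id] = now_ms

    def expired(self, now_ms: int) -> list:
        # An application-master is considered failed once it has been silent
        # for longer than the expiry interval.
        return sorted(
            app_id
            for app_id, seen in self.last_heartbeat_ms.items()
            if now_ms - seen > self.expiry_interval_ms
        )


def restart_application_master(attempts_so_far: int, max_attempts: int = 2) -> bool:
    """Return True if another application-master attempt may still be started."""
    return attempts_so_far < max_attempts
```

The clock is passed in as `now_ms` rather than read from the system, which keeps the sketch deterministic and easy to test.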





Answer 2:


  • The application master re-attempts tasks that complete with an exception or stop responding (up to 4 attempts by default). A job with too many failed tasks is considered a failed job.
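
The retry rule can be sketched as a small Python model. This is an illustration under assumptions, not the MapReduce implementation: `run_task_with_retries` and `job_failed` are hypothetical names, the limit of 4 is assumed to correspond to `mapreduce.map.maxattempts` / `mapreduce.reduce.maxattempts`, and the failure threshold to `mapreduce.map.failures.maxpercent` / `mapreduce.reduce.failures.maxpercent` (assumed default 0, meaning any failed task fails the job). Unresponsive tasks, which the real system detects with a timeout, are not modeled here.

```python
from typing import Callable, List


def run_task_with_retries(task: Callable[[int], None], max_attempts: int = 4) -> bool:
    """Run `task` until one attempt succeeds or `max_attempts` attempts have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            task(attempt)   # an exception marks this attempt as failed
            return True
        except Exception:
            continue        # schedule the next attempt
    return False            # every attempt failed, so the task is failed


def job_failed(task_results: List[bool], max_failures_percent: int = 0) -> bool:
    """Return True if the share of failed tasks exceeds the tolerated percentage."""
    failed = sum(1 for ok in task_results if not ok)
    return failed * 100 > max_failures_percent * len(task_results)


def flaky(attempt: int) -> None:
    # Hypothetical task that fails on its first two attempts.
    if attempt < 3:
        raise RuntimeError("simulated task failure")


def always_failing(attempt: int) -> None:
    # Hypothetical task that never succeeds.
    raise RuntimeError("simulated task failure")
```

With these definitions, `flaky` succeeds on its third attempt, while `always_failing` exhausts all four attempts and would cause `job_failed` to report a failed job under the assumed default threshold.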


Source: https://stackoverflow.com/questions/30694747/how-container-failure-is-handled-for-a-yarn-mapreduce-job
