How does Apache Spark handle system failure when deployed in YARN?

渐次进展 2021-02-06 07:21

Preconditions

Let's assume Apache Spark is deployed on a Hadoop cluster using YARN, and a Spark job is currently running. How does Spark handle the following failure scenarios?

1. A worker node running Spark tasks fails.
2. An HDFS block that a running task needs to read becomes unavailable.
3. The HDFS namenode fails.
4. The entire HDFS cluster goes down and has to be restarted.

1 Answer
  •  臣服心动
    2021-02-06 07:51

    Here are the answers given on the Spark mailing list to the questions above (the answers were provided by Sandy Ryza of Cloudera):

    1. "Spark will rerun those tasks on a different node."
    2. "After a number of failed task attempts trying to read the block, Spark would pass up whatever error HDFS is returning and fail the job."
    3. "Spark accesses HDFS through the normal HDFS client APIs. Under an HA configuration, these will automatically fail over to the new namenode. If no namenodes are left, the Spark job will fail."
    4. Restarting HDFS is an administrative operation, and "Spark has support for checkpointing to HDFS, so you would be able to go back to the last time checkpoint was called that HDFS was available." (See the checkpointing sketch below.)
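
    As a minimal sketch of how answers 1 and 2 surface in configuration: Spark's spark.task.maxFailures setting bounds how many times a single task may fail before the job as a whole fails. The application name and the value 8 below are illustrative assumptions, not part of Sandy Ryza's answer:

        import org.apache.spark.sql.SparkSession

        object FailureConfigSketch {
          def main(args: Array[String]): Unit = {
            // A failed task is retried on another node (answer 1) up to
            // spark.task.maxFailures times; past that limit, Spark surfaces
            // the underlying error and fails the job (answer 2).
            val spark = SparkSession.builder()
              .appName("failure-config-sketch")      // illustrative name
              .config("spark.task.maxFailures", "8") // default is 4
              .getOrCreate()

            // spark.yarn.maxAppAttempts (whole-application retries on YARN)
            // is read at submit time, so it is normally passed via
            // spark-submit --conf rather than set here.

            // ... run the job ...

            spark.stop()
          }
        }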

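    And a minimal sketch of the checkpointing mentioned in answer 4, assuming a checkpoint directory of hdfs:///tmp/spark-checkpoints (a hypothetical path): once an action materializes the checkpoint, the RDD's lineage is truncated and its data lives in HDFS, so a later run can resume from that point instead of recomputing from scratch:

        import org.apache.spark.sql.SparkSession

        object CheckpointSketch {
          def main(args: Array[String]): Unit = {
            val spark = SparkSession.builder()
              .appName("checkpoint-sketch") // illustrative name
              .getOrCreate()
            val sc = spark.sparkContext

            // Checkpoint data goes to reliable storage (HDFS here);
            // the path is an assumption for illustration.
            sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")

            val processed = sc.parallelize(1 to 1000000).map(_ * 2)
            processed.checkpoint()      // marked; written on the next action
            println(processed.count())  // action triggers compute + checkpoint

            spark.stop()
          }
        }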