Why does a Spark job fail to write output?

小鲜肉 2021-01-12 14:38

Setup:

I have a Spark job running on a distributed Spark Cluster with 10 nodes. I am doing some text file processing on HDFS. The job runs fine, until the last ste

2 answers
  •  傲寒 (original poster)
    2021-01-12 15:08

    I had the same issue. It turned out that my Spark worker was running as root while my job was running as a different user. When saveAsTextFile is called, the Spark worker first writes the data to a temporary location on disk as root. The Spark job, running as the other user, then tries to move that root-owned temporary data to its final location and fails with a permission error.
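
One way to check for this kind of mismatch is to compare the owner of a directory with the user running the current process. The following is a minimal sketch using only the Python standard library. It assumes a POSIX system and a path on the local filesystem; the function name `owned_by_other_user` is made up for illustration and is not part of Spark. For paths on HDFS, the owner would have to be inspected another way, for example with `hdfs dfs -ls`.

```python
import os


def owned_by_other_user(path):
    """Return True if `path` is owned by a different user than this process.

    Hypothetical helper for illustration: it compares the numeric owner
    uid of `path` with the effective uid of the current process. It only
    inspects the local filesystem, not HDFS.
    """
    owner_uid = os.stat(path).st_uid   # uid that owns the file or directory
    process_uid = os.geteuid()         # effective uid of the running process
    return owner_uid != process_uid
```

Calling `owned_by_other_user` on the job's temporary output directory, from a process started as the job's user, would return True if that directory was created by a worker running as root. Running the workers and the job as the same user avoids the mismatch described above.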
