I have a Spark job running on a distributed Spark Cluster with 10 nodes. I am doing some text file processing on HDFS. The job runs fine, until the last ste
I had the same issue. It turned out that my Spark worker was running as the root user while my job was running as a different user. When saveAsTextFile is called, the Spark worker first saves the data to a temporary location on disk as root. The Spark job, running as the other user, then tries to move the root-owned temporary data to its final location and fails with a permission error.
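A quick way to check for this kind of mismatch is to compare the user your process runs as with the owner of the directory it has to move. The snippet below is only a minimal sketch for a local (POSIX) filesystem path; the helper names `owner_name`, `current_user` and `owned_by_other_user` are made up for illustration and are not part of Spark. For paths on HDFS, the owner shows up in the output of `hdfs dfs -ls` instead.

```python
import os
import pwd
import tempfile


def _name_for_uid(uid):
    """Translate a numeric uid to a user name, falling back to the number."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def owner_name(path):
    """Return the name of the user that owns `path` (hypothetical helper)."""
    return _name_for_uid(os.stat(path).st_uid)


def current_user():
    """Return the name of the user this process effectively runs as."""
    return _name_for_uid(os.geteuid())


def owned_by_other_user(path):
    """True if `path` is owned by someone other than the current user."""
    return os.stat(path).st_uid != os.geteuid()


if __name__ == "__main__":
    # Stand-in for the temporary output directory; substitute the real path.
    with tempfile.TemporaryDirectory() as tmp_dir:
        print("running as:", current_user())
        print("directory owner:", owner_name(tmp_dir))
        print("owner mismatch:", owned_by_other_user(tmp_dir))
```

If the check reports a mismatch for the temporary output directory, running the worker and the job as the same user should avoid the problem described above.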