Question
When I submit a Spark job using spark-submit with master yarn and deploy-mode cluster, it doesn't print/return any applicationId, and once the job is completed I have to manually check the MapReduce JobHistory or the Spark HistoryServer to get the job details.
My cluster is used by many users, and it takes a lot of time to spot my job in the JobHistory/HistoryServer.
Is there any way to configure spark-submit to return the applicationId?
Note: I found many similar questions, but their solutions retrieve the applicationId within the driver code using sparkContext.applicationId. With master yarn and deploy-mode cluster, however, the driver itself runs inside a container on the cluster, so any logs or sysout it prints end up in the remote host's logs, not in my terminal.
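For reference, a minimal sketch of the pattern those other answers describe (Scala; the object and app names are made up):

```scala
import org.apache.spark.sql.SparkSession

object AppIdExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AppIdExample").getOrCreate()

    // Returns the YARN application ID, e.g. application_1496xxxxxxxxx_0001.
    // In yarn + cluster mode this println lands in the driver container's
    // log on the cluster, not in the terminal that ran spark-submit.
    println(s"applicationId = ${spark.sparkContext.applicationId}")

    spark.stop()
  }
}
```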
Answer 1:
Here are the approaches that I used to achieve this:
- Save the applicationId to an HDFS file (suggested by @zhangtong in a comment); a sketch follows this list.
- Send an email alert with the applicationId from the driver.
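A rough sketch of the first approach (Scala), assuming the driver user can write to HDFS; the output directory /tmp/app-ids is made up, so adjust it for your cluster:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

object SaveAppIdToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SaveAppIdToHdfs").getOrCreate()
    val appId = spark.sparkContext.applicationId

    // Write a marker file named after the application ID so it can be found
    // from outside the cluster without digging through the HistoryServer.
    val fs  = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    val out = fs.create(new Path(s"/tmp/app-ids/$appId"), true)
    out.writeUTF(appId)
    out.close()

    // ... rest of the job ...

    spark.stop()
  }
}
```

The email variant is the same idea: grab spark.sparkContext.applicationId early in the driver and pass it to whatever mail client your cluster allows, instead of writing it to HDFS.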
Source: https://stackoverflow.com/questions/44209462/spark-yarn-mode-how-to-get-applicationid-from-spark-submit