AWS EMR Spark: Error: Cannot load main class from JAR

Posted by 魔方 西西 on 2019-12-10 20:44:07

Question


I am trying to submit a Spark job to an AWS EMR cluster using the AWS console, but it fails with:

Cannot load main class from JAR

The job runs successfully when I specify the main class with --class in the Arguments option under AWS EMR Console -> Add Step.

On the local machine, the job seems to work perfectly fine when no main class is specified as below:

 ./spark-submit /home/astro/spark-programs/SpotEMR/MyJob.jar

I have set the main class in the JAR using the run configuration. The main reason to avoid passing the main class via --class is that I have to run this job in AWS Data Pipeline using EmrActivity. In AWS Data Pipeline, there is currently no way to specify a main class for a job being submitted.
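For reference, when --class is omitted, spark-submit falls back to the Main-Class attribute in the JAR's manifest, so it can help to confirm that the built JAR actually carries that entry. Below is a minimal sketch in Python for checking it. The helper name manifest_main_class is hypothetical and not part of Spark or EMR, and the path is the local one from the command above.

```python
import zipfile
from typing import Optional


def manifest_main_class(jar_path: str) -> Optional[str]:
    """Return the Main-Class value from a JAR's manifest, or None if absent.

    Hypothetical helper for illustration; not part of Spark or EMR.
    """
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            # The archive has no manifest at all.
            return None

    # Manifest lines longer than 72 bytes are wrapped onto continuation
    # lines that start with a single space; unfold them before parsing.
    unfolded = raw.replace("\r\n", "\n").replace("\n ", "")

    for line in unfolded.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None


if __name__ == "__main__":
    # Path taken from the question; adjust to wherever the JAR was built.
    print(manifest_main_class("/home/astro/spark-programs/SpotEMR/MyJob.jar"))
```

If this prints None, the JAR has no Main-Class entry and --class would be needed regardless of where the job runs.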

Any help will be appreciated.


Answer 1:


Actually, you can pass the job's main class with EmrActivity in AWS Data Pipeline.

See https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-emractivity.html for how to launch an EmrActivity using a step, and https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-submit-step.html for how to submit a Spark job using an EMR step with a main class.

The step would look as follows:

command-runner.jar,spark-submit,--class,org.apache.spark.examples.SparkPi
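As a rough sketch of how such a step string could be assembled and placed in an EmrActivity object, the Python below joins the spark-submit arguments with commas. The function name build_spark_step, the S3 path, and the object ids are all assumptions made for illustration; only the step prefix and the main class come from the step above.

```python
from typing import Sequence


def build_spark_step(main_class: str, jar_uri: str,
                     app_args: Sequence[str] = ()) -> str:
    """Join spark-submit arguments into a comma-separated step string.

    Hypothetical helper for illustration; not an AWS API.
    """
    parts = ["command-runner.jar", "spark-submit",
             "--class", main_class, jar_uri, *app_args]
    for part in parts:
        # Commas separate arguments in the step string, so an argument that
        # contains one would be split. This sketch rejects such arguments
        # rather than guess at an escaping rule.
        if "," in part:
            raise ValueError(f"argument contains a comma: {part!r}")
    return ",".join(parts)


# Hypothetical JAR location; replace with the real S3 path of the job.
step = build_spark_step("org.apache.spark.examples.SparkPi",
                        "s3://my-bucket/MyJob.jar")

# Hypothetical pipeline object; the ids are placeholders.
emr_activity = {
    "id": "MySparkActivity",
    "type": "EmrActivity",
    "runsOn": {"ref": "MyEmrCluster"},
    "step": step,
}

print(step)
```

Because the main class travels inside the step string, the JAR manifest does not need to be readable by spark-submit on the cluster.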


Source: https://stackoverflow.com/questions/48407769/aws-emr-spark-error-cannot-load-main-class-from-jar
