org.apache.spark.SparkException: Job aborted due to stage failure: Task from application

攒了一身酷 2021-02-13 03:43

I have a problem running a Spark application on a standalone cluster (I am using Spark 1.1.0). I successfully started the master server with the command:

bash start-master         


        
2 Answers
  •  死守一世寂寞
    2021-02-13 04:21

    I found a way to run it using an IDE / Maven.

    1. Create a fat JAR (one which includes all dependencies). Use the Maven Shade Plugin for this. Example pom:
    
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.2</version>
        <configuration>
            <filters>
                <filter>
                    <artifact>*:*</artifact>
                    <excludes>
                        <exclude>META-INF/*.SF</exclude>
                        <exclude>META-INF/*.DSA</exclude>
                        <exclude>META-INF/*.RSA</exclude>
                    </excludes>
                </filter>
            </filters>
        </configuration>
        <executions>
            <execution>
                <id>job-driver-jar</id>
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <shadedArtifactAttached>true</shadedArtifactAttached>
                    <shadedClassifierName>driver</shadedClassifierName>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                        <!-- Merge reference.conf files from all dependencies -->
                        <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                            <resource>reference.conf</resource>
                        </transformer>
                        <!-- Set your fully qualified main class here -->
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <mainClass>mainClass</mainClass>
                        </transformer>
                    </transformers>
                </configuration>
            </execution>
            <execution>
                <id>worker-library-jar</id>
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <shadedArtifactAttached>true</shadedArtifactAttached>
                    <shadedClassifierName>worker</shadedClassifierName>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    </transformers>
                </configuration>
            </execution>
        </executions>
    </plugin>
    
    2. Now send the compiled JAR file to the cluster. To do this, specify the jar file in the Spark config like this:

    SparkConf conf = new SparkConf()
        .setAppName("appName")
        .setMaster("spark://machineName:7077")
        .setJars(new String[] {"target/appName-1.0-SNAPSHOT-driver.jar"});

    3. Run mvn clean package to create the jar file. It will be created in your target folder.

    4. Run using your IDE or with the Maven command:

    mvn exec:java -Dexec.mainClass="className"

    This does not require spark-submit. Just remember to package the project before running; a full driver sketch follows below.
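    For illustration, here is a minimal, self-contained driver sketch that ties these steps together. The class name, app name, master host, and jar path are just the placeholders used above, not fixed values:

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class AppName {
        public static void main(String[] args) {
            // Point the driver at the standalone master and ship the shaded "driver" jar
            SparkConf conf = new SparkConf()
                    .setAppName("appName")
                    .setMaster("spark://machineName:7077")
                    .setJars(new String[] {"target/appName-1.0-SNAPSHOT-driver.jar"});

            JavaSparkContext sc = new JavaSparkContext(conf);

            // Trivial job just to confirm tasks actually run on the cluster
            long count = sc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
            System.out.println("count = " + count);

            sc.stop();
        }
    }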

    If you don't want to hardcode the jar path, you can do this:

    1. In the config, write:

    SparkConf conf = new SparkConf()
        .setAppName("appName")
        .setMaster("spark://machineName:7077")
        .setJars(JavaSparkContext.jarOfClass(this.getClass()));

    2. Create the fat JAR (as above), then after running the package command, run it directly with:

    java -jar target/application-1.0-SNAPSHOT-driver.jar

    This will pick up the jar from wherever the class was loaded, so the path does not need to be hardcoded. A sketch of this variant follows.
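    A rough sketch of this variant, again with placeholder names (AppName standing in for your driver class); the only change from the earlier sketch is how the jar location is resolved:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class AppName {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("appName")
                    .setMaster("spark://machineName:7077")
                    // Resolve the jar from wherever this class was loaded, i.e. the
                    // shaded driver jar when the app is started with java -jar
                    .setJars(JavaSparkContext.jarOfClass(AppName.class));

            JavaSparkContext sc = new JavaSparkContext(conf);
            // ... job code ...
            sc.stop();
        }
    }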
