How to submit code to a remote Spark cluster from IntelliJ IDEA

天命终不由人  2021-01-18 07:19

I have two clusters: one in a local virtual machine and the other in a remote cloud. Both clusters run in standalone mode.

My Environment:

Scala: 2.10.4
Spark: 1.5         
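
For reference, a minimal sbt build matching this environment might look like the sketch below; the project name is a placeholder, and Spark 1.5.x artifacts are published for Scala 2.10:

    // build.sbt -- illustrative only; versions taken from the environment listed above
    name := "spark-remote-submit"   // placeholder project name

    scalaVersion := "2.10.4"

    // spark-core 1.5.x is cross-built for Scala 2.10, matching the cluster version
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"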


        
1 Answer
  •  挽巷 (original poster)  2021-01-18 07:58

    Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the very least there is a variety of environment settings and considerations - handled by the spark-submit script - that are difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads in the Spark developer community on the topic.
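
    That said, a common way to run against a remote standalone cluster directly from the IDE - offered here only as a rough sketch, with a placeholder master URL and jar path - is to point a SparkConf at the master and ship the project jar yourself:

        import org.apache.spark.{SparkConf, SparkContext}

        object RemoteSubmitSketch {
          def main(args: Array[String]): Unit = {
            val conf = new SparkConf()
              .setAppName("ij-remote-submit")
              // Placeholder: the standalone master of the remote cluster
              .setMaster("spark://your-master-host:7077")
              // Placeholder: the jar built from the IntelliJ project, so that
              // executors can load your classes
              .setJars(Seq("target/scala-2.10/myapp_2.10-0.1.jar"))

            val sc = new SparkContext(conf)
            // Simple sanity check that tasks actually run on the cluster
            println(sc.parallelize(1 to 100).sum())
            sc.stop()
          }
        }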

    My answer here addresses only a portion of your post, specifically this error:

    TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

    The reason is typically a mismatch between the memory and/or number of cores your job requests and what is available on the cluster. Possibly, when submitting from IntelliJ, the settings in

    $SPARK_HOME/conf/spark-defaults.conf

    did not properly match the parameters required for your task on the existing cluster. You may need to update:

    spark.driver.memory   4g
    spark.executor.memory   8g
    spark.executor.cores  8
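
    If the application is launched straight from IntelliJ rather than through the spark-submit script, these values may also need to be set on the SparkConf itself; here is a sketch using the same example values as above:

        import org.apache.spark.SparkConf

        object ResourceSettingsSketch {
          // Same example values as in spark-defaults.conf above; adjust them to
          // what your workers actually advertise in the master UI.
          val conf = new SparkConf()
            .set("spark.executor.memory", "8g") // must not exceed a worker's memory
            .set("spark.executor.cores", "8")   // must not exceed a worker's cores

          // spark.driver.memory cannot resize a JVM that is already running, so when
          // launching from the IDE set the driver heap with -Xmx in the run
          // configuration instead of relying on that property.
        }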
    

    You can check the Spark master UI on port 8080 to verify that the parameters you requested are actually available on the cluster.
