Spark Job submitted - Waiting (TaskSchedulerImpl : Initial job not accepted)

借酒劲吻你 2020-12-06 03:14

An API call was made to submit the job. The response states that it is Running.

On Cluster UI -

Worker (slave) - worker-20160712083825-172.31.17.189-59433

2 Answers
  • 2020-12-06 03:23

    You can take a look at my answer to a similar question, "Apache Spark on Mesos: Initial job has not accepted any resources":

    While most of the other answers focus on resource allocation (cores, memory) on the Spark slaves, I would like to highlight that a firewall can cause exactly the same issue, especially when you are running Spark on a cloud platform.

    If you can find the Spark slaves in the web UI, you have probably opened the standard ports 8080, 8081, 7077, and 4040. Nonetheless, when you actually run a job, it uses SPARK_WORKER_PORT, spark.driver.port and spark.blockManager.port, which by default are randomly assigned. If your firewall is blocking these ports, the master cannot retrieve any job-specific response from the slaves and returns the error.

    You can run a quick test by opening all the ports and seeing whether the slaves accept jobs.
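
    If opening everything confirms a firewall problem, a more durable fix is to pin the normally-random ports to fixed values so your firewall rules can allow them explicitly. A minimal sketch (the port numbers below are arbitrary examples, not recommendations):

    ```
    # spark-defaults.conf: fix the driver- and executor-side ports
    spark.driver.port        40000
    spark.blockManager.port  40010
    spark.port.maxRetries    16    # how many successive ports to try if one is taken

    # spark-env.sh: fix the worker's own port
    export SPARK_WORKER_PORT=40020
    ```

    With the ports pinned, the firewall only needs to allow this small range in addition to the standard UI and master ports.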

  • 2020-12-06 03:35

    I also have the same issue. Below are my observations when it occurs.

    1:17:46 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

    I noticed that it only occurs during the first query from the Scala shell, where I run something that fetches data from HDFS.

    When the problem occurs, the web UI states that there are no running applications.

    URL: spark://spark1:7077
    REST URL: spark://spark1:6066 (cluster mode)
    Alive Workers: 4
    Cores in use: 26 Total, 26 Used
    Memory in use: 52.7 GB Total, 4.0 GB Used
    Applications: 0 Running, 0 Completed
    Drivers: 0 Running, 0 Completed 
    Status: ALIVE
    

    It seems that something fails to start, but I can't tell exactly what it is.

    However, restarting the cluster a second time sets the Applications value to 1 and everything works well.

    URL: spark://spark1:7077
    REST URL: spark://spark1:6066 (cluster mode)
    Alive Workers: 4
    Cores in use: 26 Total, 26 Used
    Memory in use: 52.7 GB Total, 4.0 GB Used
    Applications: 1 Running, 0 Completed
    Drivers: 0 Running, 0 Completed
    Status: ALIVE
    

    I'm still investigating; this quick workaround can save time until a final solution is found.
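
    Before resorting to a restart, it can help to check programmatically whether the master actually sees a registered application. A hedged sketch, assuming a standalone master whose web UI runs on spark1:8080 (the standalone master serves its status as JSON at /json; the hostname is an assumption from the URLs above):

    ```shell
    # Fetch the standalone master's status page as JSON (replace spark1
    # with your master's hostname).
    master_status() {
      curl -s "http://${1:-spark1}:8080/json"
    }

    # Print "<alive workers> <running apps>" from the JSON payload on stdin.
    summarize() {
      python3 -c '
    import json, sys
    s = json.load(sys.stdin)
    alive = sum(1 for w in s.get("workers", []) if w.get("state") == "ALIVE")
    print(alive, len(s.get("activeapps", [])))
    '
    }

    # Usage: master_status spark1 | summarize
    # "4 0" would match the broken state above; "4 1" the healthy one.
    ```

    If the application count stays at 0 after a submit, the driver never registered with the master, which points back at the resource or firewall issues discussed in the other answer.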
