AWS EMR Error : All slaves in the job flow were terminated

Submitted by 折月煮酒 on 2020-07-10 06:37:33

Question


I am using the Elastic MapReduce infrastructure on Amazon AWS. A jobflow was terminated automatically. The last state change reason according to the Amazon console is: "All slaves in the job flow were terminated".

Create jobflow command :

elastic-mapreduce --create --name MyCluster --alive --instance-group master --instance-type m1.xlarge --instance-count 1 --bid-price 2.0 --instance-group core --instance-type m1.xlarge --instance-count 10 --bid-price 2.0 --hive-interactive  --enable-debugging

Details about jobflow: [image]

Last few lines of log ...

Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201310231204_0099, Tracking URL = http://ip-10-197-16-105.us-west-1.compute.internal:9100/jobdetails.jsp?jobid=job_201310231204_0099
Kill Command = /home/hadoop/bin/hadoop job  -Dmapred.job.tracker=10.197.16.105:9001 -kill job_201310231204_0099
2013-10-23 14:11:38,618 Stage-1 map = 0%,  reduce = 0%
2013-10-23 14:11:48,741 Stage-1 map = 100%,  reduce = 0%

As the log above shows, no error is reported.

What I think the reason is

I think this happened because of a sudden increase in the price of spot instances. More details in my answer below.


Answer 1:


Here I am answering my own question.

I think this happened because of a sudden increase in the price of spot instances. My bid price was $2 per instance per hour for an m1.xlarge instance.

[Image: snapshot of AWS console spot instance pricing]

Notice the blue spikes in the pricing chart. My bid was $2 and the actual spot price jumped to $11, above my bid, so my cluster was terminated automatically.
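
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of the original post: the function name `first_outbid`, the timestamps, and the lower prices are hypothetical; only the $2 bid and the $11 peak come from the answer.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (timestamp, spot price in USD per instance-hour)
PricePoint = Tuple[datetime, float]


def first_outbid(history: List[PricePoint], bid: float) -> Optional[PricePoint]:
    """Return the earliest point at which the spot price exceeded the bid, or None."""
    for point in sorted(history, key=lambda p: p[0]):
        if point[1] > bid:
            return point
    return None


# Hypothetical price history; only the 11.00 peak is taken from the post.
history = [
    (datetime(2013, 10, 23, 13, 0), 0.35),
    (datetime(2013, 10, 23, 14, 0), 11.00),
    (datetime(2013, 10, 23, 15, 0), 0.40),
]

outbid = first_outbid(history, bid=2.0)
if outbid is not None:
    print(f"Outbid at {outbid[0]:%Y-%m-%d %H:%M}: spot ${outbid[1]:.2f} > bid $2.00")
```

Running the same check with a higher bid, or against the price history of a different instance type, gives a rough idea of how often a spot cluster with that bid would have been interrupted.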



Source: https://stackoverflow.com/questions/19550281/aws-emr-error-all-slaves-in-the-job-flow-were-terminated
