The reduce phase fails with "Task attempt failed to report status for 600 seconds. Killing!" Solution?

伪装坚强ぢ 2021-02-06 11:10

The reduce phase of the job fails with:

# of failed Reduce Tasks exceeded allowed limit.

Each individual task attempt fails with:

Task attempt_201301251556_163

2 answers
  •  -上瘾入骨i
    2021-02-06 11:14

    The timeouts are most likely caused by a long-running computation in your reducer that never reports progress back to the Hadoop framework. This can be resolved in a few ways:

    I. Increasing the timeout in mapred-site.xml:

    
    <property>
      <name>mapred.task.timeout</name>
      <value>1200000</value>
    </property>

    The default is 600000 ms = 600 seconds.
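
    If you would rather not change the cluster-wide default in mapred-site.xml, the same property can be overridden for a single job from the command line, assuming your driver implements `Tool` and is launched through `ToolRunner` so that generic options like `-D` are parsed. The jar and class names below are placeholders:

    ```shell
    # Per-job override of the task timeout (20 minutes, in milliseconds).
    # myjob.jar and com.example.MyDriver are hypothetical names for your job.
    # In newer Hadoop versions the property is named mapreduce.task.timeout.
    hadoop jar myjob.jar com.example.MyDriver \
      -D mapred.task.timeout=1200000 \
      input/ output/
    ```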

    II. Reporting progress every x records as in the Reducer example in javadoc:

    public void reduce(K key, Iterator<V> values,
                       OutputCollector<K, V> output,
                       Reporter reporter) throws IOException {
       int noValues = 0;
       while (values.hasNext()) {
          V value = values.next();
          noValues++;
    
          // process the value ...
    
          // report progress every 10 records so the task is not killed
          if ((noValues % 10) == 0) {
             reporter.progress();
          }
       }
    }
    

    Optionally, you can increment a custom counter, which also counts as progress, as in the example:

    // NUM_RECORDS is an enum constant naming the counter
    reporter.incrCounter(NUM_RECORDS, 1);
    
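
    The report-every-N-records pattern above can be illustrated without a Hadoop cluster. The sketch below is plain Java, not real Hadoop code: `Progressable` here is a hypothetical stand-in for the single `Reporter.progress()` call the reducer needs, and `processWithHeartbeat` mimics pinging the framework every N records so its watchdog (like Hadoop's 600-second task timeout) is never tripped:

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical stand-in for the only part of Hadoop's Reporter used here.
    interface Progressable {
        void progress();
    }

    public class ProgressDemo {
        // Process records, pinging the reporter every `interval` records.
        public static int processWithHeartbeat(List<String> records, int interval,
                                               Progressable reporter) {
            int noValues = 0;
            for (String record : records) {
                noValues++;
                // ... expensive per-record work would go here ...
                if (noValues % interval == 0) {
                    reporter.progress();   // heartbeat to the framework
                }
            }
            return noValues;
        }

        public static void main(String[] args) {
            List<String> records = new ArrayList<>();
            for (int i = 0; i < 100; i++) records.add("rec" + i);

            AtomicInteger pings = new AtomicInteger();
            int processed = processWithHeartbeat(records, 10, pings::incrementAndGet);

            System.out.println(processed + " records, " + pings.get() + " progress calls");
        }
    }
    ```

    Running it prints `100 records, 10 progress calls`: one heartbeat per ten records, regardless of how long each record takes to process.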
