How to fix “Task attempt_201104251139_0295_r_000006_0 failed to report status for 600 seconds.”

栀梦 2021-01-31 04:25

I wrote a MapReduce job to extract some information from a dataset of users' movie ratings. There are about 250K users and about 300K movies.

5 Answers
  • 2021-01-31 04:49

    Another easy way is to set the timeout in your job Configuration inside the program:

     // Requires: import org.apache.hadoop.conf.Configuration;
     Configuration conf = new Configuration();
     long milliSeconds = 1000 * 60 * 60; // 1 hour; the default is 600000 (10 minutes), any value can be given
     conf.setLong("mapred.task.timeout", milliSeconds);
    

    Before setting it, check the job file (job.xml) in the JobTracker GUI to confirm whether the correct property name is mapred.task.timeout or mapreduce.task.timeout. While the job is running, check the job file again to confirm that the property has changed to the value you set.
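The job.xml check can also be done on a saved copy of the file. The following is a minimal sketch using only the JDK's XML parser. `TimeoutPropertyFinder` and its methods are hypothetical helpers, not part of Hadoop, and the XML string in `main` is made up; the only assumption about the file is the usual `<configuration>/<property>/<name>/<value>` layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): scans a job.xml-style document
// for either spelling of the task-timeout property.
class TimeoutPropertyFinder {

    static final String OLD_NAME = "mapred.task.timeout";
    static final String NEW_NAME = "mapreduce.task.timeout";

    // Returns the timeout properties found, in document order.
    static Map<String, Long> findTimeouts(String jobXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(jobXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, Long> found = new LinkedHashMap<>();
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String name = text(prop, "name");
                if (OLD_NAME.equals(name) || NEW_NAME.equals(name)) {
                    found.put(name, Long.parseLong(text(prop, "value")));
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable job.xml", e);
        }
    }

    // Text of the first child element with the given tag, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Made-up job.xml content, for illustration only.
        String jobXml = "<configuration>"
                + "<property><name>mapred.reduce.tasks</name><value>8</value></property>"
                + "<property><name>mapreduce.task.timeout</name><value>3600000</value></property>"
                + "</configuration>";
        System.out.println(findTimeouts(jobXml)); // prints {mapreduce.task.timeout=3600000}
    }
}
```

If the map comes back empty, neither name is set explicitly in that file, so the value you set in code did not reach the job.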

  • 2021-01-31 04:57

    The easiest way will be to set this configuration parameter:

    <property>
      <name>mapred.task.timeout</name>
      <value>1800000</value> <!-- 30 minutes -->
    </property>
    

    in mapred-site.xml

  • 2021-01-31 05:01

    From https://issues.apache.org/jira/browse/HADOOP-1763

    The cause might be the following sequence:

    1. The TaskTrackers (TTs) run the maps successfully.
    2. Map outputs are served by Jetty servers on the TTs.
    3. All the reduce tasks connect to all the TTs where maps were run.
    4. Since many reduces want to connect to the map output server, the Jetty servers run out of threads (default 40).
    5. The TaskTrackers continue to send periodic heartbeats to the JobTracker (JT), so they are not considered dead, but their Jetty servers are (temporarily) down.
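Step 4 can be pictured with a toy model: a fixed number of server threads, represented here by semaphore permits, and more simultaneous fetch attempts than threads. This is only a sketch of the mechanism, not how Jetty or Hadoop is implemented; `MapOutputServerModel` and the figure of 100 reducers are made up, and 40 is the default quoted in the list.

```java
import java.util.concurrent.Semaphore;

// Toy model only: a fixed pool of server threads, represented by permits.
class MapOutputServerModel {

    private final Semaphore workerThreads;

    MapOutputServerModel(int threadCount) {
        this.workerThreads = new Semaphore(threadCount);
    }

    // A reducer tries to start a fetch; false means no server thread was free.
    boolean tryStartFetch() {
        return workerThreads.tryAcquire();
    }

    // Called when a fetch finishes and its thread becomes free again.
    void finishFetch() {
        workerThreads.release();
    }

    // Counts how many of `reducers` simultaneous fetches cannot be served.
    static int countRefused(int threadCount, int reducers) {
        MapOutputServerModel server = new MapOutputServerModel(threadCount);
        int refused = 0;
        for (int i = 0; i < reducers; i++) {
            if (!server.tryStartFetch()) {
                refused++;
            }
        }
        return refused;
    }

    public static void main(String[] args) {
        // 40 is the default thread count quoted above; 100 reducers is made up.
        System.out.println(countRefused(40, 100)); // prints 60
    }
}
```

In this model the refused reducers get no map output until a thread is released. That would match the symptom in the title if a reducer waiting on fetches reports no progress for 600 seconds.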
    
  • 2021-01-31 05:05

    If you have a Hive query that is timing out, you can set the above configurations as follows:

    set mapred.tasktracker.expiry.interval=1800000;

    set mapred.task.timeout=1800000;

  • 2021-01-31 05:10

    In newer versions, the parameter has been renamed to mapreduce.task.timeout, as described in this link (search for task.timeout). You can also disable the timeout, as described in the same link:

    The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string. A value of 0 disables the timeout.

    Below is an example setting in mapred-site.xml:

    <property>
      <name>mapreduce.task.timeout</name>
      <value>0</value> <!-- A value of 0 disables the timeout -->
    </property>
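The rule quoted above, including the special meaning of 0, can be expressed as a small model. This is a sketch of the described behaviour, not Hadoop's implementation: `TaskTimeoutModel` is a hypothetical class, and treating silence of exactly the timeout length as expired is an assumption.

```java
// Toy model of the quoted rule; not Hadoop's implementation.
class TaskTimeoutModel {

    private final long timeoutMillis;   // 0 disables the timeout
    private long lastProgressMillis;    // time of the last read, write, or status update

    TaskTimeoutModel(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastProgressMillis = startMillis;
    }

    // Call whenever the task reads input, writes output, or updates its status.
    void reportProgress(long nowMillis) {
        lastProgressMillis = nowMillis;
    }

    // True if the task has been silent for at least the timeout (assumed >=).
    boolean isTimedOut(long nowMillis) {
        if (timeoutMillis == 0) {
            return false; // a value of 0 disables the timeout
        }
        return nowMillis - lastProgressMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        TaskTimeoutModel task = new TaskTimeoutModel(600_000, 0);
        task.reportProgress(300_000);
        System.out.println(task.isTimedOut(800_000)); // false: 500 s of silence
        System.out.println(task.isTimedOut(900_000)); // true: 600 s of silence
        System.out.println(new TaskTimeoutModel(0, 0).isTimedOut(900_000)); // false: disabled
    }
}
```

With the timeout disabled, a task that hangs is never terminated by this check, so a large finite value may be the safer choice.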
    