I have a 2-node Hadoop cluster on Ubuntu 12.04 running Hadoop 1.2.1.
While I am trying to run the Hadoop word count example, I am getting a "Too many fetch failures" error.
If you're unable to upgrade the cluster for whatever reason, you can try the following:
Ensure that your hostname is bound to the network IP and not 127.0.0.1 in /etc/hosts (see the example /etc/hosts layout further down)
set mapred.reduce.slowstart.completed.maps=0.80
set tasktracker.http.threads=80
set mapred.reduce.parallel.copies >= 10 (10 should probably be sufficient; see the mapred-site.xml sketch below)
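If it helps to see where these go, in Hadoop 1.x they would normally be set in mapred-site.xml. This is only a minimal sketch using the values suggested above as starting points, not definitive tuning advice; tasktracker.http.threads is read by the tasktracker daemon, so it only takes effect after a daemon restart, while the other two can also be set per job:

```xml
<!-- mapred-site.xml: snippet only, merge into your existing configuration -->
<configuration>
  <!-- Start reducers only after 80% of the maps have completed -->
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.80</value>
  </property>
  <!-- More HTTP worker threads on each tasktracker serving map output -->
  <property>
    <name>tasktracker.http.threads</name>
    <value>80</value>
  </property>
  <!-- Number of parallel transfers a reduce runs while fetching map output -->
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
</configuration>
```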
Also check out this SO post: Why I am getting "Too many fetch-failures" every other day
And this one: Too many fetch failures: Hadoop on cluster (x2)
And also this one if the above don't help: http://grokbase.com/t/hadoop/common-user/098k7y5t4n/how-to-deal-with-too-many-fetch-failures
For brevity and in the interest of time, I'm putting what I found to be the most pertinent part here.
The number one cause of this is something that causes a connection to get a map output to fail. I have seen: 1) a firewall, 2) misconfigured IP addresses (i.e., the tasktracker attempting the fetch received an incorrect IP address when it looked up the name of the tasktracker holding the map segment), and 3) rarely, the HTTP server on the serving tasktracker being overloaded due to insufficient threads or listen backlog; this can happen if the number of fetches per reduce is large and the number of reduces or the number of maps is very large.
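As an illustration of case 2: on Ubuntu, the installer typically maps the machine's hostname to 127.0.1.1 in /etc/hosts, so a node resolving a tasktracker's name can end up with a loopback address and the fetch fails. A rough sketch of what /etc/hosts should look like on every node of a small cluster (the hostnames and addresses here are made up; substitute your own):

```
# /etc/hosts on each node -- hypothetical names and addresses
127.0.0.1      localhost
# do NOT map the machine's own hostname to 127.0.0.1 or 127.0.1.1
192.168.1.10   master
192.168.1.11   slave1
```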
There are probably other cases; this recently happened to me when I had 6,000 maps and 20 reducers on a 10-node cluster, which I believe was case 3 above. Since I didn't actually need the reduce output (I got my summary data via counters in the map phase), I never re-tuned the cluster.
EDIT: The original answer said "Ensure that your hostname is bound to the network IP and 127.0.0.1 in /etc/hosts".