I have a 2-node Hadoop cluster on Ubuntu 12.04 running Hadoop 1.2.1.
While I am trying to run the Hadoop word count example, I am getting a "Too many fetch failures" error.
If you're unable to upgrade the cluster for whatever reason, you can try the following:
Ensure that your hostname is bound to the network IP and not 127.0.0.1 in /etc/hosts (see the example /etc/hosts layout further down)
set mapred.reduce.slowstart.completed.maps=0.80
set tasktracker.http.threads=80
set mapred.reduce.parallel.copies >= 10 (10 should probably be sufficient; see the mapred-site.xml sketch below)
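If it helps to see where these go, in Hadoop 1.x they would normally be set in mapred-site.xml. This is only a minimal sketch using the values suggested above as starting points, not definitive tuning advice; tasktracker.http.threads is read by the tasktracker daemon, so it only takes effect after a daemon restart, while the other two can also be set per job:

```xml
<!-- mapred-site.xml: snippet only, merge into your existing configuration -->
<configuration>
  <!-- Start reducers only after 80% of the maps have completed -->
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.80</value>
  </property>
  <!-- More HTTP worker threads on each tasktracker serving map output -->
  <property>
    <name>tasktracker.http.threads</name>
    <value>80</value>
  </property>
  <!-- Number of parallel transfers a reduce runs while fetching map output -->
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
</configuration>
```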
Also check out this SO post: Why I am getting "Too many fetch-failures" every other day
And this one: Too many fetch failures: Hadoop on cluster (x2)
And also this one if the above don't help: http://grokbase.com/t/hadoop/common-user/098k7y5t4n/how-to-deal-with-too-many-fetch-failures
For brevity and in the interest of time, I'm putting what I found to be the most pertinent part here.
The number one cause of this is something that causes a connection to get a map output to fail. I have seen: 1) a firewall, 2) misconfigured IP addresses (i.e., the tasktracker attempting the fetch received an incorrect IP address when it looked up the name of the tasktracker holding the map segment), and 3) rarely, the HTTP server on the serving tasktracker being overloaded due to insufficient threads or listen backlog; this can happen if the number of fetches per reduce is large and the number of reduces or the number of maps is very large.
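As an illustration of case 2: on Ubuntu, the installer typically maps the machine's hostname to 127.0.1.1 in /etc/hosts, so a node resolving a tasktracker's name can end up with a loopback address and the fetch fails. A rough sketch of what /etc/hosts should look like on every node of a small cluster (the hostnames and addresses here are made up; substitute your own):

```
# /etc/hosts on each node -- hypothetical names and addresses
127.0.0.1      localhost
# do NOT map the machine's own hostname to 127.0.0.1 or 127.0.1.1
192.168.1.10   master
192.168.1.11   slave1
```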
There are probably other cases; this recently happened to me when I had 6,000 maps and 20 reducers on a 10-node cluster, which I believe was case 3 above. Since I didn't actually need the reduce output (I got my summary data via counters in the map phase), I never re-tuned the cluster.
EDIT: The original answer said "Ensure that your hostname is bound to the network IP and 127.0.0.1 in /etc/hosts".