The question may seem pretty obvious, but I have faced it many times, due to bad configuration of the hosts file on a Hadoop cluster.
Can anyone describe how to set up the hosts file for a Hadoop cluster?
If you mean the /etc/hosts file, then here is how I have set it in my Hadoop cluster:
127.0.0.1 localhost
192.168.0.5 master
192.168.0.6 slave1
192.168.0.7 slave2
192.168.0.18 slave3
192.168.0.3 slave4
192.168.0.4 slave5 nameOfCurrentMachine
where nameOfCurrentMachine is the hostname of the machine on which this file is set; in this example it is used as slave5.
Some people say that the first line should be removed, but I have not faced any issues, nor have I tried removing it.
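As an optional sanity check (my own suggestion, not part of the setup itself), you can confirm on each node that every cluster name resolves; the hostnames below are the ones from the example above:
# Check that each cluster hostname resolves; getent consults /etc/hosts
# (per nsswitch.conf), so this exercises the file shown above.
for h in master slave1 slave2 slave3 slave4 slave5; do
    getent hosts "$h" || echo "WARNING: $h does not resolve"
done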
Then, the $HADOOP_CONF_DIR/masters file on the master node should be:
master
and the $HADOOP_CONF_DIR/slaves file on the master node should be:
slave1
slave2
slave3
slave4
slave5
On every other node, I have simply set these two files to contain just:
localhost
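If you want to double-check the wiring before starting the daemons, something like the following on the master node reports whether each host listed in the slaves file is reachable by name (a sketch assuming a Linux ping and that $HADOOP_CONF_DIR is set in your shell):
# Ping each host named in the slaves file once; -c 1 sends a single
# packet, -W 2 waits at most 2 seconds for a reply.
while read -r h; do
    ping -c 1 -W 2 "$h" > /dev/null && echo "$h reachable" || echo "$h UNREACHABLE"
done < "$HADOOP_CONF_DIR/slaves"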
You should also make sure that you can ssh from the master to every other node (using its name, not its IP) without a password. This post describes how to achieve that.
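In short, the usual approach is to generate a key pair on the master and copy the public key to each slave. A minimal sketch (the user name hduser is only a placeholder, substitute the account that runs Hadoop, and repeat for each slave):
# On the master: create an RSA key pair with an empty passphrase (only once).
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# Append the public key to the slave's authorized_keys.
ssh-copy-id hduser@slave1
# Verify: this should print "slave1" without prompting for a password.
ssh hduser@slave1 hostname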