I'm struggling to set up an HBase distributed cluster with 2 nodes: one is my machine and the other is a VM, using the "host-only" adapter in VirtualBox.
My problem is that
The answer that @Infinity provided seems to apply to version ~0.9.4.
For version 1.1.4, according to the source code of
org.apache.hadoop.hbase.master.HMaster
the configuration should be:
<property>
<name>hbase.master.hostname</name>
<value>master.local</value>
<!-- master.local is the DNS name in my network pointing to hbase master -->
</property>
After setting this value, region servers are able to connect to hbase master; however, in my environment, the region server complained about:
com.google.protobuf.ServiceException: java.net.SocketException: Invalid argument
The problem disappeared after I installed Oracle JDK 8 in place of OpenJDK 7 on all of my nodes.
So in conclusion, here is my solution:
Use a DNS name server instead of /etc/hosts entries, as HBase is very picky about hostnames and seems to require both forward and reverse DNS lookups to work.
Upgrade the JDK to Oracle JDK 8.
Use the configuration setting mentioned above.
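If you want to verify the forward and reverse lookup part before touching HBase, something like the following might help. It is only a rough sketch: check_forward_reverse is a name I made up, and it uses nothing but the standard socket module, so it tests what the OS resolver returns, which is not necessarily identical to what HBase does internally.

```python
import socket


def check_forward_reverse(hostname,
                          resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name.

    Returns (ip, reverse_name, consistent), where consistent is True when
    the reverse lookup yields the original hostname (or one of its aliases).
    The resolve/reverse parameters exist only so the lookups can be swapped
    out for testing; they default to the OS resolver.
    """
    ip = resolve(hostname)
    reverse_name, aliases, _addresses = reverse(ip)
    names = {reverse_name.lower()}
    names.update(alias.lower() for alias in aliases)
    return ip, reverse_name, hostname.lower() in names
```

Calling check_forward_reverse("master.local") on every node (with your own master name) should return the same IP everywhere and True as the last element. If it does not, the hostname setup is the first thing I would look at.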
My hosts file looks like this:
127.0.0.1 localhost
192.168.2.118 shashwat.machine.com shashwat
Make your hosts file look like the following:
127.0.0.1 localhost
192.168.56.1 master
192.168.56.101 slave
and put the following entries in the HBase conf:
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.master</name>
<value>master:60000</value>
<description>The host and port that the HBase master runs at.</description>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>60020</value>
<description>The port that the HBase region server binds to.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/home/cluster/Hadoop/hbase-0.90.4/temp</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
If you are using localhost anywhere, remove it and replace it with "master", which is the name of the namenode in your hosts file.
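To hunt down leftover localhost values, a small script can scan the config for you. This is a minimal sketch, assuming the properties are wrapped in the usual <configuration> root element as in a normal hbase-site.xml; find_localhost_values is a hypothetical helper name, not part of HBase.

```python
import xml.etree.ElementTree as ET


def find_localhost_values(xml_text):
    """Return {property name: value} for values that mention a loopback host.

    Assumes Hadoop-style XML: <property> elements that each contain a
    <name> and a <value> child.
    """
    root = ET.fromstring(xml_text)
    hits = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if "localhost" in value or "127.0.0.1" in value:
            hits[name] = value
    return hits
```

Read your config file into a string and pass it in; an empty dict means no property value still points at localhost.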
One more thing you can do:
sudo gedit /etc/hostname
This opens the hostname file. By default it will contain "ubuntu", so change it to "master" and restart your system.
For HBase, put these entries in the "regionservers" file inside the conf dir, one host per line:
master
slave
and restart everything.
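A quick way to double-check the hosts file from this answer is to confirm that none of the cluster names ended up on a loopback address (Ubuntu likes to add a 127.0.1.1 line for the machine name). The snippet below is just a sketch; loopback_names is a made-up helper and it only parses text in the usual /etc/hosts layout.

```python
import ipaddress


def loopback_names(hosts_text, cluster_names):
    """Return {name: address} for cluster names mapped to a loopback address."""
    wanted = {name.lower() for name in cluster_names}
    bad = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not a plain IP address, skip the line
        if addr.is_loopback:
            for name in fields[1:]:
                if name.lower() in wanted:
                    bad[name] = fields[0]
    return bad
```

For example, loopback_names(open("/etc/hosts").read(), ["master", "slave"]) should come back empty on both machines if the file matches the layout above.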
Most of the time the error comes from ZooKeeper handing out the wrong hostname.
You can check what ZooKeeper reports as the HBase master host.
Go to the ZooKeeper installation folder and run:
bin/zkCli.sh -server 127.0.0.1:2181
get /hbase/master
This should give you the HBase master IP that ZooKeeper hands out, so this IP must be reachable from the other nodes.
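Once you know which address ZooKeeper hands out, you can test from the region server machine whether it is actually reachable. Here is a minimal sketch using plain TCP; can_connect is a hypothetical helper, and the port you pass is whatever your master is configured to listen on.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hostnames.
        return False
```

With the configuration shown earlier in this thread that would be can_connect("master", 60000). A False here only tells you the TCP connection failed; it could be the hostname, a firewall, or the master not running.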
There are two things that fix this class of problem for me:
1) Remove all "localhost" names; only have 127.0.0.1 pointing to the name of the hmaster node.
2) Run "hostname X" on your HBase master node to make sure the hostname matches what is in /etc/hosts.
Not being a networking expert, I can't say why this is important, but it is :)