I was using Hadoop in pseudo-distributed mode and everything was working fine. But then I had to restart my computer. And now when I try to start Hadoop again, the NameNode fails to come up.
I ran
$ hadoop namenode
to start the NameNode manually in the foreground.
From the logs I figured out that port 50070 was occupied; it is the port used by default for dfs.namenode.http-address. After configuring dfs.namenode.http-address in hdfs-site.xml to point at a free port, everything went well.
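For reference, a minimal sketch of that change in hdfs-site.xml; the port 50071 below is just an arbitrary free port picked for illustration, not a required value:
<property>
  <name>dfs.namenode.http-address</name>
  <value>0.0.0.0:50071</value>
</property>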
In conf/hdfs-site.xml, you should have a property like:
<property>
  <name>dfs.name.dir</name>
  <value>/home/user/hadoop/name/data</value>
</property>
The property "dfs.name.dir" allows you to control where Hadoop writes NameNode metadata. And giving it another dir rather than /tmp makes sure the NameNode data isn't being deleted when you reboot.
If you kept the default configuration when running Hadoop, the NameNode web interface uses port 50070. Before restarting, you will need to find any process still bound to that port and kill it.
Stop all running Hadoop daemons with: bin/stop-all.sh
Check for any process still listening on port 50070:
sudo netstat -tulpn | grep :50070
# if anything is still bound to the port, a PID/program-name
# pair will appear in the rightmost column of the output
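If the port is taken, the output looks roughly like the line below; 1234/java is a made-up PID/program pair, and the PID part is what you pass to kill:
tcp        0      0 0.0.0.0:50070    0.0.0.0:*    LISTEN    1234/java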
sudo kill -9 <process_id>
# kill the process holding the port
sudo rm -r /app/hadoop/tmp
# delete the temp folder
sudo mkdir /app/hadoop/tmp
# recreate it
sudo chmod -R 777 /app/hadoop/tmp
# (777 is used here for example purposes only)
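A safer alternative to 777, assuming the Hadoop daemons run as your own user, is to take ownership of the directory instead:
sudo chown -R $USER:$USER /app/hadoop/tmp
# grant your user ownership rather than opening the directory to everyone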
bin/hadoop namenode -format
# format the Hadoop NameNode (note: this wipes existing HDFS metadata)
bin/start-all.sh
# start all Hadoop services
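Once everything is up, jps should list the running daemons; for a pseudo-distributed Hadoop 1.x setup that is roughly the set below, and if NameNode is missing, check its log again:
jps
# expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker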