I am trying to learn Hadoop by following a tutorial and setting up pseudo-distributed mode on my machine.
My core-site.xml is:
Hadoop namenode -format
The Hadoop namenode directory contains the fsimage and edits files, which hold the basic information about the Hadoop file system, such as where the data is located and which user created each file.
If you format the namenode, that information is deleted from the namenode directory, which is specified in hdfs-site.xml as dfs.namenode.name.dir.
You still have the data blocks on the datanodes, but not the namenode metadata.
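To see what the namenode directory actually holds before and after a format, a small helper like the one below can list the metadata files. This is only a sketch: the function name list_namenode_metadata is made up for illustration, and it assumes the usual layout where the files sit in a "current" subdirectory of dfs.namenode.name.dir.

```shell
#!/bin/sh
# Hypothetical helper: list the metadata files found in a namenode directory.
# Assumes the files live under <name_dir>/current (fsimage*, edits*, VERSION).
list_namenode_metadata() {
    name_dir=$1
    for f in "$name_dir"/current/fsimage* \
             "$name_dir"/current/edits* \
             "$name_dir"/current/VERSION; do
        # An unmatched glob stays literal, so only print entries that exist.
        [ -e "$f" ] && basename "$f"
    done
    return 0
}
```

For example, list_namenode_metadata /tmp/hadoop-myuser/dfs/name prints one file name per line, or nothing if the directory has no metadata yet.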
Namenode contains metadata about the Hadoop filesystem.
This command (hadoop-1.2.1$ bin/hadoop namenode -format) formats the whole Hadoop Distributed File System (HDFS), so if you run it on an existing filesystem you will lose all your data.
Steps:
Start all the services using "start-all.sh".
Check whether the services are running using "jps".
Note: if you use Hadoop 2.3.0, the following services need to be running:
NameNode
DataNode
ResourceManager
NodeManager
Copy a file from the local filesystem to HDFS using "hdfs dfs -put <local-file> /".
Now check the location "/tmp/hadoop-myuser/dfs/name"; you may find the file split into blocks of up to 64 MB each.
Then format the namenode using **hadoop namenode -format**.
Now the file is no longer physically available at that location.
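The steps above can be sketched as a small script. This is an illustration, not a tested procedure: the run wrapper, the DRY_RUN switch, the function name demo_format, and the file name sample.txt are all assumptions of mine. With DRY_RUN=1 (the default here) it only prints the commands, so nothing is formatted by accident.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to really run it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical walk-through of the steps: start, check, upload, inspect, format, inspect.
demo_format() {
    local_file=$1
    name_dir=/tmp/hadoop-myuser/dfs/name   # default location mentioned in this thread
    run start-all.sh
    run jps
    run hdfs dfs -put "$local_file" /
    run ls -R "$name_dir"
    run hadoop namenode -format
    run ls -R "$name_dir"
}
```

Calling demo_format sample.txt in dry-run mode shows the six commands in order; compare the two directory listings when you run it for real on a throwaway setup.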
Actually, formatting the namenode will not format the datanode.
It only formats the contents of your namenode (which holds the details about the datanodes). Your namenode will no longer know where your data is. Also, namenode -format assigns a new namespaceID to the namenode.
You have to change the namespaceID in your datanode to match for the datanode to work. It is stored in dfs/data/current/VERSION.
There is a JIRA open for this, HDFS-107, suggesting that the datanode be formatted as well when you format the namenode.
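The namespaceID fix described above can be done by hand in an editor, or with something like the sketch below. The function names are hypothetical, and it assumes VERSION is a plain key=value file containing a line that starts with "namespaceID=". Stop the datanode before editing its VERSION file.

```shell
#!/bin/sh
# Print the namespaceID stored in a VERSION file (key=value format assumed).
get_namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Rewrite the namespaceID line in a VERSION file, keeping a .bak copy first.
set_namespace_id() {
    version_file=$1
    new_id=$2
    cp "$version_file" "$version_file.bak" || return 1
    sed "s/^namespaceID=.*/namespaceID=$new_id/" "$version_file.bak" > "$version_file"
}

# Copy the namenode's namespaceID into the datanode's VERSION file.
sync_namespace_id() {
    nn_version=$1
    dn_version=$2
    nn_id=$(get_namespace_id "$nn_version")
    if [ -z "$nn_id" ]; then
        echo "no namespaceID found in $nn_version" >&2
        return 1
    fi
    set_namespace_id "$dn_version" "$nn_id"
}
```

With the default locations from this thread, that would be sync_namespace_id /tmp/hadoop-myuser/dfs/name/current/VERSION /tmp/hadoop-myuser/dfs/data/current/VERSION; adjust both paths to your own dfs directories.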
hadoop namenode -format
This command deletes all files in your HDFS.
The tmp directory contains two folders, datanode and namenode, in the local filesystem. If you format the namenode, these two folders become empty.
Note: if you want to format your namenode, first stop all Hadoop services, then delete the tmp folder (which contains namenode and datanode) in your local filesystem, and start the Hadoop services again; then it will surely take effect.
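The note above, written out as a script, might look roughly like this. It is a sketch under assumptions: the run wrapper, DRY_RUN switch, and function name reset_hdfs are mine, the tmp folder path is passed in by you, and I have placed the format step between deleting the folder and restarting, which the note leaves implicit. DRY_RUN=1 (the default here) only prints the commands, since rm -rf on the wrong path is not recoverable.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to really run it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop Hadoop, wipe the tmp folder, format, start again.
reset_hdfs() {
    hadoop_tmp=$1
    # Refuse an empty path or "/" so the rm below cannot wipe the wrong place.
    if [ -z "$hadoop_tmp" ] || [ "$hadoop_tmp" = "/" ]; then
        echo "refusing to delete '$hadoop_tmp'" >&2
        return 1
    fi
    run stop-all.sh
    run rm -rf "$hadoop_tmp"
    run hadoop namenode -format
    run start-all.sh
}
```

Try reset_hdfs /tmp/hadoop-myuser in dry-run mode first and check the printed commands before setting DRY_RUN=0.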
Reason for hadoop namenode -format:
The Hadoop NameNode is the central point of an HDFS file system. It keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. In short, it keeps the metadata related to the datanodes. When we format the namenode, it formats the metadata related to the datanodes. By doing that, all the information about the data on the datanodes is lost, and they can be reused for new data.
By default the namenode location is "/tmp/hadoop-myuser/dfs/name".
When you format the namenode, this location is cleared.
To change the namenode location, add the following properties to hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/search/data/dfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/search/data/dfs/datanode</value>
</property>
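To double-check which directory a given hdfs-site.xml points at, you can read the value back out of the file. The helper below is a rough sketch with a made-up name (get_hdfs_property); it is not a real XML parser and assumes each <name> and <value> tag sits on a single line, as in the properties above. On Hadoop 2.x, "hdfs getconf -confKey dfs.namenode.name.dir" should report the value Hadoop itself resolves.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Assumes one <name>/<value> tag per line.
get_hdfs_property() {
    conf_file=$1
    key=$2
    awk -v key="$key" '
        /<name>/ {
            # Remember the most recent property name seen.
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ {
            # Print the value only if it belongs to the requested name.
            if (name == key) {
                value = $0
                sub(/.*<value>/, "", value)
                sub(/<\/value>.*/, "", value)
                print value
                exit
            }
        }
    ' "$conf_file"
}
```

For example, get_hdfs_property hdfs-site.xml dfs.namenode.name.dir prints the configured namenode directory, or nothing if the property is not set in that file.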
I hope this will help you.. :-)