As a beginner to Hadoop, I am confused by the terms namespace and metadata. Is there any relation between the two?
According to "Hadoop: The Definitive Guide": "The NameNode manages the filesystem namespace. It maintains the filesystem tree and the metadata for all the files and directories in the tree."
Essentially, a namespace is a container. In this context, it means the grouping of file names into a hierarchy.
Metadata contains things like file owners, permission bits, block locations, size, etc.
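To make the distinction concrete, here is a minimal sketch of a toy in-memory namespace. It is not how the NameNode is implemented; the class names (`INode`, `ToyNamespace`), the field names, and the block IDs are all hypothetical and chosen only for illustration. The tree of names is the namespace, and the fields attached to each entry are the metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class INode:
    """One entry in the toy namespace (hypothetical model, not Hadoop's class)."""
    name: str
    is_dir: bool
    # Everything below is metadata: it describes the entry, it is not file content.
    owner: str = "hdfs"
    permission: int = 0o755
    size: int = 0
    block_ids: List[str] = field(default_factory=list)
    # The children mapping is what gives the namespace its tree shape.
    children: Dict[str, "INode"] = field(default_factory=dict)


class ToyNamespace:
    """A tree of names rooted at '/'."""

    def __init__(self) -> None:
        self.root = INode(name="", is_dir=True)

    def add(self, path: str, is_dir: bool, **metadata) -> INode:
        """Create an entry at `path`, creating parent directories as needed."""
        parts = [p for p in path.split("/") if p]
        current = self.root
        for part in parts[:-1]:
            current = current.children.setdefault(part, INode(name=part, is_dir=True))
        node = INode(name=parts[-1], is_dir=is_dir, **metadata)
        current.children[parts[-1]] = node
        return node

    def lookup(self, path: str) -> INode:
        """Walk the tree from the root and return the entry at `path`."""
        current = self.root
        for part in (p for p in path.split("/") if p):
            current = current.children[part]
        return current

    def list_paths(self, node: Optional[INode] = None, prefix: str = "") -> List[str]:
        """Return every path in the namespace, parents before children."""
        if node is None:
            node = self.root
        paths = []
        for name, child in sorted(node.children.items()):
            full = f"{prefix}/{name}"
            paths.append(full)
            paths.extend(self.list_paths(child, full))
        return paths


ns = ToyNamespace()
ns.add(
    "/user/alice/test.txt",
    is_dir=False,
    owner="alice",
    permission=0o644,
    size=1024,
    block_ids=["blk_1", "blk_2"],  # made-up block IDs
)

print(ns.list_paths())  # the namespace: which names exist and how they nest
entry = ns.lookup("/user/alice/test.txt")
print(entry.owner, oct(entry.permission), entry.size, entry.block_ids)  # the metadata
```

Listing the paths answers "what names exist and where", while looking up one entry answers "what do we know about this file".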
To make things clearer, and since HDFS is just another file system, we can take the Windows file system as an example:
Suppose you have a file Test.txt at the path C:\User\Test\New Folder\Test.txt.
In the case of Windows, this path is the file's place in the namespace.
Now, if you open the properties of this file, you will find some information (creation date, last modification, owner...).
That information is the metadata. It is called that because it sits one level of abstraction above the data: the data is the content of the file, and the metadata is the description of the file itself.
The same example carries over to HDFS: the namespace is the hierarchy of paths used to reach files and directories, and the metadata is the information about each of them, including which blocks hold a file's data.
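The same split can be seen on a local file system with a few lines of Python. This is only a sketch against the local disk, not HDFS; the helper name `describe` and the file name are made up for the example. The path locates the file in the namespace, `os.stat` returns its metadata, and reading the file returns the data.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect a few pieces of metadata about the file at `path`."""
    info = os.stat(path)
    return {
        "path": path,                                 # where the file sits in the namespace
        "size": info.st_size,                         # size in bytes
        "modified": time.ctime(info.st_mtime),        # last modification time
        "mode": oct(stat.S_IMODE(info.st_mode)),      # permission bits
    }


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "test.txt")
    with open(path, "w") as f:
        f.write("hello")  # this is the data

    metadata = describe(path)  # this is the metadata
    with open(path) as f:
        content = f.read()

print("data:", content)
print("metadata:", metadata)
```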
Namespace is simply the term we use to describe the tree structure of a filesystem.
Basically, when we say namespace, we mean a certain location in HDFS.
The root directory ‘/’ is a namespace. The folder /user is a namespace. In Hadoop, we refer to a namespace as a directory that is handled by the NameNode.
ref:https://www.quora.com/What-is-%E2%80%98Namespace%E2%80%99-in-HDFS-and-what-would-be-the-contents-residing-in-a-%E2%80%98Namespace%E2%80%99