Question
I have a silly doubt about the Hadoop namenode memory calculation. The Hadoop book (The Definitive Guide) says:
"Since the namenode holds filesystem metadata in memory, the limit to the number of files in a filesystem is governed by the amount of memory on the namenode. As a rule of thumb, each file, directory, and block takes about 150 bytes. So, for example, if you had one million files, each taking one block, you would need at least 300 MB of memory. While storing millions of files is feasible, billions is beyond the capability of current hardware."
Since each file takes one block, shouldn't the namenode's minimum memory be 150 MB rather than 300 MB? Please help me understand why it is 300 MB.
Answer 1:
I guess you read the second edition of Tom White's book. I have the third edition, and it references a post, Scalability of the Hadoop Distributed File System. In that post, I read the following sentence:
Estimates show that the name-node uses less than 200 bytes to store a single metadata object (a file inode or a block).
In the HDFS NameNode, a file is represented by a file inode plus a block reference, and each of those objects takes about 150 bytes. So 1,000,000 files = 1,000,000 inodes + 1,000,000 block references (in the example, each file occupies one block).
2,000,000 * 150 bytes ≈ 300 MB
I've included the link so you can verify whether I've made a mistake in my reasoning.
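To make the arithmetic concrete, here is a minimal Python sketch of the rule-of-thumb estimate. The ~150 bytes per object and the "one inode plus one block reference per single-block file" counting come from the book's explanation; the function name and parameters are just illustrative, not part of any Hadoop API.

```python
def estimate_namenode_memory(num_files, blocks_per_file=1, bytes_per_object=150):
    """Rule-of-thumb NameNode heap estimate: one inode object per file
    plus one object per block reference, each taking ~150 bytes."""
    inodes = num_files                         # one inode per file
    block_refs = num_files * blocks_per_file   # one reference per block
    return (inodes + block_refs) * bytes_per_object

# The book's example: 1,000,000 files, each occupying a single block.
total = estimate_namenode_memory(1_000_000)
print(f"{total:,} bytes ~= {total / 1_000_000:.0f} MB")
# 300,000,000 bytes ~= 300 MB
```

Running it reproduces the 300 MB figure: the 1,000,000 inodes alone would indeed be about 150 MB, but the 1,000,000 block references double the object count.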
Source: https://stackoverflow.com/questions/28211548/confusion-over-hadoop-namenode-memory-usage