Confusion over Hadoop namenode memory usage

落爺英雄遲暮 submitted on 2019-12-22 11:46:25

Question


I have a silly doubt about Hadoop namenode memory calculation. It is mentioned in the Hadoop book (The Definitive Guide) as:

"Since the namenode holds filesystem metadata in memory, the limit to the number of files in a filesystem is governed by the amount of memory on the namenode. As a rule of thumb, each file, directory, and block takes about 150 bytes. So, for example, if you had one million files, each taking one block, you would need at least 300 MB of memory. While storing millions of files is feasible, billions is beyond the capability of current hardware."

Since each file takes one block, the namenode's minimum memory should be 150 MB and not 300 MB. Please help me understand why it is 300 MB.


Answer 1:


I guess you read the second edition of Tom White's book. I have the third edition, and there this passage references a post, Scalability of the Hadoop Distributed File System. In that post, I read the following sentence:

Estimates show that the name-node uses less than 200 bytes to store a single metadata object (a file inode or a block).

A file in the HDFS NameNode is: a file inode + a block. Each of those objects takes about 150 bytes. 1,000,000 files = 1,000,000 inodes + 1,000,000 block references (in the example, each file occupies 1 block).

2,000,000 × 150 bytes ≈ 300 MB

I included the link so you can verify whether I made a mistake in my argument.
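As a back-of-the-envelope illustration of that rule of thumb (this is just a sketch of the arithmetic using the book's approximate 150-byte figure, not how the NameNode actually allocates memory):

```python
# Rough estimate of NameNode heap needed for filesystem metadata,
# using the ~150 bytes per metadata object rule of thumb from the book.
BYTES_PER_METADATA_OBJECT = 150  # approximate, per file inode, directory, or block

def estimate_namenode_memory(num_files, blocks_per_file=1, num_directories=0):
    """Return an approximate memory requirement in bytes."""
    # Each file contributes one inode object plus one object per block.
    objects = num_files + num_files * blocks_per_file + num_directories
    return objects * BYTES_PER_METADATA_OBJECT

# One million files, each occupying a single block:
needed = estimate_namenode_memory(1_000_000)
print(f"{needed / 1e6:.0f} MB")  # -> 300 MB, matching the book's example
```

The point is simply that one million single-block files means two million metadata objects (inodes plus blocks), hence roughly 300 MB rather than 150 MB.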



Source: https://stackoverflow.com/questions/28211548/confusion-over-hadoop-namenode-memory-usage
