HDFS Replication - Data Stored

执念已碎 2021-01-14 23:45

I am relatively new to Hadoop and want to get a better understanding of how replication works in HDFS.

Say that I have a 10-node system (1 TB per node), giving me

1 answer
  •  礼貌的吻别
    2021-01-15 00:16

    Your thinking is a little off. A replication factor of 3 means that HDFS keeps 3 copies of your data in total, not 3 extra copies. More specifically, each block of a file is stored 3 times, so a file made up of 10 blocks occupies 30 block replicas across your 10 nodes, or about 3 per node on average.
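
To make the block arithmetic concrete, here is a minimal sketch in Python. The function names are hypothetical, and the round-robin placement is only a toy stand-in: HDFS's real placement policy is rack-aware and is not modelled here. The one property it does share with HDFS is that the replicas of a single block land on different nodes.

```python
from collections import Counter


def total_replicas(num_blocks, replication):
    """Number of block replicas stored cluster-wide for one file."""
    return num_blocks * replication


def place_round_robin(num_blocks, replication, num_nodes):
    """Toy placement (assumption, not HDFS's policy): block b goes to
    nodes b, b+1, ..., b+replication-1, wrapping around the cluster."""
    if replication > num_nodes:
        raise ValueError("cannot place more replicas than there are nodes")
    return {
        block: [(block + r) % num_nodes for r in range(replication)]
        for block in range(num_blocks)
    }


def replicas_per_node(placement):
    """Count how many block replicas each node ends up holding."""
    return Counter(node for nodes in placement.values() for node in nodes)


placement = place_round_robin(num_blocks=10, replication=3, num_nodes=10)
per_node = replicas_per_node(placement)

print(total_replicas(10, 3))    # 30
print(set(per_node.values()))   # {3}
```

With 10 blocks, a replication factor of 3, and 10 nodes, this toy placement gives every node exactly 3 replicas. A real cluster would only be close to that average.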

    You are correct that a 10 x 1 TB cluster holds less than 10 TB of data. With a replication factor of 3, its functional capacity is about 3.3 TB (10 TB / 3), and somewhat less in practice because space is also needed for processing, temporary files, and so on.
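
The capacity estimate is a single division, sketched below in Python. The function name and the `reserved_fraction` parameter are hypothetical, and the 25% reserve in the second call is an arbitrary illustration of "a little less", not a figure from the answer or from Hadoop.

```python
def usable_capacity_tb(num_nodes, tb_per_node, replication, reserved_fraction=0.0):
    """Estimate how much unique data fits in the cluster, in TB.

    reserved_fraction is an assumed share of raw disk set aside for
    temporary files and processing; it is not an HDFS setting.
    """
    raw_tb = num_nodes * tb_per_node
    return raw_tb * (1.0 - reserved_fraction) / replication


# 10 nodes x 1 TB, replication factor 3, nothing reserved
print(round(usable_capacity_tb(10, 1.0, 3), 2))        # 3.33

# Same cluster with an assumed 25% of raw space reserved
print(round(usable_capacity_tb(10, 1.0, 3, 0.25), 2))  # 2.5
```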
