HDFS replication factor

情深已故 2021-02-05 13:16

When I'm uploading a file to HDFS with the replication factor set to 1, will the file's splits reside on one single machine, or will the splits be distributed to multiple m

4 answers
  •  猫巷女王i
    2021-02-05 14:17

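    • If your cluster has a single node, an uploaded file is split into blocks according to the block size, and all of the blocks stay on that one machine.
    • If your cluster has multiple nodes, an uploaded file is split into blocks according to the block size, and the blocks are distributed to different DataNodes in the cluster. The NameNode decides which DataNode stores each block, and the client writes the data to the chosen DataNodes through a pipeline. One exception: with the default placement policy, a client that itself runs on a DataNode writes the first replica locally, so with a replication factor of 1 all blocks of the file end up on that node.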

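    The HDFS replication factor is the number of copies kept of each block. For example, with a replication factor of 2, every block you upload to HDFS is stored twice, on two different DataNodes; with a replication factor of 1, each block is stored exactly once.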
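
To make the difference between block splitting and replication concrete, here is a small toy model in Python. It is only a sketch: the function names `split_into_blocks` and `place_blocks`, the node names `dn1` to `dn3`, and the round-robin placement are all made up for illustration. The real NameNode uses a rack-aware placement policy, not round-robin. The 128 MB block size is the usual default in Hadoop 2 and later, but your cluster may be configured differently.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    # The last block only holds the leftover bytes; it is not padded.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_blocks(num_blocks, datanodes, replication):
    """Toy placement (hypothetical, NOT the real HDFS policy).

    Each block gets `replication` copies on distinct nodes, chosen
    round-robin so that blocks spread over the cluster.
    """
    if replication < 1 or replication > len(datanodes):
        raise ValueError("replication must be between 1 and the node count")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            datanodes[(block + copy) % len(datanodes)]
            for copy in range(replication)
        ]
    return placement


if __name__ == "__main__":
    blocks = split_into_blocks(300 * MB, 128 * MB)
    print("block sizes (MB):", [size // MB for size in blocks])

    # Single-node cluster: every block lands on the only node.
    print(place_blocks(len(blocks), ["dn1"], replication=1))

    # Multi-node cluster, replication 1: one copy per block, spread out.
    print(place_blocks(len(blocks), ["dn1", "dn2", "dn3"], replication=1))

    # Multi-node cluster, replication 2: two copies per block.
    print(place_blocks(len(blocks), ["dn1", "dn2", "dn3"], replication=2))
```

In this model the replication factor only changes how many nodes hold each block, not how the file is split. On a real cluster you can check where the blocks of a file actually ended up with `hdfs fsck <path> -files -blocks -locations`.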
