I would like to modify the default block placement strategy of HDFS to suit my application.
For example, say I have two files, file1 (128 MB) and file2 (128 MB). Having the blocks of both files placed on the same datanode would benefit my application.
The default block placement behaviour can be modified by extending the BlockPlacementPolicy abstract class and setting the dfs.block.replicator.classname property in the Hadoop configuration files (hdfs-site.xml) to the fully qualified name of the new class.
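A minimal sketch of what this looks like follows. The package and class name (com.example.MyBlockPlacementPolicy) are hypothetical, and the exact signature of chooseTarget(...) differs between Hadoop versions, so check the BlockPlacementPolicy source for your release before overriding it:

```java
package com.example;

import org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault;

// Extending BlockPlacementPolicyDefault (rather than the abstract
// BlockPlacementPolicy directly) keeps the standard rack-aware placement
// and lets you override only the behaviour you need.
public class MyBlockPlacementPolicy extends BlockPlacementPolicyDefault {
    // Custom placement logic would go here, typically by overriding
    // chooseTarget(...); its signature varies across Hadoop versions,
    // so copy it from the BlockPlacementPolicy class of your release.
}

// To activate the policy, add the property below to hdfs-site.xml and
// restart the NameNode:
//
//   <property>
//     <name>dfs.block.replicator.classname</name>
//     <value>com.example.MyBlockPlacementPolicy</value>
//   </property>
```

Note that the jar containing the custom class also has to be on the NameNode's classpath, since block placement decisions are made by the NameNode.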
Hadoop operations are not tied to any particular node, which makes Hadoop more resilient to the failures inherent in distributed computing. Why do the blocks of these two files need to be on a particular node? With the actual requirement known, a better solution can be found.