What's the easiest way to find the file associated with a block in HDFS, given a block name/ID?
The long and painful way, assuming you have read access to all the files (and execute for the directories):
hadoop fsck / -files -blocks | grep blk_520275863902385418_1002 -B 20
Then scan back up from your block match to the previous file name:
/hadoop/mapred/system/jobtracker.info 4 bytes, 1 block(s): OK
0. blk_520275863902385418_1002 len=4 repl=1
In this case blk_5202... is part of the file /hadoop/mapred/system/jobtracker.info.
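If you'd rather not eyeball the grep context, the scan-back step can be automated. This is a rough sketch only: find_block_file is a made-up helper name, and it assumes the fsck output looks like the sample above (a path line starting with "/" and ending in "N bytes, ...", followed by that file's block lines). Other Hadoop versions may format these lines differently.

```shell
# Hypothetical helper: reads `hadoop fsck / -files -blocks` output on stdin
# and prints the path of the file that owns the given block.
# Assumes path lines start with "/" and block lines follow their file.
find_block_file() {
  awk -v blk="$1" '
    /^\// {
      # Remember the most recent path, dropping the " N bytes, ..." suffix.
      file = $0
      sub(/ [0-9]+ bytes,.*$/, "", file)
    }
    # First non-path line mentioning the block: report its file and stop.
    index($0, blk) && !/^\// { print file; exit }
  '
}
```

You would then run something like: hadoop fsck / -files -blocks | find_block_file blk_520275863902385418_1002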
Programmatically, there isn't an interface to the name node that allows you to search by block ID, but you could look into the source for the secondary name node and see how it consolidates the edits. You could then experiment on the saved output from the secondary name node (rather than risk working on the live name node file).
Good luck!
Not sure when this was introduced, but you can do this:
hdfs fsck -blockId <block_id>
hdfs fsck -blockId blk_1100790203
Connecting to namenode
FSCK started by hdfs
Block Id: blk_1100790203
Block belongs to: /common/FFL1447685899336.txt
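If you need just the path (say, in a script), you can filter that output. A minimal sketch, assuming the output contains a "Block belongs to: " line exactly as in the sample above; block_owner is a made-up helper name, not part of Hadoop.

```shell
# Hypothetical helper: reads `hdfs fsck -blockId <block_id>` output on stdin
# and prints only the owning file path.
# Assumes a line of the form "Block belongs to: <path>" as shown above.
block_owner() {
  sed -n 's/^Block belongs to: //p'
}
```

For example: hdfs fsck -blockId blk_1100790203 | block_owner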