I know du -sh works on common Linux filesystems. But how do I do the same with HDFS?
The command should be hadoop fs -du -s -h /dirPath
-du [-s] [-h] ... : Show the amount of space, in bytes, used by the files that match the specified file pattern.
-s : Rather than showing the size of each individual file that matches the
pattern, shows the total (summary) size.
-h : Formats the sizes of files in a human-readable fashion rather than as a number of bytes (e.g. MB, GB, TB).
Note that, even without the -s option, this only shows size summaries one level deep into a directory.
The output has the form: size name (full path)
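Since -du without -s lists each immediate child with its size, the raw byte output can be sorted to find the biggest entries. Here is a minimal sketch; du_top is a hypothetical helper name, and it assumes the size is the first field on every line (so run -du without -h).

```shell
# Hypothetical helper: reads `hadoop fs -du <dir>` output on stdin
# (raw bytes, no -h) and keeps the N largest entries.
# Assumes the size is the first whitespace-separated field.
du_top() {
  # $1: number of entries to keep (defaults to 10)
  sort -k1,1nr | head -n "${1:-10}"
}

# Usage (needs a configured Hadoop client):
#   hadoop fs -du /dirPath | du_top 5
```

Sorting on the raw byte count avoids having to compare mixed units such as "512 M" and "1.2 G".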
With this you will get the size of each entry in whole GB (rounded down):
hdfs dfs -du PATHTODIRECTORY | awk '/^[0-9]+/ { print int($1/(1024**3)) " [GB]\t" $2 }'
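A possible variation on the one-liner above, in case entries smaller than 1 GB should not show up as 0: it prints two decimal places instead of truncating. This is only a sketch; du_gb is a hypothetical helper name. It takes the path from the last field because some Hadoop versions print an extra column between size and path, and it assumes paths contain no whitespace.

```shell
# Hypothetical helper: reads `hdfs dfs -du <dir>` output on stdin
# (raw bytes, no -h) and prints each entry's size in GB with two decimals.
# Assumes: size is the first field, path is the last field, and paths
# contain no whitespace.
du_gb() {
  awk '/^[0-9]+/ { printf "%.2f GB\t%s\n", $1 / (1024 ^ 3), $NF }'
}

# Usage (needs a configured Hadoop client):
#   hdfs dfs -du PATHTODIRECTORY | du_gb
```

The ^ operator is used for the power instead of ** because ^ is the form defined by POSIX awk.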
To get the size of a directory, use hdfs dfs -du -s -h /$yourDirectoryName. For a quick cluster-level storage report, use hdfs dfsadmin -report.
On Hadoop version 2.3.33 (note that -dus is deprecated in favor of -du -s):
hadoop fs -dus /path/to/dir | awk '{print $2/1024**3 " G"}'