How do I check the size of an HDFS directory?

轻奢々 2021-01-30 12:14

I know du -sh on ordinary Linux filesystems. How do I do the same thing with HDFS?

10 answers

终归单人心 (original poster)

2021-01-30 12:55
When you need the total size of a particular group of files within a directory, the -s option alone does not give a combined total (in Hadoop 2.7.1). For example:

Directory structure:
```
some_dir
├abc.txt    
├count1.txt 
├count2.txt 
└def.txt    
```
Assume each file is 1 KB in size. You can summarize the entire directory with:
```
hdfs dfs -du -s some_dir
4096 some_dir
```
However, if I want the combined size of all files whose names contain "count", the command falls short: it prints one line per matching file rather than a single total.
```
hdfs dfs -du -s some_dir/count*
1024 some_dir/count1.txt
1024 some_dir/count2.txt
```
To get around this, I usually drop -s and pipe the output through awk to sum the first column.
```
hdfs dfs -du some_dir/count* | awk '{ total+=$1 } END { print total }'
2048 
```
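If you do this often, the awk step can be wrapped in a small helper. The sketch below is only an illustration: `sum_du` is a made-up name, and it assumes the size in bytes is the first column, as in the output shown above. It also prints a rough human-readable figure, similar to what du -h gives. The sample input is simulated with printf so the snippet runs without a cluster.

```shell
#!/bin/sh
# sum_du: hypothetical helper (not part of Hadoop).
# Reads du-style lines from stdin, sums the first column (bytes),
# and prints the total in bytes plus a rough human-readable value.
sum_du() {
    awk '
        { total += $1 }
        END {
            split("B KB MB GB TB PB", unit, " ")
            size = total; i = 1
            # Divide by 1024 until the value fits the current unit
            while (size >= 1024 && i < 6) { size /= 1024; i++ }
            printf "%d bytes (%.1f %s)\n", total, size, unit[i]
        }'
}

# Simulated output of: hdfs dfs -du some_dir/count*
printf '1024 some_dir/count1.txt\n1024 some_dir/count2.txt\n' | sum_du
```

On a real cluster the call would look like `hdfs dfs -du some_dir/count* | sum_du`. For a whole directory no helper is needed, since `hdfs dfs -du -s -h some_dir` prints a human-readable total directly.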