File count in an HDFS directory

攒了一身酷 2021-01-31 10:36

From Java code, I want to connect to a directory in HDFS, find out how many files it contains, get their names, and read them. I can already read the files but I co

6 answers
  •  庸人自扰
    2021-01-31 11:04

    count

    Usage: hadoop fs -count [-q] <paths>
    

    Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME.

    The output columns with -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME.

    Example:

    hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
    hadoop fs -count -q hdfs://nn1.example.com/file1
    

    Exit Code:

    Returns 0 on success and -1 on error.
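
Since the question is about Java, one way to use this command from code is to run it as an external process and parse each output line. The following is only a minimal sketch of the parsing step, assuming the plain (non -q) output with the four whitespace-separated columns listed above. The class name CountLine and the sample line are made up for illustration; they are not part of Hadoop.

```java
// Hypothetical helper: holds one parsed line of `hadoop fs -count` output (without -q).
final class CountLine {
    final long dirCount;
    final long fileCount;
    final long contentSize;
    final String fileName;

    CountLine(long dirCount, long fileCount, long contentSize, String fileName) {
        this.dirCount = dirCount;
        this.fileCount = fileCount;
        this.contentSize = contentSize;
        this.fileName = fileName;
    }

    // Splits into at most 4 columns so that a path containing spaces stays in one piece.
    static CountLine parse(String line) {
        String[] cols = line.trim().split("\\s+", 4);
        if (cols.length != 4) {
            throw new IllegalArgumentException("Expected 4 columns but got: " + line);
        }
        return new CountLine(
                Long.parseLong(cols[0]),
                Long.parseLong(cols[1]),
                Long.parseLong(cols[2]),
                cols[3]);
    }

    public static void main(String[] args) {
        // Made-up sample line in the DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME layout.
        String sample = "           1            3              12345 hdfs://nn1.example.com/file1";
        CountLine c = parse(sample);
        System.out.println(c.fileCount + " files under " + c.fileName);
    }
}
```

In practice each line would come from the standard output of the process that runs the command. Note that this gives counts only, not the individual file names.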

    Alternatively, from Java you can use the FileSystem API directly and iterate over the files in the path. Here is some example code:

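    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    int count = 0;
    // getConf() is available when the enclosing class extends Configured (e.g. a Tool).
    FileSystem fs = FileSystem.get(getConf());
    boolean recursive = false; // false: only files directly under the path
    RemoteIterator<LocatedFileStatus> ri = fs.listFiles(new Path("hdfs://my/path"), recursive);
    while (ri.hasNext()) {
        LocatedFileStatus status = ri.next();
        System.out.println(status.getPath().getName()); // name of the current file
        count++;
    }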
    
