Recursively counting files in a Linux directory

既然无缘 2020-11-28 17:17

How can I recursively count files in a Linux directory?

I found this:

find DIR_NAME -type f ¦ wc -l

But when I run this it returns

21 Answers
  • 2020-11-28 17:49

    Since filenames in UNIX may contain newlines (yes, newlines), wc -l might count too many files. I would print a dot for every file and then count the dots:

    find DIR_NAME -type f -printf "." | wc -c
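
    The same idea can be sketched without -printf, which is a GNU find extension. This version assumes a find that supports -print0 and counts the NUL terminators instead of dots; count_files is a hypothetical helper name, not something from the answer above.

```shell
#!/usr/bin/env bash
# count_files DIR: count regular files under DIR, whatever their names contain.
# (count_files is a hypothetical helper name used for illustration.)
count_files() {
    # -print0 ends every path with a NUL byte, which cannot occur in a filename.
    # tr -cd '\0' deletes every other byte, so wc -c sees one byte per file.
    find "$1" -type f -print0 | tr -cd '\0' | wc -c
}

# Usage: count_files DIR_NAME
```

    A file whose name contains a newline adds one NUL like any other file, so it is counted once.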
    
  • 2020-11-28 17:51

    For directories with spaces in their names (based on various answers above), recursively print each directory name with the number of files directly inside it:

    find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do printf '%s: ' "$i" ; ls -p "$i" | grep -v / | wc -l ; done
    

    Example (formatted for readability):

    pwd
      /mnt/Vancouver/Programming/scripts/claws/corpus
    
    ls -l
      total 8
      drwxr-xr-x 2 victoria victoria 4096 Mar 28 15:02 'Catabolism - Autophagy; Phagosomes; Mitophagy'
      drwxr-xr-x 3 victoria victoria 4096 Mar 29 16:04 'Catabolism - Lysosomes'
    
    ls 'Catabolism - Autophagy; Phagosomes; Mitophagy'/ | wc -l
      138
    
    ## 29 entries: 28 files plus 1 subdirectory (aaa/, which holds 1 file):
    ls 'Catabolism - Lysosomes'/ | wc -l
      29
    

    The directory structure is better visualized using tree:

    tree -L 3 -F .
      .
      ├── Catabolism - Autophagy; Phagosomes; Mitophagy/
      │   ├── 1
      │   ├── 10
      │   ├── [ ... SNIP! (138 files, total) ... ]
      │   ├── 98
      │   └── 99
      └── Catabolism - Lysosomes/
          ├── 1
          ├── 10
          ├── [ ... SNIP! (28 files, total) ... ]
          ├── 8
          ├── 9
          └── aaa/
              └── bbb
    
      3 directories, 167 files
    
    man find | grep mindep
      -mindepth levels
        Do not apply any tests or actions at levels less than levels
        (a non-negative integer).  -mindepth 1 means process all files
        except the starting-points.
    

    ls -p | grep -v / (used below) is from answer 2 at https://unix.stackexchange.com/questions/48492/list-only-regular-files-but-not-directories-in-current-directory

    find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do printf '%s: ' "$i" ; ls -p "$i" | grep -v / | wc -l ; done
    ./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
    ./Catabolism - Lysosomes: 28
    ./Catabolism - Lysosomes/aaa: 1
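
    A variant of that loop that does not parse ls output could look like the sketch below. It assumes a find that supports -mindepth, -maxdepth and -print0 (GNU find does); count_per_dir is a hypothetical name. Unlike ls -p | grep -v /, it also counts dotfiles and skips symlinks, so the numbers can differ from the listing above.

```shell
#!/usr/bin/env bash
# count_per_dir ROOT: print "DIR: N" for every directory below ROOT, where N
# is the number of regular files directly inside DIR.
# (count_per_dir is a hypothetical helper name used for illustration.)
count_per_dir() {
    find "$1" -mindepth 1 -type d -print0 |
    while IFS= read -r -d '' dir; do
        # One NUL byte per regular file at depth 1, so odd filenames are safe.
        n=$(find "$dir" -mindepth 1 -maxdepth 1 -type f -print0 | tr -cd '\0' | wc -c)
        printf '%s: %d\n' "$dir" "$n"
    done
}

# Usage: count_per_dir .
```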
    

    Application: I want to find the maximum number of files among several hundred directories (all at depth 1); the output below is again formatted for readability:

    date; pwd
        Fri Mar 29 20:08:08 PDT 2019
        /home/victoria/Mail/2_RESEARCH - NEWS
    
    time find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do printf '%s: ' "$i" ; ls -p "$i" | grep -v / | wc -l ; done > ../../aaa
        0:00.03
    
    [victoria@victoria 2_RESEARCH - NEWS]$ head -n5 ../../aaa
        ./RNA - Exosomes: 26
        ./Cellular Signaling - Receptors: 213
        ./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
        ./Stress - Physiological, Cellular - General: 261
        ./Ancient DNA; Ancient Protein: 34
    
    [victoria@victoria 2_RESEARCH - NEWS]$ sed -r 's/(^.*): ([0-9]{1,8}$)/\2: \1/g' ../../aaa | sort -V | (head; echo ''; tail)
    
        0: ./Genomics - Gene Drive
        1: ./Causality; Causal Relationships
        1: ./Cloning
        1: ./GenMAPP 2
        1: ./Pathway Interaction Database
        1: ./Wasps
        2: ./Cellular Signaling - Ras-MAPK Pathway
        2: ./Cell Death - Ferroptosis
        2: ./Diet - Apples
        2: ./Environment - Waste Management
    
        988: ./Genomics - PPM (Personalized & Precision Medicine)
        1113: ./Microbes - Pathogens, Parasites
        1418: ./Health - Female
        1420: ./Immunity, Inflammation - General
        1522: ./Science, Research - Miscellaneous
        1797: ./Genomics
        1910: ./Neuroscience, Neurobiology
        2740: ./Genomics - Functional
        3943: ./Cancer
        4375: ./Health - Disease 
    

    sort -V is a natural (version) sort. So the largest of those (Claws Mail) directories holds 4375 files. If I left-pad those filenames (https://stackoverflow.com/a/55409116/1904943) to 5 digits in total (they are all named numerically, starting with 1, in each directory), I should be OK.
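
    The padding step could be sketched as follows. This is not the method from the linked answer, just a minimal illustration that assumes every filename is a plain decimal number; pad_name is a hypothetical helper that only prints the padded name and renames nothing.

```shell
#!/usr/bin/env bash
# pad_name NAME [WIDTH]: print NAME zero-padded to WIDTH digits (default 5).
# (pad_name is a hypothetical helper; it assumes NAME is purely numeric.)
pad_name() {
    # 10# forces base 10, so a name such as 08 is not rejected as bad octal.
    printf '%0*d\n' "${2:-5}" "$((10#$1))"
}

# Dry run inside one directory: print the mv commands without executing them.
#   for f in *; do echo mv -- "$f" "$(pad_name "$f")"; done
```

    With a width of 5, the name 1 becomes 00001 and 4375 becomes 04375.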


    Addendum

    Find the total number of files and subdirectories in a directory.

    $ date; pwd
    Tue 14 May 2019 04:08:31 PM PDT
    /home/victoria/Mail/2_RESEARCH - NEWS
    
    $ ls | head; echo; ls | tail
    Acoustics
    Ageing
    Ageing - Calorie (Dietary) Restriction
    Ageing - Senescence
    Agriculture, Aquaculture, Fisheries
    Ancient DNA; Ancient Protein
    Anthropology, Archaeology
    Ants
    Archaeology
    ARO-Relevant Literature, News
    
    Transcriptome - CAGE
    Transcriptome - FISSEQ
    Transcriptome - RNA-seq
    Translational Science, Medicine
    Transposons
    USACEHR-Relevant Literature
    Vaccines
    Vision, Eyes, Sight
    Wasps
    Women in Science, Medicine
    
    $ find . -type f | wc -l
    70214    ## files
    
    $ find . -type d | wc -l
    417      ## directories (this count includes . itself)
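
    Both totals can also be collected in a single traversal. The sketch below assumes GNU find, whose -printf '%y' prints a one-letter type for each entry (f for a regular file, d for a directory, l for a symlink); count_by_type is a hypothetical name.

```shell
#!/usr/bin/env bash
# count_by_type DIR: print how many entries of each type exist below DIR.
# (count_by_type is a hypothetical helper; requires GNU find for -printf.)
count_by_type() {
    # -mindepth 1 skips DIR itself, so the "d" row counts subdirectories only.
    # Only the type letter is printed, so newlines in filenames cannot skew it.
    find "$1" -mindepth 1 -printf '%y\n' | sort | uniq -c
}

# Usage: count_by_type .
```

    Each output line is a count followed by the type letter.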
    
  • 2020-11-28 17:52

    To determine how many files there are in the current directory, run ls -1 | wc -l. This uses wc to count the lines (-l) in the output of ls -1. It doesn't count dotfiles. Note that ls -l (that's an "L" rather than a "1" as in the previous examples), which I used in previous versions of this HOWTO, gives a count one greater than the actual number of files, because the long listing begins with a "total" line. Thanks to Kam Nejad for this point.

    If you want to count only files and NOT include symbolic links (just an example of what else you could do), you could use ls -l | grep -v ^l | wc -l (that's an "L" not a "1" this time; we want a "long" listing here). grep checks for any line beginning with "l" (indicating a link) and discards that line (-v). The "total" line of the long listing is still counted, so subtract one from the result.

    Relative speed: "ls -1 /usr/bin/ | wc -l" takes about 1.03 seconds on an unloaded 486SX25 (/usr/bin/ on this machine has 355 files). "ls -l /usr/bin/ | grep -v ^l | wc -l" takes about 1.19 seconds.

    Source: http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x700.html
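
    As a hedged alternative to the ls pipelines quoted above (not part of the HOWTO), find can do the same count for a single directory while including dotfiles and excluding symlinks and subdirectories. The sketch assumes a find that supports -maxdepth and -print0, such as GNU find; count_here is a hypothetical name.

```shell
#!/usr/bin/env bash
# count_here [DIR]: count regular files directly inside DIR (default: .).
# Dotfiles are included; symlinks and subdirectories are not.
# (count_here is a hypothetical helper name used for illustration.)
count_here() {
    # -type f does not match symlinks, and -maxdepth 1 stops the recursion.
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type f -print0 | tr -cd '\0' | wc -c
}

# Usage: count_here /usr/bin
```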
