How to exclude a directory in find . command

醉酒成梦 2020-11-22 03:36

I'm trying to run a find command for all JavaScript files, but how do I exclude a specific directory?

Here is the find code we're using.

30 answers
  • 2020-11-22 04:09

    TLDR: understand your root directories and tailor your search from there, using the -path <excluded_path> -prune -o option. Do not include a trailing / at the end of the excluded path.

    Example:

    find / -path /mnt -prune -o -name "*libname-server-2.a*" -print


    To use find effectively, I believe it is imperative to have a good understanding of your file system directory structure. On my home computer I have multi-TB hard drives, with about half of that content backed up using rsnapshot (i.e., rsync). Although the backup goes to a physically independent (duplicate) drive, that drive is mounted under my system root (/) directory, at /mnt/Backups/rsnapshot_backups/:

    /mnt/Backups/
    └── rsnapshot_backups/
        ├── hourly.0/
        ├── hourly.1/
        ├── ...
        ├── daily.0/
        ├── daily.1/
        ├── ...
        ├── weekly.0/
        ├── weekly.1/
        ├── ...
        ├── monthly.0/
        ├── monthly.1/
        └── ...
    

    The /mnt/Backups/rsnapshot_backups/ directory currently occupies ~2.9 TB, with ~60M files and folders; simply traversing those contents takes time:

    ## As sudo (#), to avoid numerous "Permission denied" warnings:
    
    time find /mnt/Backups/rsnapshot_backups | wc -l
    60314138    ## 60.3M files, folders
    34:07.30    ## 34 min
    
    time du /mnt/Backups/rsnapshot_backups -d 0
    3112240160  /mnt/Backups/rsnapshot_backups    ## 3.1 TB
    33:51.88    ## 34 min
    
    time rsnapshot du    ## << more accurate re: rsnapshot footprint
    2.9T    /mnt/Backups/rsnapshot_backups/hourly.0/
    4.1G    /mnt/Backups/rsnapshot_backups/hourly.1/
    ...
    4.7G    /mnt/Backups/rsnapshot_backups/weekly.3/
    2.9T    total    ## 2.9 TB, per sudo rsnapshot du (more accurate)
    2:34:54          ## 2 hr 35 min
    

    Thus, anytime I need to search for a file on my / (root) partition, I need to deal with (avoid if possible) traversing my backups partition.


    EXAMPLES

    Among the approaches suggested in this thread (How to exclude a directory in find . command), I find that searches using the accepted answer are much faster, with caveats.

    Solution 1

    Let's say I want to find the system file libname-server-2.a, but I do not want to search through my rsnapshot backups. To quickly find a system file, use the exclude path /mnt (i.e., use /mnt, not /mnt/, or /mnt/Backups, or ...):

    ## As sudo (#), to avoid numerous "Permission denied" warnings:
    
    time find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
    /usr/lib/libname-server-2.a
    real    0m8.644s              ## 8.6 sec  <<< NOTE!
    user    0m1.669s
     sys    0m2.466s
    
    ## As regular user (victoria); I also use an alternate timing mechanism, as
    ## here I am using 2>/dev/null to suppress "Permission denied" warnings:
    
    $ START="$(date +"%s")" && find 2>/dev/null / -path /mnt -prune -o \
        -name "*libname-server-2.a*" -print; END="$(date +"%s")"; \
        TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
    /usr/lib/libname-server-2.a
    find command took 3 sec     ## ~3 sec  <<< NOTE!
    

    ... finds that file in just a few seconds, while this takes much longer (appearing to recurse through all of the "excluded" directories):

    ## As sudo (#), to avoid numerous "Permission denied" warnings:
    
    time find / -path /mnt/ -prune -o -name "*libname-server-2.a*" -print
    find: warning: -path /mnt/ will not match anything because it ends with /.
    /usr/lib/libname-server-2.a
    real    33m10.658s            ## 33 min 11 sec (~231-663x slower!)
    user    1m43.142s
     sys    2m22.666s
    
    ## As regular user (victoria); I also use an alternate timing mechanism, as
    ## here I am using 2>/dev/null to suppress "Permission denied" warnings:
    
    $ START="$(date +"%s")" && find 2>/dev/null / -path /mnt/ -prune -o \
        -name "*libname-server-2.a*" -print; END="$(date +"%s")"; \
        TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
    /usr/lib/libname-server-2.a
    find command took 1775 sec    ## 29.6 min
    

    Solution 2

    The other solution offered in this thread (SO#4210042) also performs poorly:

    ## As sudo (#), to avoid numerous "Permission denied" warnings:
    
    time find / -name "*libname-server-2.a*" -not -path "/mnt"
    /usr/lib/libname-server-2.a
    real    33m37.911s            ## 33 min 38 sec (~235x slower)
    user    1m45.134s
     sys    2m31.846s
    
    time find / -name "*libname-server-2.a*" -not -path "/mnt/*"
    /usr/lib/libname-server-2.a
    real    33m11.208s            ## 33 min 11 sec
    user    1m22.185s
     sys    2m29.962s
    

    SUMMARY | CONCLUSIONS

    Use the approach illustrated in "Solution 1"

    find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
    

    i.e.

    ... -path <excluded_path> -prune -o ...
    

    noting that whenever you add the trailing / to the excluded path, the find command then recursively enters (all those) /mnt/* directories -- which in my case, because of the /mnt/Backups/rsnapshot_backups/* subdirectories, additionally includes ~2.9 TB of files to search! By not appending a trailing / the search should complete almost immediately (within seconds).
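
The trailing-slash effect, and the difference from the -not -path variant, can be reproduced in seconds on a small scratch tree. This is only a sketch: the mnt/backups and usr/lib layout and the file name libdemo.a are made up for illustration, and it assumes a find that supports -not (GNU or BSD).

```shell
# Hypothetical sandbox; none of these names come from the original post.
root="$(mktemp -d)"
mkdir -p "$root/mnt/backups" "$root/usr/lib"
touch "$root/mnt/backups/libdemo.a" "$root/usr/lib/libdemo.a"

# 1) No trailing slash: the pattern equals the directory's path, so it is pruned.
pruned="$(find "$root" -path "$root/mnt" -prune -o -name 'libdemo.a' -print)"

# 2) Trailing slash: find never generates a path ending in "/" here, so the
#    pattern matches nothing and nothing is pruned (GNU find warns on stderr).
unpruned="$(find "$root" -path "$root/mnt/" -prune -o -name 'libdemo.a' -print 2>/dev/null)"

# 3) -not -path only filters the results; find still walks into mnt/.
filtered="$(find "$root" -name 'libdemo.a' -not -path "$root/mnt/*")"

printf 'pruned:\n%s\n\nunpruned:\n%s\n\nfiltered:\n%s\n' "$pruned" "$unpruned" "$filtered"
rm -rf "$root"
```

Only the first form skips the directory. The second lists the copy under mnt/ as well, and the third hides it from the output while still visiting every entry below mnt/, which is consistent with the ~33 minute timings above.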

    "Solution 2" (... -not -path <exclude path> ...) likewise appears to recursively search through the excluded directories -- not returning excluded matches, but unnecessarily consuming that search time.


    Searching within those rsnapshot backups:

    To find a file in one of my hourly/daily/weekly/monthly rsnapshot backups:

    $ START="$(date +"%s")" && find 2>/dev/null /mnt/Backups/rsnapshot_backups/daily.0 -name '*04t8ugijrlkj.jpg'; END="$(date +"%s")"; TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
    /mnt/Backups/rsnapshot_backups/daily.0/snapshot_root/mnt/Vancouver/temp/04t8ugijrlkj.jpg
    find command took 312 sec   ## 5.2 minutes: despite the apparent rsnapshot size
                                ## (~4 GB), it is in fact searching through ~2.9 TB
    

    Excluding a nested directory:

    Here, I want to exclude a nested directory, e.g. /mnt/Vancouver/projects/ie/claws/data/* when searching from /mnt/Vancouver/projects/:

    $ time find . -iname '*test_file*'
    ./ie/claws/data/test_file
    ./ie/claws/test_file
    0:01.97
    
    $ time find . -path '*/data' -prune -o -iname '*test_file*' -print
    ./ie/claws/test_file
    0:00.07
    

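    Aside: adding -print at the end of the command suppresses the printout of the excluded directory. With no explicit action, find applies an implicit -print to the whole expression, and -prune evaluates to true for the excluded directory, so it is listed as well: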

    $ find / -path /mnt -prune -o -name "*libname-server-2.a*"
    /mnt
    /usr/lib/libname-server-2.a
    
    $ find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
    /usr/lib/libname-server-2.a
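
The same contrast can be checked without touching the real file system. A minimal sketch, assuming a scratch tree with hypothetical skip/ and keep/ directories:

```shell
# Hypothetical sandbox; skip/, keep/ and a.txt are illustrative names only.
root="$(mktemp -d)"
mkdir -p "$root/skip" "$root/keep"
touch "$root/skip/a.txt" "$root/keep/a.txt"

# No explicit action: find behaves as if the whole expression were wrapped in
# \( ... \) -print, so the pruned directory itself shows up in the output.
implicit="$(find "$root" -path "$root/skip" -prune -o -name 'a.txt' | sort)"

# Explicit -print binds only to the right-hand side of -o.
explicit="$(find "$root" -path "$root/skip" -prune -o -name 'a.txt' -print)"

printf 'implicit -print:\n%s\n\nexplicit -print:\n%s\n' "$implicit" "$explicit"
rm -rf "$root"
```

In both runs the contents of skip/ are never visited; the only difference is whether the skip directory entry itself is printed.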
    
  • 2020-11-22 04:09

    I was using find to provide a list of files for xgettext, and wanted to omit a specific directory and its contents. I tried many permutations of -path combined with -prune but was unable to fully exclude the directory which I wanted gone.

    Although I was able to ignore the contents of the directory which I wanted ignored, find then returned the directory itself as one of the results, which caused xgettext to crash (it accepts only files, not directories).

    My solution was to simply use grep -v to skip the directory that I didn't want in the results:

    find /project/directory -iname '*.php' -or -iname '*.phtml' | grep -iv '/some/directory' | xargs xgettext
    

    Whether or not there is an argument for find that will work 100%, I cannot say for certain. Using grep was a quick and easy solution after some headache.
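
For reference, the same file list can probably be produced by find alone. The following is a sketch under stated assumptions, not a tested replacement for the pipeline above: the directory and file names are hypothetical, -type f is added so that no directory can reach the consumer, and xgettext itself is not invoked.

```shell
# Hypothetical project layout standing in for /project/directory.
root="$(mktemp -d)"
mkdir -p "$root/src" "$root/some/directory"
touch "$root/src/a.php" "$root/src/b.phtml" "$root/some/directory/c.php"

# Prune the unwanted directory, then print regular files only.
# The explicit -print keeps the pruned directory itself out of the list.
files="$(find "$root" -path "$root/some/directory" -prune -o \
    -type f \( -iname '*.php' -o -iname '*.phtml' \) -print | sort)"

printf '%s\n' "$files"
rm -rf "$root"
```

Unlike grep -v, this does not descend into the excluded directory at all, and it cannot accidentally drop files whose path merely contains the excluded string.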

  • 2020-11-22 04:09
    find . \( -path '.**/.git' -o -path '.**/.hg' \) -prune -o -name '*.js' -print
    

    The example above finds all *.js files under the current directory, excluding any .git and .hg folders, no matter how deeply they are nested.

    Note: this also works:

    find . \( -path '.*/.git' -o -path '.*/.hg' \) -prune -o -name '*.js' -print
    

    but I prefer the ** notation for consistency with some other tools which would be off topic here.
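
Both spellings behave the same because, in a -path pattern, * already matches across / characters, so ** adds nothing for find. A small sketch to confirm this, using a hypothetical tree (the a/b and a/src directories and the .js file names are made up):

```shell
# Hypothetical sandbox with .git at the top level and .hg nested two levels down.
root="$(mktemp -d)"
mkdir -p "$root/.git" "$root/a/b/.hg" "$root/a/src"
touch "$root/.git/x.js" "$root/a/b/.hg/y.js" "$root/a/src/app.js"

# Single-star patterns, run from inside the tree so paths start with "./".
single="$(cd "$root" && find . \( -path '.*/.git' -o -path '.*/.hg' \) -prune -o -name '*.js' -print)"

# Double-star patterns: same matches, since "*" is not stopped by "/".
double="$(cd "$root" && find . \( -path '.**/.git' -o -path '.**/.hg' \) -prune -o -name '*.js' -print)"

printf 'single: %s\ndouble: %s\n' "$single" "$double"
rm -rf "$root"
```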

  • 2020-11-22 04:10

    I find the following easier to reason about than other proposed solutions:

    find build -not \( -path build/external -prune \) -name \*.js
    # you can also exclude multiple paths
    find build -not \( -path build/external -prune \) -not \( -path build/blog -prune \) -name \*.js
    

    Important Note: the paths you type after -path must exactly match what find would print without the exclusion. If this sentence confuses you, just make sure to use full paths throughout the whole command, like this: find /full/path/ -not \( -path /full/path/exclude/this -prune \) .... See note [1] if you'd like a better understanding.

    Inside \( and \) is an expression that will match exactly build/external (see important note above), and will, on success, avoid traversing anything below. This is then grouped as a single expression with the escaped parenthesis, and prefixed with -not which will make find skip anything that was matched by that expression.

    One might ask whether adding -not will make all the other files hidden by -prune reappear; the answer is no. The way -prune works is that once a directory matches, find does not descend into it, so the files below that directory are permanently ignored.

    This comes from an actual use case, where I needed to call yui-compressor on some files generated by wintersmith, but leave out other files that need to be sent as-is.


    Note [1]: If you want to exclude /tmp/foo/bar and you run find like this "find /tmp \(..." then you must specify -path /tmp/foo/bar. If on the other hand you run find like this cd /tmp; find . \(... then you must specify -path ./foo/bar.
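
Note [1] can be verified on a scratch tree. This is a sketch with hypothetical names (foo/bar, foo/baz and the .js files are made up); it shows a matching absolute pattern, a matching relative pattern, and a mismatched one that silently excludes nothing.

```shell
# Hypothetical sandbox standing in for /tmp/foo/bar.
root="$(mktemp -d)"
mkdir -p "$root/foo/bar" "$root/foo/baz"
touch "$root/foo/bar/x.js" "$root/foo/baz/y.js"

# Absolute start point: the -path pattern must be absolute too.
abs="$(find "$root" -not \( -path "$root/foo/bar" -prune \) -name '*.js')"

# Start point ".": the -path pattern must begin with "./".
rel="$(cd "$root" && find . -not \( -path ./foo/bar -prune \) -name '*.js')"

# Mismatch: find prints "./foo/bar", never "foo/bar", so nothing is pruned.
miss="$(cd "$root" && find . -not \( -path foo/bar -prune \) -name '*.js' | sort)"

printf 'abs:  %s\nrel:  %s\nmiss:\n%s\n' "$abs" "$rel" "$miss"
rm -rf "$root"
```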

  • 2020-11-22 04:12

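    Use the -prune switch. For example, if you want to exclude the misc directory, just add -path ./misc -prune -false -o to your find command (-false keeps the pruned directory itself out of the results when no explicit -print is given):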

    find . -path ./misc -prune -false -o -name '*.txt'
    

    Here is an example with multiple directories:

    find . -type d \( -path ./dir1 -o -path ./dir2 -o -path ./dir3 \) -prune -false -o -name '*.txt'
    

    Here we exclude ./dir1, ./dir2 and ./dir3 in the current directory. In find expressions, -prune is an action that acts on the criteria -path ./dir1 -o -path ./dir2 -o -path ./dir3 (if dir1 or dir2 or dir3), ANDed with -type d.

    To exclude a directory name at any level, use -name:

    find . -type d \( -name node_modules -o -name dir2 -o -name dir3 \) -prune -false -o -name '*.json'
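
A quick way to see that a -name based prune applies at every depth is to try it on a scratch tree. The sketch below uses hypothetical directory and file names, and ends with -print instead of -false, since -false is not specified by POSIX and may be missing from some find implementations.

```shell
# Hypothetical sandbox: node_modules appears at the top level and one level down.
root="$(mktemp -d)"
mkdir -p "$root/node_modules/pkg" "$root/app/node_modules/dep" "$root/app/src"
touch "$root/node_modules/pkg/package.json" \
      "$root/app/node_modules/dep/package.json" \
      "$root/app/package.json"

# -name compares only the last path component, so it matches at any depth.
found="$(cd "$root" && find . -type d -name node_modules -prune -o -name '*.json' -print)"

printf '%s\n' "$found"
rm -rf "$root"
```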
    
  • 2020-11-22 04:13

    There are plenty of good answers; it just took me some time to understand what each element of the command is for and the logic behind it.

    find . -path ./misc -prune -o -name '*.txt' -print
    

    find will start finding files and directories in the current directory, hence the find . at the start.

    The -o operator stands for a logical OR and separates the two parts of the command:

    [ -path ./misc -prune ] OR [ -name '*.txt' -print ]
    

    Any directory or file that is not the ./misc directory will not pass the first test -path ./misc. But they will be tested against the second expression. If their name corresponds to the pattern *.txt they get printed, because of the -print option.

    When find reaches the ./misc directory, this directory only satisfies the first expression. So the -prune option will be applied to it. It tells the find command to not explore that directory. So any file or directory in ./misc will not even be explored by find, will not be tested against the second part of the expression and will not be printed.
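
The grouping described above can be made explicit with parentheses and -a (AND), which find otherwise implies between adjacent tests and which binds more tightly than -o. A minimal sketch on a scratch tree (the misc/, docs/ and .txt names are hypothetical) showing that the short and the fully spelled-out forms give the same result:

```shell
# Hypothetical sandbox with one file inside the excluded directory.
root="$(mktemp -d)"
mkdir -p "$root/misc" "$root/docs"
touch "$root/misc/hidden.txt" "$root/docs/readme.txt" "$root/top.txt"

# Short form: AND is implied between -path and -prune, and between -name and -print.
short="$(cd "$root" && find . -path ./misc -prune -o -name '*.txt' -print | sort)"

# Spelled-out form: the same expression with explicit grouping and -a.
spelled="$(cd "$root" && find . \( -path ./misc -a -prune \) -o \( -name '*.txt' -a -print \) | sort)"

printf 'short:\n%s\n\nspelled out:\n%s\n' "$short" "$spelled"
rm -rf "$root"
```

Because -o short-circuits, the right-hand group is skipped whenever the left-hand group is true, which is exactly the case for ./misc.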
