Use find command but exclude files in two directories

刺人心 2021-01-29 22:16

I want to find files that end with _peaks.bed, but exclude files in the tmp and scripts folders.

My command is like this:

6 Answers
  • 2021-01-29 22:47

    Use

    find \( -path "./tmp" -o -path "./scripts" \) -prune -o  -name "*_peaks.bed" -print
    

    or

    find \( -path "./tmp" -o -path "./scripts" \) -prune -false -o  -name "*_peaks.bed"
    

    or

    find \( -path "./tmp" -o -path "./scripts" \) ! -prune -o  -name "*_peaks.bed"
    

    The order is important. It evaluates from left to right. Always begin with the path exclusion.

    Explanation

    Do not use -not (or !) to exclude a whole directory. Use -prune instead. As explained in the manual:

    -prune    The primary shall always evaluate as  true;  it
              shall  cause  find  not  to descend the current
              pathname if it is a directory.  If  the  -depth
              primary  is specified, the -prune primary shall
              have no effect.
    

    and in the GNU find manual:

    -path pattern
                  [...]
                  To ignore  a  whole
                  directory  tree,  use  -prune rather than checking
                  every file in the tree.
    

    Indeed, if you use -not -path "./pathname", find will evaluate the expression for each node under "./pathname".

    A find expression is just a chain of conditions, evaluated for each path.

    • \( \) - groups operations (you can use -path "./tmp" -prune -o -path "./scripts" -prune -o instead, but it is more verbose).
    • -path "./scripts" -prune - if -path returns true and the path is a directory, return true for that directory and do not descend into it.
    • -path "./scripts" ! -prune - it evaluates as (-path "./scripts") AND (! -prune). It inverts the "always true" of -prune to always false, which avoids printing "./scripts" as a match.
    • -path "./scripts" -prune -false - since -prune always returns true, you can follow it with -false to get the same effect as !.
    • -o - OR operator. If no operator is specified between two expressions, it defaults to AND operator.
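
    The equivalence of these forms can be checked in a scratch directory. The following is only a sketch: the directory layout and the file names (data, top_peaks.bed and so on) are made up for the demonstration, and the negated form assumes a find that supports the ! operator, as GNU find does.

```shell
# Scratch tree; every file and directory name here is invented for the demo.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir tmp scripts data
touch tmp/a_peaks.bed scripts/b_peaks.bed data/c_peaks.bed top_peaks.bed

# Grouped form: one -prune shared by both -path tests.
grouped=$(find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print | LC_ALL=C sort)

# Verbose form: each -path test carries its own -prune.
verbose=$(find . -path "./tmp" -prune -o -path "./scripts" -prune -o -name "*_peaks.bed" -print | LC_ALL=C sort)

# Negated form: ! -prune still prunes, but evaluates to false,
# so the implicit -print never fires for the pruned directories.
negated=$(find . \( -path "./tmp" -o -path "./scripts" \) ! -prune -o -name "*_peaks.bed" | LC_ALL=C sort)

printf '%s\n' "$grouped"
```

    All three variables should hold the same two paths, ./data/c_peaks.bed and ./top_peaks.bed.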

    Hence, \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print is expanded to:

    [ (-path "./tmp" OR -path "./scripts") AND -prune ] OR ( -name "*_peaks.bed" AND -print )
    

    The -print is important here because, without it, the expression is expanded to:

    { [ (-path "./tmp" OR -path "./scripts" )  AND -prune ]  OR (-name "*_peaks.bed" ) } AND -print
    

    When the expression contains no action, find adds -print itself, which is why you usually do not need to write it in your expression. But here, since -prune returns true, the implicit -print would also print "./scripts" and "./tmp".

    It is not necessary in the others because we switched -prune to always return false.
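
    The effect of the implicit -print is easy to reproduce. This is a small sketch in a throwaway directory; the names keep_peaks.bed and a_peaks.bed are invented for the example.

```shell
# Throwaway tree; file names are invented for the demo.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir tmp scripts
touch tmp/a_peaks.bed keep_peaks.bed

# No action given: find wraps the whole expression in an implicit -print,
# so the pruned directories (where -prune returned true) are printed too.
implicit=$(find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" | LC_ALL=C sort)

# Explicit -print on the right-hand branch only: the pruned directories stay silent.
explicit=$(find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print | LC_ALL=C sort)

printf 'implicit:\n%s\n\nexplicit:\n%s\n' "$implicit" "$explicit"
```

    The first listing should contain ./keep_peaks.bed, ./scripts and ./tmp, while the second contains only ./keep_peaks.bed.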

    Hint: With GNU find, you can use find -D opt expr 2>&1 1>/dev/null to see how the expression is optimized and expanded,
    and find -D search expr 2>&1 1>/dev/null to see which paths are checked.

  • 2021-01-29 22:52

    Here is one way you could do it...

    find . -type f -name "*_peaks.bed" | grep -Ev '^\./(tmp|scripts)/'
    
  • 2021-01-29 22:55

    Here's how you can specify that with find:

    find . -type f -name "*_peaks.bed" ! -path "./tmp/*" ! -path "./scripts/*"
    

    Explanation:

    • find . - Start find from current working directory (recursively by default)
    • -type f - Specify to find that you only want files in the results
    • -name "*_peaks.bed" - Look for files with the name ending in _peaks.bed
    • ! -path "./tmp/*" - Exclude all results whose path starts with ./tmp/
    • ! -path "./scripts/*" - Also exclude all results whose path starts with ./scripts/

    Testing the Solution:

    $ mkdir a b c d e
    $ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
    $ find . -type f ! -path "./a/*" ! -path "./b/*"
    
    ./d/4
    ./c/3
    ./e/a
    ./e/b
    ./e/5
    

    You were pretty close: the -name option only considers the basename, whereas -path considers the entire path =)

  • 2021-01-29 23:04

    For me, this solution didn't work when running a command through -exec with find (I don't really know why), so my solution is:

    find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;
    

    Explanation: same as sampson-chen's answer, with the addition of:

    -prune - ignore the preceding path of ...

    -o - OR: if the preceding tests did not match, run the action on what remains (the matched paths are pruned and everything else is processed)

    18:12 $ mkdir a b c d e
    18:13 $ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
    18:13 $ find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;
    
    gzip: . is a directory -- ignored
    gzip: ./a is a directory -- ignored
    gzip: ./b is a directory -- ignored
    gzip: ./c is a directory -- ignored
    ./c/3:    0.0% -- replaced with ./c/3.gz
    gzip: ./d is a directory -- ignored
    ./d/4:    0.0% -- replaced with ./d/4.gz
    gzip: ./e is a directory -- ignored
    ./e/5:    0.0% -- replaced with ./e/5.gz
    ./e/a:    0.0% -- replaced with ./e/a.gz
    ./e/b:    0.0% -- replaced with ./e/b.gz
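
    The "is a directory -- ignored" lines appear because -type f is only part of the first branch; the final -o -exec branch runs for every path the earlier branches did not match, directories included. A possible variant, sketched here on a smaller made-up tree (a, b, c) and assuming gzip is installed, prunes the directories themselves and puts -type f on the branch that runs -exec:

```shell
# Small throwaway tree mirroring the test above; names are illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir a b c
touch a/1 b/2 c/3

# Prune ./a and ./b themselves, then require -type f on the branch that
# runs -exec, so gzip is never handed a directory.
log=$(find . \( -path "./a" -o -path "./b" \) -prune -o -type f -exec gzip -f -v {} \; 2>&1)

printf '%s\n' "$log"
```

    With this ordering only ./c/3 should be compressed, and a/1 and b/2 are left untouched.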
    
  • 2021-01-29 23:07

    Try something like

    find . \( -type f -name \*_peaks.bed -print \) -or \( -type d -and \( -name tmp -or -name scripts \) -and -prune \)
    

    and don't be too surprised if I got it a bit wrong. If the goal is an exec (instead of print), just substitute it in place.
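
    One difference worth knowing: -name tmp matches a directory called tmp at any depth, whereas -path ./tmp matches only the top-level one. The sketch below illustrates this on an invented tree (data/tmp is a hypothetical nested directory); it spells the operators as -o with implicit AND rather than -or and -and, which is the more portable form.

```shell
# Invented tree with a nested tmp directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p tmp data/tmp
touch tmp/a_peaks.bed data/tmp/b_peaks.bed data/c_peaks.bed

# -name tmp prunes every directory called tmp, wherever it sits.
by_name=$(find . \( -type f -name '*_peaks.bed' -print \) -o \( -type d \( -name tmp -o -name scripts \) -prune \) | LC_ALL=C sort)

# -path ./tmp prunes only the top-level tmp, so data/tmp is still searched.
by_path=$(find . \( -path ./tmp -o -path ./scripts \) -prune -o -type f -name '*_peaks.bed' -print | LC_ALL=C sort)

printf 'by name:\n%s\n\nby path:\n%s\n' "$by_name" "$by_path"
```

    Here the -name variant should report only ./data/c_peaks.bed, while the -path variant also reports ./data/tmp/b_peaks.bed.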

  • 2021-01-29 23:09

    You can try below:

    find ./ ! \( -path ./tmp -prune \) ! \( -path ./scripts -prune \) -type f -name '*_peaks.bed'
    