I\'m trying to run a find
command for all JavaScript files, but how do I exclude a specific directory?
Here is the find
code we\'re using.<
There is clearly some confusion here as to what the preferred syntax for skipping a directory should be.
GNU Opinion
To ignore a directory and the files under it, use -prune
From the GNU find man page
Reasoning
-prune
stops find
from descending into a directory. Just specifying -not -path
will still descend into the skipped directory, but -not -path
will be false whenever find
tests each file.
Issues with -prune
-prune
does what it's intended to, but are still some things you have to take care of when using it.
find
prints the pruned directory.
-prune
only works with -print
and no other actions.
-prune
works with any action except -delete
. Why doesn't it work with delete? For -delete
to work, find needs to traverse the directory in DFS order, since -delete
will first delete the leaves, then the parents of the leaves, etc... But for specifying -prune
to make sense, find
needs to hit a directory and stop descending it, which clearly makes no sense with -depth
or -delete
on.Performance
I set up a simple test of the three top upvoted answers on this question (replaced -print
with -exec bash -c 'echo $0' {} \;
to show another action example). Results are below
----------------------------------------------
# of files/dirs in level one directories
.performance_test/prune_me 702702
.performance_test/other 2
----------------------------------------------
> find ".performance_test" -path ".performance_test/prune_me" -prune -o -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
[# of files] 3 [Runtime(ns)] 23513814
> find ".performance_test" -not \( -path ".performance_test/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
[# of files] 3 [Runtime(ns)] 10670141
> find ".performance_test" -not -path ".performance_test/prune_me*" -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
[# of files] 3 [Runtime(ns)] 864843145
Conclusion
Both f10bit's syntax and Daniel C. Sobral's syntax took 10-25ms to run on average. GetFree's syntax, which doesn't use -prune
, took 865ms. So, yes this is a rather extreme example, but if you care about run time and are doing anything remotely intensive you should use -prune
.
Note Daniel C. Sobral's syntax performed the better of the two -prune
syntaxes; but, I strongly suspect this is the result of some caching as switching the order in which the two ran resulted in the opposite result, while the non-prune version was always slowest.
Test Script
#!/bin/bash
dir='.performance_test'
setup() {
mkdir "$dir" || exit 1
mkdir -p "$dir/prune_me/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/w/x/y/z" \
"$dir/other"
find "$dir/prune_me" -depth -type d -exec mkdir '{}'/{A..Z} \;
find "$dir/prune_me" -type d -exec touch '{}'/{1..1000} \;
touch "$dir/other/foo"
}
cleanup() {
rm -rf "$dir"
}
stats() {
for file in "$dir"/*; do
if [[ -d "$file" ]]; then
count=$(find "$file" | wc -l)
printf "%-30s %-10s\n" "$file" "$count"
fi
done
}
name1() {
find "$dir" -path "$dir/prune_me" -prune -o -exec bash -c 'echo "$0"' {} \;
}
name2() {
find "$dir" -not \( -path "$dir/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
}
name3() {
find "$dir" -not -path "$dir/prune_me*" -exec bash -c 'echo "$0"' {} \;
}
printf "Setting up test files...\n\n"
setup
echo "----------------------------------------------"
echo "# of files/dirs in level one directories"
stats | sort -k 2 -n -r
echo "----------------------------------------------"
printf "\nRunning performance test...\n\n"
echo \> find \""$dir"\" -path \""$dir/prune_me"\" -prune -o -exec bash -c \'echo \"\$0\"\' {} \\\;
name1
s=$(date +%s%N)
name1_num=$(name1 | wc -l)
e=$(date +%s%N)
name1_perf=$((e-s))
printf " [# of files] $name1_num [Runtime(ns)] $name1_perf\n\n"
echo \> find \""$dir"\" -not \\\( -path \""$dir/prune_me"\" -prune \\\) -exec bash -c \'echo \"\$0\"\' {} \\\;
name2
s=$(date +%s%N)
name2_num=$(name2 | wc -l)
e=$(date +%s%N)
name2_perf=$((e-s))
printf " [# of files] $name2_num [Runtime(ns)] $name2_perf\n\n"
echo \> find \""$dir"\" -not -path \""$dir/prune_me*"\" -exec bash -c \'echo \"\$0\"\' {} \\\;
name3
s=$(date +%s%N)
name3_num=$(name3 | wc -l)
e=$(date +%s%N)
name3_perf=$((e-s))
printf " [# of files] $name3_num [Runtime(ns)] $name3_perf\n\n"
echo "Cleaning up test files..."
cleanup