There must be a better / shorter way to do this:
# Find files that contain <string-to-find> in current directory
# (including sub directories)
$ find . | xargs grep <string-to-find>
Pretty sure that's the only way. You'll have to iterate through each folder, then through each subfolder, and check each file.
The only other thing I can think of is, in server code, to load the directory and file structure into a LINQ query so that you can run a SQL-like query against it. But then the server is going to end up doing pretty much the same thing.
find . -name \*.html
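or, if you want to find files with names matching a regular expression (note that -regex is matched against the whole path, not just the file name, and that the pattern should be quoted so the shell leaves the backslash alone):
find . -regex 'filename-regex.*\.html'
or, if you want to search for a regular expression in files with names matching a regular expression:
find . -regex 'filename-regex.*\.html' -exec grep -H string-to-find {} \;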
The grep argument -H outputs the name of the file, if that's of interest. If not, you can safely remove it and simply use grep. This instructs find to execute grep string-to-find filename once for each file it finds, which avoids the possibility of the argument list being too long and passes each file name to grep intact, even if it contains spaces (a plain find | xargs pipeline splits such names).
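As a small illustration of the -exec form, here is a sketch that builds a throwaway directory and runs the search twice: once with \; (one grep per file) and once with +, which batches the file names into as few grep calls as possible. The file names and the search string needle are made up for the demo, and grep -l is used instead of -H so that only file names are printed.

```shell
#!/bin/sh
# Throwaway sandbox; every name below is invented for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'a needle here\n' > "$demo/one.html"
printf 'nothing\n'       > "$demo/sub/two.html"
printf 'needle again\n'  > "$demo/sub/three.txt"

# One grep process per matching file (-l prints only the file name).
per_file=$(cd "$demo" && find . -name '*.html' -exec grep -l needle {} \;)

# Batched: find appends as many names as fit to a single grep call.
batched=$(cd "$demo" && find . -name '*.html' -exec grep -l needle {} +)

echo "per file: $per_file"
echo "batched:  $batched"

rm -rf "$demo"
```

Both forms should report the same files; the + variant just starts far fewer processes on a large tree.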
To address your examples:
find . | xargs grep <string-to-find>
could be replaced with
find . -exec grep -H string-to-find {} \;
and
find . | grep html$ | xargs grep <string-to-find>
could be replaced with
find . -name \*.html -exec grep -H string-to-find {} \;
If this is going to be a search you run often, you may want to take a look at ack, which combines find and grep into the functionality you're looking for. It has fewer features than grep, though 99% of my searches are suited perfectly by replacing all instances of grep with ack.
Besides the other answers given, I also suggest this construct:
find . -type f -name "*.html" -print|xargs -I FILENAME grep "<string-to-find>" FILENAME
Even better, if the filenames have spaces in them, you can either quote "FILENAME" or pass a null-terminated (instead of newline-terminated) result from find to xargs, and then have xargs strip those out itself:
find . -type f -name "*.html" -print0|xargs -0 -I FILENAME grep "<string-to-find>" FILENAME
(note the -print0 given to find and the matching -0 given to xargs)
Here, the name FILENAME can actually be anything, but it needs to match in both places (after -I and where grep expects the file name):
find . -type f -name "*.html" -print0|xargs -0 -I FILENAME grep "<string-to-find>" FILENAME
Like this:
find . -type f -name "*.html" -print0|xargs -0 -I GRRRR grep "<string-to-find>" GRRRR
It's essentially doing the same thing as the {} used within the find statement itself to state "the line of text that this returned". Otherwise, xargs just tacks the results of find onto the END of all the rest of the command you give it (which happens to work for grep, where the file name comes last, but doesn't help when the file name has to go somewhere else on the command line).
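To see why the null-terminated form matters, here is a rough sketch with one file name that contains a space. The directory, file names, and the search string needle are invented for the demo; -print0 and -0 are assumed to be available, which they are in the GNU and BSD versions of find and xargs.

```shell
#!/bin/sh
# Throwaway sandbox; every name below is invented for this demo.
demo=$(mktemp -d)
printf 'needle\n' > "$demo/with space.html"
printf 'needle\n' > "$demo/plain.html"

# Newline-separated names: xargs splits "with space.html" into two words,
# so grep is handed names that do not exist. Its complaints and non-zero
# exit status are ignored here on purpose.
naive=$(cd "$demo" && find . -type f -name '*.html' \
          | xargs grep -l needle 2>/dev/null | sort) || true

# NUL-separated names survive intact.
safe=$(cd "$demo" && find . -type f -name '*.html' -print0 \
         | xargs -0 grep -l needle | sort)

# Same result with an explicit placeholder (-I runs one grep per name).
placeholder=$(cd "$demo" && find . -type f -name '*.html' -print0 \
                | xargs -0 -I FILENAME grep -l needle FILENAME | sort)

echo "naive:       $naive"
echo "safe:        $safe"
echo "placeholder: $placeholder"

rm -rf "$demo"
```

The naive pipeline silently misses the file with the space in its name, while both -print0 variants find it.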
Not sure what you mean by better. My first thought was something like this:
grep <string-to-find> $(find . -regex '.*\.html')
But that's worse, because the result of the find would be accumulated in the shell's memory and then passed along as one huge chunk of arguments.
The only improvement I see to your suggestion is
find . -regex '.*\.html' | xargs grep <string-to-find>
That way find performs all the filtering and you still retain piped processing.
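One detail worth checking for yourself is the quoting of the -regex pattern. Left unquoted, the shell strips the backslash, so find sees .*.html, where the dot before html matches any character. A quick sketch (file names and the search string needle are made up; -regex is assumed to be available, as it is in GNU and BSD find):

```shell
#!/bin/sh
# Throwaway sandbox; every name below is invented for this demo.
demo=$(mktemp -d)
printf 'needle\n' > "$demo/page.html"
printf 'needle\n' > "$demo/page_html"   # no dot before "html"
printf 'needle\n' > "$demo/notes.txt"

# Quoted pattern: find receives .*\.html untouched. -regex is matched
# against the whole path (./page.html), hence the leading .*
matches=$(cd "$demo" && find . -regex '.*\.html' | xargs grep -l needle)

# What find would see if the backslash were lost: the dot now matches
# any character, so page_html slips through as well.
loose_count=$(cd "$demo" && find . -regex '.*.html' | wc -l | tr -d ' ')

echo "strict matches: $matches"
echo "loose pattern matched $loose_count paths"

rm -rf "$demo"
```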