Suppose I have a file input.txt with a few columns and a few rows, where the first column is the key, and a directory dir with files that contain some of these keys.
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator. For example:
arturcz@szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz@szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
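One way to prepare that keyword list is to join the awk output with paste. This is only a sketch, not the exact command meant above: the sample files are made up for illustration, and it uses grep -E so that a plain | acts as alternation instead of the \| form shown earlier.

```shell
# Hypothetical sample data in a scratch directory
work=$(mktemp -d)
mkdir "$work/dir"
printf 'foo 1 x\nbaz 2 y\n' > "$work/input.txt"
printf 'foo\nbar\nfubar\nfoobaz\n' > "$work/dir/words.txt"

# Take the first column and join the keys with '|'  ->  foo|baz
pattern=$(awk '{print $1}' "$work/input.txt" | paste -sd'|' -)

# -r: recurse into the directory, -h: omit file names, -E: extended regex
found=$(grep -rhE "$pattern" "$work/dir")
printf '%s\n' "$found"

rm -rf "$work"
```

Keys are treated as regular expressions here, so a key containing characters such as . or * would need escaping first.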
First thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.
Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep -r % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. Since dir is a directory, grep needs -r to search the files inside it. The xargs in FreeBSD and OS X also offers a -J option for placing the input.
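To see what -I does without searching anything, you can put echo in front of the command. A small sketch, with made-up keys fed in by printf:

```shell
# Each input line replaces % in the command template; echo just prints the result
expanded=$(printf 'foo\nbaz\n' | xargs -I % echo grep -rn % dir)
printf '%s\n' "$expanded"
```

This prints one grep command line per key, which is what xargs would run if echo were removed.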
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
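Here is the same loop as a sketch on made-up sample data. Two things are my own additions rather than part of the one-liner above: read -r, so backslashes in the input are kept literally, and grep -F with --, so each key is matched as a fixed string even if it starts with a dash.

```shell
# Hypothetical sample data in a scratch directory
work=$(mktemp -d)
mkdir "$work/dir"
printf 'foo 1 x\nbaz 2 y\n' > "$work/input.txt"
printf 'foo\nbar\nfoobaz\n' > "$work/dir/words.txt"

matches=$(
  cd "$work" &&
  # key gets the first column, junk swallows the rest of the line
  while read -r key junk; do
    grep -rnF -- "$key" dir
  done < input.txt
)
printf '%s\n' "$matches"

rm -rf "$work"
```

Note that grep runs once per key, so a line that contains several keys (foobaz here) is reported once for each of them.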
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
If you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes. For example, this works:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
Note the space after grep and the space before the file name.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
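A sketch of that command run against made-up sample files, so you can see what it prints. The file names a.txt and b.txt and their contents are assumptions for illustration only.

```shell
# Hypothetical sample data in a scratch directory
work=$(mktemp -d)
mkdir "$work/dir"
printf 'foo 1\nbaz 2\n' > "$work/input.txt"
printf 'foo\nbar\nfoobaz\n' > "$work/dir/a.txt"
printf 'nothing here\nbazaar\n' > "$work/dir/b.txt"

report=$(
  cd "$work" &&
  # First file (NR==FNR): remember column 1 as a key.
  # Other files: print the line once, on the first key that matches it.
  awk 'NR==FNR { keys[$1]; next }
       { for (key in keys) if ($0 ~ key) { print FILENAME, $0; next } }' input.txt dir/*
)
printf '%s\n' "$report"

rm -rf "$work"
```

Because of the next after print, a line that contains several keys is still printed only once. The keys are used as regular expressions, as with plain grep.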
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to:
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
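If your shell has no process substitution, the temporary-file form can be made a little safer with mktemp. This is a sketch on made-up sample data. The -F flag is my addition and assumes the keys are plain strings rather than regular expressions.

```shell
# Hypothetical sample data in a scratch directory
work=$(mktemp -d)
mkdir "$work/dir"
printf 'foo 1\nbaz 2\n' > "$work/input.txt"
printf 'foo\nbar\nfoobaz\n' > "$work/dir/a.txt"
printf 'nothing here\nbazaar\n' > "$work/dir/b.txt"

hits=$(
  cd "$work" &&
  awk '{print $1}' input.txt > keys.txt &&
  # -f reads one pattern per line; -F treats each one as a fixed string
  grep -F -f keys.txt dir/*
)
printf '%s\n' "$hits"

rm -rf "$work"
```

Watch out for blank lines in input.txt: an empty pattern in the keys file matches every line.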