I have a bunch of log files. I need to find out how many times a string occurs in all files.
grep -c string *
returns
...
f
Obligatory AWK solution:
grep -c string * | awk 'BEGIN{FS=":"}{x+=$2}END{print x}'
Take care if your file names include ":" though.
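To make the colon caveat concrete, here is a minimal sketch. The file names and the temporary directory are made up for illustration. It assumes the count is always the last colon-separated field of the `grep -c` output, so summing `$NF` instead of `$2` avoids the problem:

```shell
# Hypothetical sample data: one file name contains a colon.
dir=$(mktemp -d)
printf 'string\nother\nstring\n' > "$dir/app.log"
printf 'string\n' > "$dir/db:main.log"

# Naive sum: for "db:main.log:1" field 2 is "main.log", which awk treats as 0,
# so that file's count is silently lost.
naive=$(cd "$dir" && grep -c string -- * | awk -F: '{ x += $2 } END { print x + 0 }')

# Summing the last field instead picks up the count regardless of the name.
robust=$(cd "$dir" && grep -c string -- * | awk -F: '{ x += $NF } END { print x + 0 }')

echo "naive=$naive robust=$robust"   # prints: naive=2 robust=3
rm -r "$dir"
```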
Short recursive variant:
find . -type f -exec cat {} + | grep -c 'string'
Another one-liner using basic command-line tools that handles multiple occurrences per line:
cat * | sed 's/string/\nstring /g' | grep string | wc -l
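For what it's worth, if your grep supports `-o` (GNU and BSD grep do, but it is not in POSIX), the same total can probably be had without sed. A minimal sketch, using made-up sample files in a temporary directory:

```shell
# Hypothetical sample data with several matches on one line.
dir=$(mktemp -d)
printf 'string string\nno match here\nstring\n' > "$dir/a.log"
printf 'xstringy and string\n' > "$dir/b.log"

# -o prints every match on its own line, so wc -l counts occurrences.
# tr strips the padding some wc implementations add.
occurrences=$(cat "$dir"/*.log | grep -o string | wc -l | tr -d ' ')

# For comparison: -c counts matching lines, not occurrences.
lines=$(cat "$dir"/*.log | grep -c string)

echo "occurrences=$occurrences lines=$lines"   # prints: occurrences=5 lines=3
rm -r "$dir"
```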
If you want the number of occurrences per file (example for the string "tcp"):
grep -RIci "tcp" . | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr
Example output:
53 ./HTTPClient/src/HTTPClient.cpp
21 ./WiFi/src/WiFiSTA.cpp
19 ./WiFi/src/ETH.cpp
13 ./WiFi/src/WiFiAP.cpp
4 ./WiFi/src/WiFiClient.cpp
4 ./HTTPClient/src/HTTPClient.h
3 ./WiFi/src/WiFiGeneric.cpp
2 ./WiFi/examples/WiFiClientBasic/WiFiClientBasic.ino
2 ./WiFiClientSecure/src/ssl_client.cpp
1 ./WiFi/src/WiFiServer.cpp
Explanation:
grep -RIci NEEDLE .
- looks for the string NEEDLE recursively from the current directory (following symlinks), ignoring binaries, counting matching lines per file, ignoring case
awk ...
- ignores files with zero matches and formats the lines as count, tab, file name
sort -hr
- sorts lines in descending order by the number in the first column
Of course, it works with other grep commands with option -c (count) as well. For example:
grep -c "tcp" *.txt | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr
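Note that `-c` counts matching lines, so a line with several hits counts once. If you need true per-file occurrence counts, one possible sketch is to loop over the files and count `grep -o` output for each. This assumes a grep with `-o`. The sample files are hypothetical:

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf 'tcp TCP tcp\nudp\n' > "$dir/a.txt"
printf 'tcp\n' > "$dir/b.txt"
printf 'udp only\n' > "$dir/c.txt"

report=$(
  for f in "$dir"/*.txt; do
    # One output line per match, case-insensitive; wc -l gives the total.
    n=$(grep -oi tcp "$f" | wc -l | tr -d ' ')
    # Skip files without matches; print count, tab, bare file name.
    if [ "$n" -gt 0 ]; then
      printf '%s\t%s\n' "$n" "${f##*/}"
    fi
  done | sort -rn
)

printf '%s\n' "$report"
rm -r "$dir"
```

Looping avoids parsing `file:count` output, so colons in file names are not an issue here.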
cat * | grep -c string
One of the rare useful applications of cat.
The AWK solution which also handles file names including colons:
grep -c string * | sed -r 's/^.*://' | awk 'BEGIN{}{x+=$1}END{print x}'
Keep in mind that this method still does not find multiple occurrences of string on the same line.
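If multiple occurrences per line matter, one alternative is to let awk do the counting itself. This is a sketch under the assumption that the search string is fine to treat as a regular expression. The sample files are hypothetical:

```shell
# Hypothetical sample data, including a file name with a colon.
dir=$(mktemp -d)
printf 'string and string again\n' > "$dir/one:two.log"
printf 'nothing\nstring\n' > "$dir/plain.log"

# gsub() returns how many substitutions it made on the current line;
# replacing each match with itself ("&") leaves the text unchanged.
total=$(awk '{ n += gsub(/string/, "&") } END { print n + 0 }' "$dir"/*)

echo "total=$total"   # prints: total=3
rm -r "$dir"
```

Since awk reads the files directly, there is no `file:count` output to parse and colons in file names do not matter.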