grep returns
Binary file test.log matches
For example:
echo "line1 re \x00\r\nline2\r\nline3 re\r\n" > test.log  # in zsh
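The echo line above relies on zsh's builtin expanding \x00 and \r. As a rough sketch, the same file can be built portably with printf and octal escapes. The temporary file and the variable names are my own choices, and the exact wording of grep's "binary file" message differs between versions.

```shell
# Build the sample in a temporary file (hypothetical location, not test.log).
sample=$(mktemp)
printf 'line1 re \000\r\nline2\r\nline3 re\r\n' > "$sample"

# Count the NUL bytes; a NUL is what makes grep classify the file as binary.
nul_count=$(tr -cd '\000' < "$sample" | wc -c)
echo "NUL bytes in sample: $nul_count"

# Plain grep reports a binary match instead of printing the matching lines.
grep re "$sample"

rm -f "$sample"
```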
Here's what I used on a system that didn't have the "strings" command installed:
cat yourfilename | tr -cd "[:print:]"
This prints the text and removes unprintable characters in one fell swoop, unlike "cat -v filename", which requires some postprocessing to remove unwanted stuff. Note that some of the binary data may be printable, so you'll still get some gibberish between the good stuff. I think strings removes this gibberish too, if you can use that.
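One thing to watch for: newline is not part of [:print:], so the command above also deletes the line breaks and everything ends up on one line. If you want to grep the result line by line, a variant that keeps newlines might look like this (the sample data and variable names are just for illustration):

```shell
# Sample with a NUL byte and CRLF line endings, in a temporary file.
sample=$(mktemp)
printf 'line1 re \000\r\nline2\r\nline3 re\r\n' > "$sample"

# Delete every byte that is neither printable nor a newline.
cleaned=$(tr -cd '[:print:]\n' < "$sample")

# The cleaned text still has its line structure, so grep works per line.
printf '%s\n' "$cleaned" | grep re

rm -f "$sample"
```

Here the NUL and the carriage returns are dropped, and grep prints the line1 and line3 lines.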
One way is to simply treat binary files as text anyway, with grep --text, but this may well result in binary information being sent to your terminal. That's not really a good idea if you're running a terminal that interprets the output stream (such as VT/DEC or many others).
Alternatively, you can send your file through tr with the following command:
tr '\000-\011\013-\037\177-\377' '.' < test.log | grep whatever
This will change anything less than a space character (except newline), and anything greater than 126, into a . character, leaving only the printables.
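To see what that translation does, here is a small sketch on made-up sample data. The set is written without surrounding square brackets, because POSIX and GNU tr treat [ and ] in a range expression as ordinary characters, which would turn literal brackets in the input into dots as well. Variable names are my own.

```shell
# Sample with a NUL byte and CRLF line endings, in a temporary file.
sample=$(mktemp)
printf 'line1 re \000\r\nline2\r\nline3 re\r\n' > "$sample"

# Map control bytes (except newline, octal 012) and bytes >= 0177 to '.'.
masked=$(tr '\000-\011\013-\037\177-\377' '.' < "$sample")

# Each replaced byte shows up as a dot, so the match positions stay visible.
printf '%s\n' "$masked" | grep re

rm -f "$sample"
```

The NUL and each carriage return become a dot, so the matching lines come out as "line1 re .." and "line3 re.".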
If you want every "illegal" character replaced by a different one, you can use something like the following C program, a classic standard input filter:
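#include <stdio.h>

int main (void) {
    int ch;
    /* Copy stdin to stdout, escaping everything except newline and printable ASCII. */
    while ((ch = getchar()) != EOF) {
        if ((ch == '\n') || ((ch >= ' ') && (ch <= '~'))) {
            putchar (ch);
        } else {
            printf ("{{%02x}}", ch);  /* hex code of the non-printable byte */
        }
    }
    return 0;
}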
This will give you {{NN}}, where NN is the hex code for the character. You can simply adjust the printf for whatever style of output you want.
You can see that program in action here:
pax$ printf 'Hello,\tBob\nGoodbye, Bob\n' | ./filterProg
Hello,{{09}}Bob
Goodbye, Bob
As James Selvakumar already said, grep -a does the trick. -a or --text forces grep to handle the input stream as text.
See the man page: http://unixhelp.ed.ac.uk/CGI/man-cgi?grep
Try:
cat test.log | grep -a somestring
grep -a
It can't get simpler than that.
You can also try the Word Extractor tool. Word Extractor can be used with any file on your computer to separate the strings that contain human text / words from binary code (exe applications, DLLs).
Starting with grep 2.21, binary files are treated differently:
When searching binary data, grep now may treat non-text bytes as line terminators. This can boost performance significantly.
So what happens now is that with binary data, all non-text bytes (including newlines) are treated as line terminators. If you want to change this behavior, you can:
use --text. This will ensure that only newlines are line terminators
use --null-data. This will ensure that only null bytes are line terminators
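A quick way to compare the two options, assuming GNU grep (where --text is also -a and --null-data is also -z). The sample data and variable names are invented for the demonstration.

```shell
# --text: only newlines end a line, so the NUL and the carriage returns
# stay inside the lines. -c prints a count instead of the raw bytes.
text_count=$(printf 'line1 re \000\r\nline2\r\nline3 re\r\n' | grep --text -c re)
echo "lines matching with --text: $text_count"

# --null-data: only NUL bytes end a record. The output records are also
# NUL-terminated, so turn the NULs into newlines to read them on a terminal.
null_hits=$(printf 'alpha re\000beta\000gamma re\000' | grep --null-data re | tr '\000' '\n')
printf '%s\n' "$null_hits"
```

With --text, two lines match. With --null-data, the records "alpha re" and "gamma re" are selected.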