How to check if the file is a binary file and read all the files which are not?

走了就别回头了 2020-12-05 16:58

How can I know if a file is a binary file?

For example, a compiled C file.

I want to read all files in a directory, but I want to ignore the binary files.

13 answers
  • 2020-12-05 17:25

    Perhaps this would suffice:

    if ! file /path/to/file | grep -iq ASCII ; then
        echo "Binary"
    fi
    
    if file /path/to/file | grep -iq ASCII ; then
        echo "Text file"
    fi
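
To cover the second half of the question (reading every file that is not binary), the same idea can be wrapped in a loop. This is only a minimal sketch: it assumes a `file` implementation that supports `-b` and `--mime-encoding` and prints `binary` for non-text data, and the function name `cat_text_files` is made up for illustration.

```shell
# cat_text_files DIR: print the contents of every regular file in DIR
# (non-recursive) that file(1) does not classify as binary.
cat_text_files() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue               # skip directories, sockets, ...
        enc=$(file -b --mime-encoding "$f")   # e.g. us-ascii, utf-8, binary
        case $enc in
            binary) ;;                        # ignore binary files
            *) cat "$f" ;;
        esac
    done
}
```

Unlike the `grep -iq ASCII` test, this lets UTF-8 text through, since `file` reports it as `utf-8` rather than `binary`. Call it as `cat_text_files /path/to/dir`.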
    
  • 2020-12-05 17:26

    Excluding binary files with tr -d "[[:print:]\n\t]" < file | wc -c is brute force, but it is not heuristic guesswork either: it counts every byte that is not printable ASCII, a newline, or a tab.

    find . -maxdepth 1 -type f -exec /bin/sh -c '
       for file in "$@"; do
          if [ $(LC_ALL=C LANG=C tr -d "[[:print:]\n\t]" < "$file" | wc -c) -gt 0 ]; then
             echo "${file} is not an ASCII text file (UNIX)"
          else
             echo "${file} is an ASCII text file (UNIX)"
          fi
       done
    ' _ '{}' +
    

    The following brute-force approach using grep -a -m 1 $'[^[:print:]\t]' file seems quite a bit faster, though, presumably because -m 1 lets grep stop at the first offending line.

    find . -maxdepth 1 -type f -exec /bin/sh -c '
       tab="$(printf "\t")"
       for file in "$@"; do
          if LC_ALL=C LANG=C grep -a -m 1 "[^[:print:]${tab}]" "$file" 1>/dev/null 2>&1; then
             echo "${file} is not an ASCII text file (UNIX)"
          else
             echo "${file} is an ASCII text file (UNIX)"
          fi
       done
    ' _ '{}' +
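
To see what this kind of test actually measures, it helps to run the `tr` pipeline on known input. A small sketch: the helper name `count_nonprintable` is hypothetical, and it uses the POSIX class form `[:print:]` in the C locale, so only printable ASCII, newline, and tab count as text.

```shell
# count_nonprintable: read stdin and print how many bytes are neither
# printable ASCII nor newline/tab (0 means "looks like ASCII text").
count_nonprintable() {
    LC_ALL=C tr -d '[:print:]\n\t' | wc -c
}

printf 'plain text\twith a tab\n' | count_nonprintable   # 0
printf 'abc\001\000def\n'        | count_nonprintable   # 2 (one SOH, one NUL)
printf 'caf\303\251\n'           | count_nonprintable   # 2 (UTF-8 "é" is two non-ASCII bytes)
```

The last line shows the trade-off of this strict definition: UTF-8 text with non-ASCII characters is reported as not being ASCII text. Some `wc` implementations pad the number with leading spaces.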
    
  • 2020-12-05 17:27

    Adapted from an answer about excluding binary files:

    find . -type f -exec file {} \; | grep text | cut -d: -f1
    
  • 2020-12-05 17:33

    BSD grep

    Here is a simple solution to check for a single file using BSD grep (on macOS/Unix):

    grep -q "\x00" file && echo Binary || echo Text
    

    which checks whether the file contains a NUL character.

    Using this method, you can read all non-binary files recursively with find:

    find . -type f -exec sh -c 'grep -q "\x00" "$1" || cat "$1"' _ {} ";"
    

    Or even simpler using just grep:

    grep -rv "\x00" .
    

    For just the current folder, use:

    grep -v "\x00" *
    

    Unfortunately, the above examples won't work with GNU grep. However, there is a workaround.

    GNU grep

    Since GNU grep ignores NUL characters here, you can check for non-ASCII bytes instead:

    $ grep -qP "[^\x00-\x7F]" file && echo Binary || echo Text
    

    Note: this won't work for files that contain only NUL characters, and it reports any non-ASCII text (for example UTF-8) as binary.
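
Because the two grep implementations disagree on how "\x00" is handled, a NUL check that avoids grep altogether may be easier to port. This is a minimal sketch, assuming a `tr` that passes NUL bytes through and accepts the octal escape `\000`; the names `has_nul` and `cat_files_without_nul` are hypothetical.

```shell
# has_nul FILE: succeed if FILE contains at least one NUL byte.
has_nul() {
    # Delete everything except NUL bytes, then count what is left.
    n=$(LC_ALL=C tr -cd '\000' < "$1" | wc -c)
    [ "$n" -gt 0 ]
}

# cat_files_without_nul DIR: recursively print every regular file
# under DIR that contains no NUL byte.
# Limitation: file names containing newlines are not handled.
cat_files_without_nul() {
    find "${1:-.}" -type f | while IFS= read -r f; do
        has_nul "$f" || cat "$f"
    done
}
```

For example, `cat_files_without_nul .` reads every file below the current directory that has no NUL byte. The check reads each file in full, which is simple but not the fastest option for large files.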

  • 2020-12-05 17:33

    cat+grep

    Assuming that "binary" means the file contains NUL characters, this shell command can help:

    (cat -v file.bin | grep -q "\^@") && echo Binary || echo Text
    

    or:

    grep -q "\^@" <(cat -v file.bin) && echo Binary
    

    This is a workaround for grep -q "\x00", which works with BSD grep but not with the GNU version.

    The -v option makes cat display non-printing characters in caret notation, for example:

    $ printf "\x00\x00" | hexdump -C
    00000000  00 00                                             |..|
    $ printf "\x00\x00" | cat -v
    ^@^@
    $ printf "\x00\x00" | cat -v | hexdump -C
    00000000  5e 40 5e 40                                       |^@^@|
    

    where each ^@ represents a NUL character. So once these sequences are found, we assume the file is binary.


    The disadvantage of this method is that it can produce false positives when the file contains a literal ^@ (a caret followed by an at sign) rather than a NUL character. For example:

    $ printf "\x00\x00^@^@" | cat -v | hexdump -C
    00000000  5e 40 5e 40 5e 40 5e 40                           |^@^@^@^@|
    

    See also: How do I grep for all non-ASCII characters.
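
A second source of false positives is worth knowing about: `cat -v` writes bytes with the high bit set in `M-` notation, so the byte 0x80 comes out as `M-^@`, which also contains `^@`. The sketch below shows this with the UTF-8 encoding of "À" (bytes 0xC3 0x80). It assumes GNU coreutils `cat` run in the C locale, and the helper name `looks_binary_catv` is made up here; it simply wraps the check above.

```shell
# looks_binary_catv: read stdin, succeed if cat -v output contains "^@".
looks_binary_catv() {
    LC_ALL=C cat -v | grep -q '\^@'
}

printf '\303\200\n' | LC_ALL=C cat -v        # prints: M-CM-^@
printf '\303\200\n' | looks_binary_catv && echo "flagged as binary (false positive)"
printf 'plain\n'    | looks_binary_catv || echo "not flagged"
```

So ordinary UTF-8 text can be misreported as binary by this method even though it contains no NUL byte.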

  • 2020-12-05 17:35
    perl -E 'exit((-B $ARGV[0])?0:1);' file-to-test
    

    This can be used to check whether "file-to-test" is binary. The command exits with code 0 for binary files and with code 1 otherwise.

    The reverse check, for a text file, looks like this:

    perl -E 'exit((-T $ARGV[0])?0:1);' file-to-test
    

    Likewise, the above command exits with status 0 if "file-to-test" is text (not binary).

    Read more about the -B and -T checks with perldoc -f -X.
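
Combined with a loop, the `-T` test can answer the original question directly. This is a rough sketch, assuming `perl` is on the PATH; the function name `cat_perl_text_files` is hypothetical, and the classification is whatever Perl's `-T` heuristic decides.

```shell
# cat_perl_text_files DIR: print every regular file in DIR (non-recursive)
# that Perl's -T file test considers text.
cat_perl_text_files() {
    for f in "${1:-.}"/*; do
        [ -f "$f" ] || continue
        # perl exits with status 0 when -T is true, 1 otherwise
        if perl -e 'exit((-T $ARGV[0]) ? 0 : 1)' "$f"; then
            cat "$f"
        fi
    done
}
```

Note that, according to perldoc -f -X, an empty file satisfies both -T and -B, so empty files pass this check (and print nothing).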
