How to check if the file is a binary file and read all the files which are not?


Solution 1

Use the file utility. Sample usage:

 $ file /bin/bash
 /bin/bash: Mach-O universal binary with 2 architectures
 /bin/bash (for architecture x86_64):   Mach-O 64-bit executable x86_64
 /bin/bash (for architecture i386): Mach-O executable i386

 $ file /etc/passwd
 /etc/passwd: ASCII English text

 $ file code.c
 code.c: ASCII c program text

See the file manual page for details.

Solution 2

Adapted from an answer on excluding binary files:

find . -type f -exec file {} \; | grep text | cut -d: -f1
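One weakness of filtering on the word "text" is that it also matches file names that happen to contain "text". A possible variant, sketched here under the assumption that your file implementation supports the -b and --mime-encoding options (the function name list_text_files is made up for illustration), asks file for the encoding only and skips anything reported as binary:

```shell
# Hypothetical helper: list regular files under a directory that
# file(1) does not classify as binary.
list_text_files() {
    find "${1:-.}" -type f -exec sh -c '
        for path do
            # -b omits the file name; --mime-encoding prints e.g. us-ascii, utf-8 or binary
            enc=$(file -b --mime-encoding -- "$path")
            if [ "$enc" != "binary" ]; then
                printf "%s\n" "$path"
            fi
        done
    ' sh {} +
}
```

Calling list_text_files . prints one path per line, so the result can be piped into xargs cat or a while read loop, as long as the file names contain no newlines.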

Solution 3

I use

! grep -qI . "$path"

This exits with status 0 when the file is binary. The only drawback I can see is that it also treats an empty file as binary, but then again, who decides whether that is wrong?
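To cover the second half of the question (reading every file that is not binary), the same grep -I test can be combined with find. This is only a sketch: the function name cat_text_files is invented for illustration, and it assumes a grep that supports -I, as GNU and BSD grep do.

```shell
# Hypothetical helper: print the contents of every non-binary
# regular file under the given directory (default: current directory).
cat_text_files() {
    find "${1:-.}" -type f -exec sh -c '
        for path do
            # With -I, grep treats binary files as non-matching,
            # so a match means the file looks like text.
            if grep -qI . -- "$path"; then
                cat -- "$path"
            fi
        done
    ' sh {} +
}
```

As with the one-liner, empty files are skipped because there is nothing for the pattern to match.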

Solution 4

BSD grep

Here is a simple solution to check for a single file using BSD grep (on macOS/Unix):

grep -q "\x00" file && echo Binary || echo Text

This checks whether the file contains a NUL character.

Using this method, you can read all non-binary files recursively with the find utility:

find . -type f -exec sh -c 'grep -q "\x00" "$1" || cat "$1"' sh {} ";"

Or even simpler using just grep:

grep -rv "\x00" .

For just current folder, use:

grep -v "\x00" *

Unfortunately, the above examples won't work with GNU grep, but there is a workaround.

GNU grep

Since GNU grep ignores NUL characters, you can check for non-ASCII characters instead:

grep -qP "[^\x00-\x7F]" file && echo Binary || echo Text

Note: This won't work for files containing only NUL characters, and it also reports text files with non-ASCII characters (for example UTF-8 text) as binary.
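Because the NUL check behaves differently between grep implementations, it can also be done without grep at all. The following is a sketch using tr and cmp rather than the grep approach above; the function name has_nul is made up for illustration. It strips NUL bytes from the file and compares the result with the original, so any difference means a NUL was present.

```shell
# Hypothetical helper: succeed (exit status 0) if the file contains a NUL byte.
has_nul() {
    # tr deletes every NUL byte; cmp -s silently compares that stream
    # with the original file. They differ only if a NUL was removed.
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# Usage, mirroring the one-liners above:
# has_nul file && echo Binary || echo Text
```

Unlike grep -I, this check treats an empty file as text, since there is no NUL byte to remove.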

Solution 5

perl -E 'exit((-B $ARGV[0])?0:1);' file-to-test

This can be used to check whether "file-to-test" is binary. The command exits with code 0 for binary files; otherwise the exit code is 1.

The reverse check, for a text file, looks like this:

perl -E 'exit((-T $ARGV[0])?0:1);' file-to-test

Likewise, the above command exits with status 0 if "file-to-test" is text (not binary).

Read more about the -B and -T checks with the command perldoc -f -X.
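The -T check can be wrapped in a small shell function and used to read only the text files in a directory. This is a sketch rather than a definitive implementation; the function names is_text and cat_text_in_dir are invented for illustration, and -f is added so that directories are skipped.

```shell
# Hypothetical helper: succeed if the argument is a regular file
# that Perl's -T heuristic considers text.
is_text() {
    perl -e 'exit((-f $ARGV[0] && -T $ARGV[0]) ? 0 : 1)' "$1"
}

# Hypothetical helper: print every text file directly inside a directory
# (not recursive; default is the current directory).
cat_text_in_dir() {
    for path in "${1:-.}"/*; do
        if is_text "$path"; then
            cat "$path"
        fi
    done
}
```

Keep in mind that -T and -B are heuristics based on the beginning of the file, so unusual files may be classified differently than expected.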

Author by

Refael

Updated on July 08, 2022

Comments

  • Refael, almost 2 years ago

    How can I know if a file is a binary file?

    For example, a compiled C file.

    I want to read all files from some directory, but I want to ignore binary files.