How do I grep through binary files that look like text?

Solution 1

You can still use grep to search the file; with the -a option it does not care whether the input is really text. From 'man grep':

    -a, --text
          Process a binary file as if it were text; this is equivalent to the --binary-files=text option.

   --binary-files=TYPE
          If  the  first few bytes of a file indicate that the file contains binary data, assume that the file is
          of type TYPE.  By default, TYPE is binary, and grep normally outputs either a one-line  message  saying
          that a binary file matches, or no message if there is no match.  If TYPE is without-match, grep assumes
          that a binary file does not match; this is equivalent  to  the  -I  option.   If  TYPE  is  text,  grep
          processes  a  binary  file  as  if  it  were  text; this is equivalent to the -a option.  Warning: grep
          --binary-files=text might output binary garbage, which can have nasty side effects if the output  is  a
          terminal and if the terminal driver interprets some of it as commands.

Please note the warning at the end of the second paragraph. You may want to redirect grep's output to a new file and examine that file with vi or less.
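
As a minimal sketch of that advice, the commands below build a small sample file and search it with -a, sending the matches to a file instead of the terminal. The file names sample.bin and matches.txt are hypothetical, chosen only for this illustration.

```shell
# Build a sample "binary" log: text lines with NUL bytes mixed in.
# (sample.bin and matches.txt are hypothetical names for this sketch.)
tmpdir=$(mktemp -d)
printf 'INFO start\000\nERROR disk full\000\nINFO done\n' > "$tmpdir/sample.bin"

# -a makes grep treat the file as text. Redirecting to a file keeps any
# stray control bytes away from the terminal.
grep -a 'ERROR' "$tmpdir/sample.bin" > "$tmpdir/matches.txt"

# View the result safely: cat -v shows non-printing bytes in caret
# notation (a NUL byte appears as ^@).
cat -v "$tmpdir/matches.txt"
```

The saved file can then be opened with less or vi as usual.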

Solution 2

Pipe the file through strings, which strips out the non-printable bytes and leaves only the runs of printable text (by default, runs of at least four characters).
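
A minimal sketch of this approach, assuming the strings utility is installed (it usually ships with binutils). The file name sample.bin is hypothetical.

```shell
# Build a sample file: two text fragments surrounded by control bytes.
# (sample.bin is a hypothetical name for this sketch.)
tmpdir=$(mktemp -d)
printf '\001\002ERROR disk full\000\003\004INFO done\000' > "$tmpdir/sample.bin"

# strings prints each run of printable characters on its own line,
# so grep only ever sees plain text.
strings "$tmpdir/sample.bin" | grep 'ERROR'
```

Note that the line structure of the original file is lost, because strings splits the output wherever a non-printable byte occurs.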

Solution 3

Give bgrep a try. (original release / more recent fork)

Solution 4

You can use any of these three commands, where <sth> is the search pattern:

  1. grep -a <sth> file.txt

  2. cat -v file.txt | grep <sth>

  3. cat file.txt | tr '\000-\011\013-\037\177-\377' '.' | grep <sth>
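
The sketch below runs the three variants on a small sample file so the differences in output are visible. It makes two assumptions that are not in the original answer: the tr set is written without enclosing square brackets, because GNU tr treats them as literal characters and would turn '[' and ']' into dots, and LC_ALL=C is set so that tr works on raw bytes.

```shell
# Sample file: one line contains a control byte (\001) and square brackets.
tmpdir=$(mktemp -d)
printf 'INFO ok\nERROR\001 disk [full]\n' > "$tmpdir/file.txt"

# 1. grep reads the raw bytes; the trailing cat -v only makes the
#    output safe to display.
grep -a 'ERROR' "$tmpdir/file.txt" | cat -v

# 2. cat -v rewrites control bytes in caret notation (\001 becomes ^A)
#    before grep sees them.
cat -v "$tmpdir/file.txt" | grep 'ERROR'

# 3. tr replaces every non-printable byte except newline (\012) with a dot.
LC_ALL=C tr '\000-\011\013-\037\177-\377' '.' < "$tmpdir/file.txt" | grep 'ERROR'
```

The first two commands print ERROR^A disk [full], and the third prints ERROR. disk [full]. With the second and third variants the pattern is matched against the rewritten text, not the original bytes.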

Solution 5

Starting with grep 2.21, binary files are treated differently:

When searching binary data, grep now may treat non-text bytes as line terminators. This can boost performance significantly.

In other words, with binary data grep may now treat all non-text bytes, as well as newlines, as line terminators. To change this behavior, you can:

  • use --text. This will ensure that only newlines are line terminators

  • use --null-data. This will ensure that only null bytes are line terminators
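
A minimal sketch of the two options, using a hypothetical file records.bin that holds NUL-terminated records and no newlines:

```shell
# Sample file: three records, each terminated by a NUL byte.
# (records.bin is a hypothetical name for this sketch.)
tmpdir=$(mktemp -d)
printf 'alpha one\000beta two\000alpha three\000' > "$tmpdir/records.bin"

# --text: only newlines end a line. The file has none, so the whole
# file is one line and is printed in full (cat -v shows NUL as ^@).
grep --text 'alpha' "$tmpdir/records.bin" | cat -v

# --null-data: only NUL bytes end a line, so each record is matched
# separately. Output records are NUL-terminated too, so tr converts
# them to newlines for display.
grep --null-data 'alpha' "$tmpdir/records.bin" | tr '\0' '\n'
```

The first command prints the single line alpha one^@beta two^@alpha three^@, while the second prints only the two matching records, alpha one and alpha three.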

Author by

Robyn Smith

Computer Science student at UoG with a minor in psych. Interests in technology, writing, online privacy, philosophy, law, music, movies, books, tv and more :)

Updated on September 18, 2022

Comments

  • Robyn Smith
    Robyn Smith over 1 year

    I have binary files that should be text (they're exported logs), but I can't read them with less (the output looks ugly, like a binary file). I found that I can open them with vi and I can cat them (which shows the actual logs), but what I'd really like to do is grep through them without having to open each one in vi and then search. Is there a way for me to do that?

  • user55570
    user55570 almost 9 years
    The tr command does not seem to work on my Solaris 10 box. Simple test: echo -e 'x\ty' | tr '[\000-\011\013-\037\177-\377]' '.' does not translate the tab.
  • Léo Léopold Hertz 준영
    Léo Léopold Hertz 준영 almost 9 years
    I think this is the best answer here. It is so annoying to see bad implementations of binary search, like those at commandlinefu.com/commands/matching/grep-binary/…, where the \x escaping does not really work, as in grep -P "\x05\x00\xc0" mybinaryfile.
  • Léo Léopold Hertz 준영
    Léo Léopold Hertz 준영 almost 9 years
    I ran bgrep "fafafafa" test_27.6.2015.bin | less but got test_27.6.2015.bin: 00005ee4. I expected to get fafafafa, since that is what I was searching for. There is no man page. Any idea why the output looks like this?
  • Léo Léopold Hertz 준영
    Léo Léopold Hertz 준영 almost 9 years
    I opened a new thread about the functioning of bgrep here stackoverflow.com/q/31135561/54964
  • Admin
    Admin about 7 years
    Unfortunately, bash: bgrep: command not found... and No package bgrep available.
  • Javier
    Javier over 6 years
    strings apparently does not recognize UTF-8 as text.