"grep" offset of ASCII string from binary file
Solution 1
You could use strings for this:
strings -a -t x filename | grep foobar
Tested with GNU binutils.
For example, to find where --help occurs in /bin/ls:
strings -a -t x /bin/ls | grep -- --help
Output:
14938 Try `%s --help' for more information.
162f0 --help display this help and exit
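Note that strings reports the offset at which each printable run begins, not the offset of the match inside it: in the output above, 14938 is where "Try" starts, not where "--help" starts. The Python sketch below is one possible way to turn such lines into match offsets. The function name match_offsets is hypothetical, and the code assumes the line layout of an offset, one space, then the string, with one byte per character.

```python
import re

# Assumed line layout from `strings -t x`: optional padding, the offset,
# exactly one space, then the printable run itself.
_LINE = re.compile(r"^\s*([0-9a-fA-F]+) (.*)$")


def match_offsets(lines, needle, radix=16):
    """Yield (byte_offset, text) for every occurrence of needle in strings output."""
    for line in lines:
        m = _LINE.match(line.rstrip("\n"))
        if m is None:
            continue  # not an "offset string" line
        start = int(m.group(1), radix)  # offset of the printable run
        text = m.group(2)
        pos = text.find(needle)
        while pos != -1:
            # One byte per character is assumed, so the character index
            # inside the run can be added directly to the run's offset.
            yield start + pos, text
            pos = text.find(needle, pos + 1)
```

The lines argument can be any iterable of text lines, for instance sys.stdin with the output of strings -a -t x piped in. Pass radix=10 when strings was run with -t d.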
Solution 2
grep --byte-offset --only-matching --text foobar filename
The --byte-offset option prints the byte offset of each matching line.
The --only-matching option makes it print the offset of each match instead of the offset of each matching line.
The --text option makes grep treat the binary file as a text file.
You can shorten it to:
grep -oba foobar filename
It works in the GNU version of grep, which comes with Linux by default. It won't work in BSD grep (which comes with Mac by default).
Solution 3
I wanted to do the same task. Though strings | grep worked, I found gsar was the very tool I needed.
The output looks like:
>gsar.exe -bic -sfoobar filename.bin
filename.bin: 0x34b5: AAA foobar BBB
filename.bin: 0x56a0: foobar DDD
filename.bin: 2 matches found
mgilson
Updated on December 21, 2020
Comments
-
mgilson over 3 years
I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ASCII string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up, and I'm trying to debug this problem by inspecting how long each record actually is.
This seems extremely related, but I don't understand Perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep, which I've compiled, but it wants me to feed it a hex string, and I'd rather just have a tool where I can give it the ASCII string and it will find it in the binary data, then print the string and the byte offset where it was found.
In other words, I'm looking for some tool which acts like this:
tool foobar filename
or
tool foobar < filename
and its output is something like this:
foobar:10
foobar:410
foobar:810
foobar:1210
...
i.e. the string which matched and the byte offset in the file where the match started. In this example case, I can infer that each record is 400 bytes long.
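The inference in that last sentence is just the difference between consecutive offsets. A minimal Python sketch, using the offsets from the example output (the helper name record_lengths is hypothetical):

```python
from collections import Counter


def record_lengths(offsets):
    """Return the gap between each pair of consecutive match offsets."""
    offsets = sorted(offsets)
    return [b - a for a, b in zip(offsets, offsets[1:])]


# Offsets taken from the example output in the question.
gaps = record_lengths([10, 410, 810, 1210])
print(Counter(gaps).most_common())  # prints [(400, 3)]
```

Tallying the gaps this way makes a record of unexpected length stand out as a rare value in the count.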
Other constraints:
- ability to search by regex is cool, but I don't need it for this problem
- My binary files are big (3.5 GB), so I'd like to avoid reading the whole file into memory if possible.
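For illustration only, here is a rough Python sketch of a tool with that behaviour. It is not one of the tools named on this page; the function name find_offsets and the default chunk size are assumptions. It reads the file in fixed-size chunks and carries the last len(needle) - 1 bytes over to the next chunk, so matches that straddle a chunk boundary are still found while memory use stays bounded. It only handles a literal byte string, not a regex.

```python
def find_offsets(path, needle, chunk_size=1 << 20):
    """Yield the byte offset of every occurrence of needle (bytes) in the file."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1  # bytes to carry over between chunks
    with open(path, "rb") as f:
        tail = b""
        base = 0  # file offset of buf[0]
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            pos = buf.find(needle)
            while pos != -1:
                yield base + pos
                pos = buf.find(needle, pos + 1)
            # The tail is shorter than the needle, so it can never hold a
            # complete match and no offset is reported twice.
            tail = buf[max(0, len(buf) - overlap):]
            base += len(buf) - len(tail)
```

To get output in the foobar:10 style, one could loop over find_offsets("filename", b"foobar") and print each offset after the search string.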
-
mgilson over 11 years
I tried this; all it says is: Binary file filename matches. My system is Ubuntu Linux, and grep --version gives: "GNU grep 2.5.2"
-
Hari Menon over 11 years
Try adding the -a option to treat binary files as text.
-
mgilson over 11 years
I ended up using strings -a -t d filename | grep foobar to write the output in decimal instead of hex. Otherwise, great answer that seems like it will work with different flavors of grep.
-
Ivan X over 8 years
It could work in OS X grep if you prefix the grep with LC_CTYPE=C; however, recent (and maybe not so recent) OS X has grep 2.5.1, and that has a bug in it which always outputs 0 as the byte offset.
-
Hitechcomputergeek almost 8 years
I'd suggest using grep -F if you just need to find a known string, as it has a lot less overhead.
-
Luc over 5 years
grep -oba (see Hari Menon's answer) is much faster, but using strings allows you to do partial matching. Which answer is better depends on your use-case!