How to extract text portion of a binary file in linux/bash?
Solution 1
Use the strings utility - that's exactly what it's designed for.
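A minimal sketch of typical usage, assuming a `strings` implementation (GNU binutils or BSD) is installed. The sample file and its contents are made up for the demo:

```shell
# Build a throwaway sample file that mixes binary bytes with readable text.
sample=$(mktemp)
printf '\000\001\002hello world\000\377\376ab\000config=/etc/app.conf\000' > "$sample"

# Default behaviour: print every run of at least 4 printable characters,
# one run per line. The 2-character run "ab" is too short to be printed.
strings "$sample"

# -n sets the minimum run length. With 12, the 11-character "hello world"
# is dropped and only the longer run is kept.
long_runs=$(strings -n 12 "$sample")
echo "$long_runs"
```

Raising the `-n` threshold is usually the easiest way to cut down on short accidental matches; redirect the output to a file (for example `strings file1.bin > newfile.txt`) to save it.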
Solution 2
Here's what I used on a system that didn't have the strings utility installed:
cat yourfilename | tr -cd "[:print:]"
This prints the text and removes unprintable characters in one pass, unlike cat -v filename, which shows unprintable characters in ^ and M- notation and so needs postprocessing to remove them. Note that some of the binary data may happen to be printable, so you'll still get some gibberish between the good stuff. If you can use strings, it filters out most of this gibberish, because by default it only prints runs of at least four printable characters.
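A variant of the same idea, sketched under two assumptions: that you want to keep line breaks (newline is not in `[:print:]`, so the plain command joins everything into one long line), and that your `tr` may reject non-text bytes in a UTF-8 locale with "Illegal byte sequence", as some BSD/macOS versions do. The sample file is made up for the demo:

```shell
# Build a throwaway sample file with text lines and stray binary bytes.
sample=$(mktemp)
printf 'line one\n\000\377\376\001line two\n\200\201' > "$sample"

# LC_ALL=C makes tr treat the input as raw bytes instead of decoding it
# in the current locale. Adding \n to the set preserves line breaks;
# -c complements the set and -d deletes everything in the complement.
text=$(LC_ALL=C tr -cd '[:print:]\n' < "$sample")
echo "$text"
```

Reading the file with `<` instead of `cat` avoids one extra process but is otherwise equivalent.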
Solution 3
If you're on a Debian-based distro, you can probably get radare2 (r2) with just sudo apt install radare2.
After you've installed r2, either with apt, some other installer on some other distro, or by following an online guide, you can use rabin2 to extract just the text part of a binary:
$ rabin2 -z your_binary
This is often "better" than just strings because it outputs only the strings found in the .data section of the binary. Stuff outside that section isn't always very useful.
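Since `rabin2 -z` is aimed at executables, one way to combine it with `strings` is a small wrapper that picks a tool per file. This is only a sketch: `extract_text` is a hypothetical helper name, and it recognises ELF files only (by their magic bytes), which is a simplifying assumption; other executable formats fall through to `strings`.

```shell
# extract_text FILE  (hypothetical helper, not part of radare2 or binutils)
# Uses rabin2 -z for ELF executables when rabin2 is installed,
# and falls back to strings for everything else.
extract_text() {
  file=$1
  # ELF files start with the bytes 0x7f 'E' 'L' 'F'; od -c prints the
  # first byte as octal 177, so the squeezed output is "177ELF".
  magic=$(od -An -c -N 4 "$file" | tr -d ' \n')
  if [ "$magic" = '177ELF' ] && command -v rabin2 >/dev/null 2>&1; then
    rabin2 -z "$file"
  else
    strings "$file"
  fi
}
```

For example, `extract_text ~/Pictures/Pic_A.jpg` would take the `strings` path, because a JPEG is not an ELF file.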
RonPringadi
Updated on June 29, 2022
Comments
-
RonPringadi almost 2 years
I have a binary file. If I open it with vi, it shows sequences of human-readable text and binary characters. What is the best way to extract the human-readable portion only using bash?
I was thinking, maybe we can do this over a grep or sed pattern?
$ cat file1.bin | grep '????' > newfile.txt
-
RonPringadi almost 8 years
I tried that before and it didn't work. But then I realized I missed the s: strings, not string. My bad :-) Thank you!
-
RonPringadi about 5 years
strings ~/Pictures/Pic_A.jpg has result (or better).
$ cat ~/Pictures/Pic_A.jpg | tr -cd "[:print:]"
Result: tr: Illegal byte sequence
-
Cliff over 3 years
This solution only works for executable files, as the tool is focused on reverse engineering. Not every binary file is an executable, and one that isn't has no .data section.
-
ChocolateOverflow over 3 years
Interesting strings like passwords and paths hard-coded into binaries, as far as I know, are usually in the .data section, so using rabin2 -z goes straight to those without printing the gibberish we get when using strings. I do use both though.
-
Cliff over 3 years
My comment was made to make it clear to readers the use case in which your tool works. Your use case is specific to executables, as you keep mentioning the .data section. The use case that landed me on this question was a nonexecutable binary file having no .data section for rabin2 to operate on. strings is useful on more than just executables. :-)
-
Zimba over 3 years
So how to use strings?
-
DevSolar almost 2 years
@Zimba man strings