Removing ANSI color codes from text stream

Solution 1

The characters ^[[37m and ^[[0m are part of the ANSI escape sequences (CSI codes).  See also these specifications.

Using GNU sed

sed -e 's/\x1b\[[0-9;]*m//g'
  • \x1b (or \x1B) is the escape special character
    (GNU sed does not support alternatives \e and \033)
  • \[ is the second character of the escape sequence
  • [0-9;]* is the color value(s) regex
  • m is the last character of the escape sequence
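The pieces of the regex above can be tried on a small sample. This is a minimal sketch that assumes GNU sed (for \x1b); the function name strip_colors is only illustrative and is not part of the original answer.

```shell
# Strip color sequences: ESC, "[", any digits/semicolons, then "m".
# \x1b inside the sed script is a GNU sed feature.
strip_colors() {
  sed -e 's/\x1b\[[0-9;]*m//g'
}

# printf's \033 is the octal form of ESC; this emits a bold red "ABC"
# followed by a reset code.
printf '\033[1;31mABC\033[0m\n' | strip_colors   # prints ABC
```

The sample uses a two-part code (1;31) to show why the semicolon belongs in the bracket expression.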

Using the macOS default sed

Mike suggests:

sed -e $'s/\x1b\[[0-9;]*m//g'

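The macOS default sed does not interpret escapes such as \x1b or \e itself, as pointed out by slm and steamer25 in the comments. The $'...' quoting makes the shell convert \x1b into the ESC character before sed sees it.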
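If the shell does not support $'...' quoting, one possible alternative is to build the ESC byte with printf and splice it into the sed script. This is a sketch, not something taken from the answer; the names esc and strip_colors_portable are made up for illustration.

```shell
# Build a literal ESC byte with printf (\033 is octal for 0x1b).
esc=$(printf '\033')

strip_colors_portable() {
  # Double quotes let the shell substitute ${esc}; the \[ is passed
  # through unchanged, so sed still sees an escaped bracket.
  sed -e "s/${esc}\[[0-9;]*m//g"
}

printf '\033[37mABC\033[0m\n' | strip_colors_portable   # prints ABC
```

Because sed receives the raw ESC byte, it does not need to understand \x1b, \e or \033 at all.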

To install GNU sed (gsed) instead:

brew install gnu-sed

Example with OP's command line

(OP means Original Poster)

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' | 
      sed 's/\x1b\[[0-9;]*m//g'

Improvements

Flag -e is optional for GNU sed but required for the macOS default sed:

sed 's/\x1b\[[0-9;]*m//g'           # Remove color sequences only

Tom Hale suggests also removing all other escape sequences by using [a-zA-Z] instead of just the letter m, which is specific to the graphics mode escape sequence (color):

sed 's/\x1b\[[0-9;]*[a-zA-Z]//g'    # Remove all escape sequences

But [a-zA-Z] may be too wide and could remove too much. Michał Faleński and Miguel Mota propose to remove only some escape sequences using [mGKH] and [mGKF] respectively.

sed 's/\x1b\[[0-9;]*[mGKH]//g'      # Remove color and move sequences
sed 's/\x1b\[[0-9;]*[mGKF]//g'      # Remove color and move sequences
sed 's/\x1b\[[0-9;]*[mGKHF]//g'     # Remove all

Last character
of the sequence   Purpose
---------------   -----------------------------------------
m                 Graphics Rendition Mode (including color)
G                 Horizontal cursor move
K                 Horizontal deletion
H                 New cursor position
F                 Move cursor to previous n lines

Britton Kerin indicates that matching K (in addition to m) removes the colors from gcc errors and warnings. gcc writes these to stderr, so do not forget to redirect: gcc 2>&1 | sed....
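The effect of the final character can be seen on a small sample. This sketch assumes GNU sed and uses a made-up byte sequence that only resembles what tools such as grep or gcc emit (color code, K sequence, text, reset, K sequence); cat -v makes any leftover ESC visible as ^[.

```shell
# Hypothetical sample: color, erase-to-end-of-line (K), text, reset, K.
sample() { printf '\033[01;31m\033[KABC\033[m\033[K\n'; }

# Matching only the final "m" leaves the K sequences behind.
sample | sed 's/\x1b\[[0-9;]*m//g' | cat -v      # prints ^[[KABC^[[K

# Adding K to the set of final characters removes them too.
sample | sed 's/\x1b\[[0-9;]*[mK]//g' | cat -v   # prints ABC
```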

Using perl

The version of sed installed on some operating systems may be limited (e.g. macOS). perl has the advantage of being generally easier to install or update on more operating systems. Adam Katz suggests using \e (same as \x1b), which perl/PCRE supports.

Choose your regex depending on how many kinds of escape sequences you want to filter:

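perl -pe 's/\e\[[0-9;]*m//g'           # Remove colors only
perl -pe 's/\e\[[0-9;]*[mG]//g'        # Remove colors and horizontal cursor moves
perl -pe 's/\e\[[0-9;]*[mGKH]//g'      # Remove colors, cursor moves and horizontal deletions
perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g'    # Remove all escape sequences
perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g' # Adam Katz's trick: colors plus the \e[K that grep appends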

Example with OP's command line:

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' \
      | perl -pe 's/\e\[[0-9;]*m//g'

Usage

As pointed out by Stuart Cardall's comment, this sed command line is used by the project Ultimate Nginx Bad Bot (1000 stars) to clean up the email report ;-)

Solution 2

I have found a better escape sequence remover if you're using macOS. Check this:

perl -pe 's/\x1b\[[0-9;]*[mG]//g'

Solution 3

ansi2txt

https://unix.stackexchange.com/a/527259/116915

cat typescript | ansi2txt | col -b
  • ansi2txt: remove ANSI color codes
  • col -b: remove ^H or ^M


Update: how col handles tabs and spaces (mentioned by @DanielF)

col -bx replaces tabs with spaces; col -bh replaces spaces with tabs.

It seems col can't keep spaces and tabs as they are, which is a pity.


0. Original string

$ echo -e '        ff\tww' | hd
00000000  20 20 20 20 20 20 20 20  66 66 09 77 77 0a        |        ff.ww.|

1. -h (also the behavior without it, as the first command shows) replaces spaces with tabs

$ echo -e '        ff\tww' | col -b | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bxh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|

2. -x replaces tabs with spaces

$ echo -e '        ff\tww' | col -bx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|
$ echo -e '        ff\tww' | col -bhx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|

3. It seems col can't keep spaces and tabs as they are.

Solution 4

If you prefer something simple, you could use my strip-ansi-cli package (Node.js required):

$ npm install --global strip-ansi-cli

Then use it like this:

$ strip-ansi < colors.o

Or just pass in a string:

$ strip-ansi '^[[37mABC^[[0m'

Solution 5

What is displayed as ^[ is not ^ followed by [; it is the ASCII ESC character, produced by Esc or Ctrl+[ (the ^ notation means the Ctrl key).

ESC is 0x1B hexadecimal or 033 octal, so you have to use \x1B or \033 in your regexes:

perl -pe 's/\033\[37m//g; s/\033\[0m//g'

perl -pe 's/\033\[\d*(;\d*)*m//g'

Author by

user001

Updated on September 18, 2022

Comments

  • user001
    user001 over 1 year

    Examining the output from

    perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";'
    

    in a text editor (e.g., vi) shows the following:

    ^[[37mABC
    ^[[0m
    

    How would one remove the ANSI color codes from the output file? I suppose the best way would be to pipe the output through a stream editor of sorts.

    The following does not work

    perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' | perl -pe 's/\^\[\[37m//g' | perl -pe 's/\^\[\[0m//g'
    
    • terdon
      terdon almost 11 years
      Not an answer to the question, but you can also pipe the output to more or less -R which can interpret the escape codes as color instead of a text editor.
  • Redsandro
    Redsandro about 11 years
    Thanks for the sed command and the explanation. :)
  • Redsandro
    Redsandro about 10 years
    Some color codes (e.g. Linux terminal) contain a prefix, e.g. 1;31m so better add ; to your regex: cat colored.log | sed -r 's/\x1b\[[0-9;]*m//g' or they won't be stripped.
  • Scott - Слава Україні
    Scott - Слава Україні about 8 years
    (1) What do you mean by The "answered" question?  Do you mean the accepted answer?  (2) This command does not work (it does not even execute) because it has an unmatched (unbalanced) quote.  (3) This is a useless use of cat (UUOC); it should be possible to do perl -pe command colors.o.  (4) Who ever said anything about the codes being in a .o file?
  • Scott - Слава Україні
    Scott - Слава Україні about 8 years
    This is a useless use of cat (UUOC); it should be possible to do perl -pe command putty1.log.
  • Scott - Слава Україні
    Scott - Слава Україні about 8 years
    This is a useless use of cat (UUOC); it should be possible to do strip-ansi colors.o or at least strip-ansi < colors.o.
  • Sindre Sorhus
    Sindre Sorhus about 8 years
    @Scott Sure, you can also do strip-ansi < colors.o, but from experience people are more familiar with piping. I've updated the answer.
  • Blaisorblade
    Blaisorblade over 7 years
    What's the improvement from the accepted answer (superuser.com/a/380778/46794)?
  • BVengerov
    BVengerov about 7 years
    @Blaisorblade It works on OS X, whereas sed -r does NOT.
  • Stuart Cardall
    Stuart Cardall almost 7 years
    this is great used it in github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/blob/… to clean up the email report.
  • CMCDragonkai
    CMCDragonkai over 5 years
    In your perl example you have a command that filters out the colours. But what are the other commands? They additionally filter out the mG and mGKH and then a-zA-Z, can you add a comment next to each one?
  • Adam Katz
    Adam Katz over 5 years
    Relevant to when you're not just removing the codes but rather observing them: grep appends \x1b[K (erase to end of line) to all color codes, so I prefer the perl/PCRE regex \e\[[0-9;]*m(?:\e\[K)? (in perl/PCRE but not sed, \e is the same as \x1b)
  • slm
    slm about 5 years
    Keep in mind that the OSX version of sed didn't work w/ the example shown, the gsed version however does.
  • steamer25
    steamer25 almost 5 years
    More context for slm's comment about OSX sed: it doesn't support control characters like \x1b. E.g., stackoverflow.com/a/14881851/93345 . You can get the gsed command via brew install gnu-sed .
  • Penghe Geng
    Penghe Geng almost 5 years
    good simple solution
  • TxAG98
    TxAG98 almost 5 years
    Thanks for this... this worked for me to get rid of that tput sgr0 that the other solutions never seem to be able to get rid of.
  • oHo
    oHo over 4 years
    Thank you @AdamKatz for your comment. 👍 I have just edited the answer. Is it OK for you? Have fun
  • Adam Katz
    Adam Katz over 4 years
    Sure. See also my answer below for more detail and instructions to remove every escape sequence (and, optionally, some other non-printing sequences) rather than just colors. Another note: I'm not sure I've seen a version of sed that accepts \x1b but not \033
  • wchargin
    wchargin over 4 years
    You should disclose that you are the author of this package, in accordance with Super User policy.
  • Sindre Sorhus
    Sindre Sorhus over 4 years
    @wchargin Done.
  • Mike
    Mike about 4 years
    On mac sed -e $'s/\x1b\[[0-9;]*m//g' works without gsed @slm @steamer25
  • Alejandro Teixeira Muñoz
    Alejandro Teixeira Muñoz about 4 years
    (OP means Original Poster) <--- + 1 !!!! hahahah I'm a >2K and I'm still wondering wth it was. I always thought it was "original petition" Thanks @olibre !!!
  • André Werlang
    André Werlang about 4 years
    If you prefer something simple ... goes on to propose installing an entire platform, a tool which brings dozens of unverified dependencies... 'simple' really mean different things these days... c'mom
  • Kevin
    Kevin almost 3 years
    Very nice. This is actually the only one in this thread that successfully parsed a raw terminal log generated from sudossh2 without leaving any residual/partial sequences that seem to be common in PS1 bash prompts, etc.
  • Kevin
    Kevin almost 3 years
    Just to clarify, the only thing actually changed here from the referenced submission is replacing the shorthand \e, which python's re module doesn't seem to know about, with the long form \x1b.
  • Aubin
    Aubin almost 3 years
    sudo apt install colorized-logs
  • Daniel F
    Daniel F over 2 years
    col -bx if you need to prevent spaces getting replaced by tabs.
  • yurenchen
    yurenchen over 2 years
    @DanielF -x replaces '\t' with ' ', -h replaces ' ' with '\t'. (It seems col can't keep spaces/tabs as they are, which is a pity.)
  • Admin
    Admin almost 2 years
    brew install ansifilter for the lazy on macOS