Remove ^M character from log files

Solution 1

Converting a standalone file

If you run the following command:

$ dos2unix <file>

dos2unix rewrites <file> in place, stripping the ^M from every CR+LF line ending (a ^M that is not directly followed by a linefeed is left alone). If you want to leave <file> intact, run dos2unix like this:

$ dos2unix -n <file> <newfile>

Parsing output from a command

If you need to strip them as part of a chain of commands via a pipe, you can use any of several tools, such as tr, sed, awk, or perl.

tr

$ java -jar test.jar | tr -d '^M' >> test.log

sed

$ java -jar test.jar | sed 's/^M//g' >> test.log

awk

$ java -jar test.jar | awk '{ gsub(/^M/, "") } 1' >> test.log

perl

$ java -jar test.jar | perl -p -e 's/^M//g' >> test.log

Typing ^M

When entering the ^M, be sure to enter it in one of the following ways:

  1. As Control + v followed by Control + m (a literal carriage return), not as the two characters ^ (Shift + 6) and M.
  2. As a backslash r, i.e. (\r).
  3. As an octal number (\015).
  4. As a hexadecimal number (\x0D).
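
As a minimal sketch of the escape notations above, the following uses a made-up two-line sample (the variable names are illustrative only) and shows that tr treats \r and \015 as the same character. Support for the hexadecimal form varies by tool, so it is not used here.

```shell
# Sample text with CR+LF line endings; printf expands \r and \n here.
sample=$(printf 'one\r\ntwo\r\n')

# tr accepts the backslash escape and the octal form interchangeably.
with_escape=$(printf '%s\n' "$sample" | tr -d '\r')
with_octal=$(printf '%s\n' "$sample" | tr -d '\015')

# Both results are the same text with every carriage return removed.
printf '%s\n' "$with_escape"
```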

Why is this necessary?

The ^M is part of how lines are terminated on Windows: each line ends with a carriage return character followed by a linefeed character.

On Unix systems, a line is terminated by just a linefeed character.

  • linefeed character = 0x0A in hex, also written as \n.
  • carriage return character = 0x0D in hex, also written as \r.
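
A quick way to check whether a file contains any carriage returns at all is to search for the 0x0D byte. The helper below is a hypothetical sketch (has_cr is not a standard command) that relies only on printf and grep:

```shell
# Hypothetical helper: succeeds if the named file contains at least one CR byte.
has_cr() {
    cr=$(printf '\r')      # a literal carriage return (0x0D)
    grep -q "$cr" "$1"
}

# Try it on a throwaway file with CR+LF line endings.
tmp=$(mktemp)
printf 'hi there\r\nbye there\r\n' > "$tmp"
if has_cr "$tmp"; then
    echo "carriage returns found"
fi
```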

Examples

You can see these if you pipe the output to a tool such as od or hexdump. Here's a sample file whose lines end in carriage return + linefeed.

$ cat sample.txt
hi there
bye there

You can see them with hexdump as \r + \n:

$ hexdump -c sample.txt 
0000000   h   i       t   h   e   r   e  \r  \n   b   y   e       t   h
0000010   e   r   e  \r  \n                                            
0000015

Or as their hexadecimal values, 0d + 0a:

$ hexdump -C sample.txt 
00000000  68 69 20 74 68 65 72 65  0d 0a 62 79 65 20 74 68  |hi there..bye th|
00000010  65 72 65 0d 0a                                    |ere..|
00000015

Running this through sed 's/\r//g':

$ sed 's/\r//g' sample.txt |hexdump -C
00000000  68 69 20 74 68 65 72 65  0a 62 79 65 20 74 68 65  |hi there.bye the|
00000010  72 65 0a                                          |re.|
00000013

You can see that sed has removed the 0d character.

Viewing files with ^M without converting?

Yes, you can use vim to do this. You can either set vim's fileformat option, which converts the file when it is written, just as we did above, or you can change the fileformat used for the view only.

Changing a file's format

:set fileformat=dos
:set fileformat=unix

You can use the shorthand notation too:

:set ff=dos
:set ff=unix

Alternatively you can just change the fileformat of the view. This approach is nondestructive:

:e ++ff=dos
:e ++ff=unix

Here you can see me opening our ^M file, sample.txt, in vim:

           [screenshot: vim dos #1]

Now I'm converting the fileformat in the view:

           [screenshot: vim dos #2]

Here's what it looks like when converted to the unix fileformat:

           [screenshot: vim dos #3]

Solution 2

Shove the file through dos2unix to fix the line endings.

Or, use one of these:

sed 's,\r$,,'
tr -d '\r'
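
The two commands are not equivalent. As a rough sketch, using made-up sample text that resembles the log in the question (a CR in the middle of a line plus ordinary CR+LF endings), the sed form only strips a CR that sits directly before the newline, while tr strips every CR. The \r escape in sed is not supported everywhere, so this sketch passes a literal CR obtained from printf instead.

```shell
cr=$(printf '\r')   # a literal carriage return

# Hypothetical sample: one mid-line CR, and CR+LF at the end of each line.
sample() { printf 'Starting script ...\rInitializing\r\ndone\r\n'; }

# Strips only a CR that immediately precedes the end of the line.
eol_only=$(sample | sed "s/${cr}\$//")

# Strips every CR, wherever it occurs.
all_cr=$(sample | tr -d '\r')

printf '%s\n' "$all_cr"
```

With the mid-line CR removed, the text before and after it ends up joined on one line, which is the behaviour described for the question's log.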

Solution 3

You need to fix your program to call isatty() and if stdout is not a tty, then do not output the ^M.
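
In a shell script, the counterpart of isatty() on stdout is the test [ -t 1 ]. The sketch below is a hypothetical progress printer (the function name and messages are made up) that uses CR-based in-place updates only when stdout is a terminal, and plain lines otherwise. It illustrates the idea only and is not how the program in the question works.

```shell
# Hypothetical progress printer: rewrite the current line with a CR only
# when stdout is a terminal; otherwise emit ordinary newline-terminated lines.
progress() {
    if [ -t 1 ]; then
        printf '\r%s' "$1"
    else
        printf '%s\n' "$1"
    fi
}

progress "Starting script ..."
progress "Initializing"
```

When the output is redirected to a file or a pipe, no ^M is written, so the log needs no post-processing.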

Author: Ram

Updated on September 18, 2022

Comments

  • Ram
    Ram almost 2 years

    Remove ^M character from log files.

    In my script I redirect output of my program to a log file. The output of my log file contains some ^M (newline) characters. I need to remove them while running itself.

    My command:

    $ java -jar test.jar >> test.log 
    

    test.log has:

    Starting script ... ^M Starting script ...Initializing

    • ash
      ash almost 11 years
      And ^M is carriage return (CR) which, on most terminals, returns to the first column of the current line - so like a soft backspace over the length of the line.
  • Ram
    Ram almost 11 years
    Thanks... I need to do an extra step after my command (java -jar test.jar >> test.log ) is it possible to ignore (^M) character while redirecting output itself ... ??
  • Ram
    Ram almost 11 years
    I used java -jar test.jar | sed 's/\r//g' >> test.log --- working great
  • Ram
    Ram almost 11 years
    Thanks ... I used java -jar test.jar | sed 's/\r//g' >> test.log --- working great
  • slm
    slm almost 11 years
    @Ram - glad it solved your problem.
  • Ram
    Ram almost 11 years
    @slm The users view the log file with vi test.log and cat -v test.log. They are treating the ^M as an error, so I am trying to hide that character.
  • Ram
    Ram almost 11 years
    @slm The users use only the vi editor. They are not willing to change any editor settings. They want the code itself to handle it.
  • slm
    slm almost 11 years
    @Ram - understood, just letting you know, I've added how to the answer if you or they are curious.
  • psusi
    psusi almost 11 years
    This does NOT answer the question because the OP does not have a file with DOS line endings. He has terminal output that uses the ^M to go back and modify the previous output on the line.
  • slm
    slm almost 11 years
    @psusi - did you even read this entire Q&A thread? I've talked with the OP and he's marked this as the accepted answer b/c it DOES fix his issue. He's redirecting the terminal output to a file that others are then viewing in vi. Please don't post comments on things you haven't read!
  • slm
    slm almost 11 years
    This is your answer? Please re-read the question. The title even says "remove ... from log files".
  • psusi
    psusi almost 11 years
    He said that sed worked, not dos2unix. dos2unix looks for CR + NL and replaces them with just NL. He doesn't have CR + NL. Using sed to remove the CR also leaves you with both the before and after text concatenated on one line, so it won't look right compared to what you see on a terminal.
  • psusi
    psusi almost 11 years
    @slm, yes... they shouldn't be in the log file in the first place. Well written programs check that they are actually using a terminal before spitting out terminal control codes.
  • slm
    slm almost 11 years
    Let's not debate "well written" programs. Sometimes people don't have access to the software that they support and have to do things like this to get the job done. Perhaps instead of this answer, you could explain how one would do this from Java instead. Seems like a gap on this Q&A that you could direct your energies to rather than coming in, downvoting the accepted answer, and saying that it DOESN'T work when clearly it DOES.
  • slm
    slm almost 11 years
    Also, in larger environments where I've worked, I've seen exactly this type of issue come up. It would be too costly to go back and "fix" a program that is doing this; it's more cost-effective to do something like this. I'd bet money that this software was developed on a PC and is now deployed on a Unix box, and the original developers would be the ones that don't understand testing the type of stdout and then adjusting accordingly.
  • psusi
    psusi almost 11 years
    The OP does not have a CR+NL, he has a CR followed by more text. dos2unix won't remove such a CR because it isn't immediately followed by a NL.
  • Ram
    Ram almost 11 years
    @psusi Thank you. In java code how to detect whether terminal or not ???