How to remove all CRLF in file (not replace with LF)


Solution 1

sed ":a;/\r$/{N;s/\r\n//;b a}"

This matches every line that ends in '\r' (that is, where the '\r' is followed by '\n'). sed strips the trailing '\n' from each line before the script sees it, so the newline has to be brought into the pattern space explicitly. For each matching line, N appends the next line of input (with a '\n' separator in between), the substitution replaces the resulting "\r\n" with an empty string, and the script branches back to the label to check whether the new contents of the pattern space happen to match again.
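As a quick check of the loop described above, here is a minimal sketch that runs the same command on a small sample. The function name join_crlf and the sample text are made up for illustration, and the sketch assumes GNU sed (which understands \r in a regular expression).

```shell
# Hypothetical helper: delete every CRLF pair on stdin, leave bare LF alone.
# Assumes GNU sed; \r in the regex is a GNU extension.
join_crlf() {
    sed ':a;/\r$/{N;s/\r\n//;b a}'
}

# "one"/"two" and "three"/"four" are separated by CRLF,
# "two"/"three" by a bare LF that must survive.
printf 'one\r\ntwo\nthree\r\nfour\n' | join_crlf
# prints:
# onetwo
# threefour
```

The bare LF between "two" and "three" is kept, while both CRLF pairs are gone.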

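Following up on the comment: if you want to strip any remaining bare '\r' from the file as well, add a second substitution after the CRLF loop. The loop here uses $!N and the conditional branch t, so that it does not spin forever on a last line that ends in '\r':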

sed ':a;/\r$/{$!N;s/\r\n//;t a};s/\r//g'
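The following sketch exercises that combined command on input that mixes CRLF, a bare CR, and a bare LF. The helper name strip_crlf_and_cr and the sample text are assumptions for illustration, and GNU sed is assumed again. The script is single-quoted so that the shell does not try to expand $! itself.

```shell
# Hypothetical helper: delete CRLF pairs, then delete any leftover bare CR.
# Assumes GNU sed. Single quotes keep the shell from expanding "$!".
strip_crlf_and_cr() {
    sed ':a;/\r$/{$!N;s/\r\n//;t a};s/\r//g'
}

# Contains a CRLF, a bare CR, a bare LF, and a CRLF on the last line.
printf 'one\r\ntwo\rthree\nfour\r\n' | strip_crlf_and_cr
# prints:
# onetwothree
# four
```

In this run the last line still ends in a newline: sed prints each pattern space followed by '\n', so a CRLF at the very end of the input is reduced to LF rather than removed entirely.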

Solution 2

I tend to reach for perl one-liners when doing anything that involves manipulating line endings:

perl -pe 'BEGIN {undef $/} s/\r\n//g' *.txt

The key to making this work is undef $/, which makes Perl read each file as a single string that you can then run a search-and-replace on. To strip bare \r as well, just tweak the regex:

perl -pe 'BEGIN {undef $/} s/\r\n?//g' *.txt

Author: user779159

Updated on September 18, 2022

Comments

  • user779159
    user779159 over 1 year

    I'd like to remove all carriage returns followed by line feeds (CRLF), such as \r\n in a file. How can I do that? I can't use dos2unix because that replaces CRLF with LF. And I can't use tr because that will also replace any \n that aren't preceded by \r. How can I do this?

    • user779159
      user779159 over 9 years
      I tried sed -i 's/\r\n//g' file which didn't work
  • user779159
    user779159 over 9 years
    Cool! Btw is there a way to modify that command to strip all occurrences of \r in addition to \r\n from the file? (Rather than having to run a second command to get rid of \r using something like tr.)
  • user779159
    user779159 over 9 years
    Thanks peterph, it works great. (And mikeserv for an edit.) Is putting 2 sed commands in the same command, separated by a semicolon, more efficient than running them as 2 separate commands? Meaning it only has to scan the file once and just runs both commands on each line?
  • Avinash Raj
    Avinash Raj over 9 years
    perl -pe 'BEGIN {undef $/} s/\r(?=\n)//g' *.txt
  • peterph
    peterph over 9 years
    Yes. Plus you save a couple of milliseconds on creating a new process.
  • peterph
    peterph over 9 years
    @mikeserv thanks for the edit. However the branch command needs modifying as well - the unconditional one I had there was causing an endless loop on the last line.
  • mikeserv
    mikeserv over 9 years
    Maybe like sed -e :n -e '/^M$/{$s///p;N' -e '};s/.\n//;tn'. The problem, though, is that sed isn't designed for unlimited line length. If you're relying on GNU extensions, -z is probably easiest: sed -z 's/\r\n//g'. But then you're working with long pattern spaces. At least that way, though, you can clear it without printing a newline - so once per edit. And sed probably doesn't save much here - I'm willing to bet an additional tr would actually save processing time.
  • zwol
    zwol over 9 years
    @AvinashRaj That is the same as dos2unix, which is specifically not what the OP wanted.