How to convert DOS/Windows newline (CRLF) to Unix newline (LF) in a Bash script

542,205

Solution 1

You can use tr to convert from DOS to Unix; however, you can only do this safely if CR appears in your file only as the first byte of a CRLF byte pair. This is usually the case. You then use:

tr -d '\015' <DOS-file >UNIX-file

Note that the name DOS-file is different from the name UNIX-file; if you try to use the same name twice, you will end up with no data in the file.
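If you do want the result to end up under the original name, one common workaround is to write to a temporary file and rename it afterwards. The sketch below assumes a throwaway demo file (the paths are made up for illustration) and that mktemp is available:

```shell
# Hypothetical demo file; substitute your own path.
workdir=$(mktemp -d)
file="$workdir/demo.txt"
printf 'one\r\ntwo\r\n' > "$file"

# Write the converted data to a temporary file in the same directory,
# and replace the original only if tr succeeded.
tmp=$(mktemp "$workdir/demo.XXXXXX")
tr -d '\015' < "$file" > "$tmp" && mv "$tmp" "$file"

# Dump the bytes; no \r should remain.
od -c "$file"
```

Note that the renamed file takes on the permissions of the temporary file, so you may need to restore the original mode afterwards.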

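You can't do it the other way round (Unix to DOS) with standard 'tr', because tr only maps or deletes single characters and cannot insert a CR before each LF.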

If you know how to enter carriage return into a script (control-V, control-M to enter control-M), then:

sed 's/^M$//'     # DOS to Unix
sed 's/$/^M/'     # Unix to DOS

where the '^M' is the control-M character. You can also use the bash ANSI-C Quoting mechanism to specify the carriage return:

sed $'s/\r$//'     # DOS to Unix
sed $'s/$/\r/'     # Unix to DOS
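If the script has to run under a plain POSIX sh, or with a sed that does not understand \r, a variant that may help is to capture the carriage return with printf and interpolate it into the sed expression. This is only a sketch; the variable name cr is arbitrary:

```shell
# A literal carriage return; command substitution strips only trailing newlines.
cr=$(printf '\r')

# DOS to Unix: remove a CR at the end of each line.
printf 'one\r\ntwo\r\n' | sed "s/${cr}\$//" | od -c

# Unix to DOS: append a CR to the end of each line.
printf 'one\ntwo\n' | sed "s/\$/${cr}/" | od -c
```

The \$ inside the double quotes keeps the shell from treating $ as the start of a variable, so sed still sees the end-of-line anchor.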

However, if you're going to have to do this very often (more than once, roughly speaking), it is far more sensible to install the conversion programs (e.g. dos2unix and unix2dos, or perhaps dtou and utod) and use them.

If you need to process entire directories and subdirectories, you can use zip:

zip -r -ll zipfile.zip somedir/
unzip zipfile.zip

This will create a zip archive with line endings changed from CRLF to LF. unzip will then put the converted files back in place (and ask you file by file whether to overwrite - you can answer: Yes-to-all). Credits to @vmsnomad for pointing this out.
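If zip is not available, a directory tree can also be handled with find. The following is a rough sketch rather than a drop-in tool: the demo tree, the *.txt filter and the .tmp suffix are assumptions made for illustration, and it strips every CR, not just those before LF.

```shell
# Hypothetical directory tree for demonstration (names contain spaces on purpose).
root=$(mktemp -d)
mkdir -p "$root/sub dir"
printf 'a\r\nb\r\n' > "$root/top.txt"
printf 'c\r\n'      > "$root/sub dir/nested file.txt"

# Convert every *.txt file under $root. Each file is rewritten through a
# temporary copy so the input is never truncated while it is being read.
find "$root" -type f -name '*.txt' -exec sh -c '
    for f in "$@"; do
        tr -d "\015" < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
' sh {} +

# Count the files that still contain a carriage return.
cr=$(printf '\r')
find "$root" -type f -exec grep -l "$cr" {} + | wc -l
```

Passing the file names as arguments to sh -c (rather than pasting {} into the script text) is what keeps this working with spaces and other awkward characters in the names.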

Solution 2

Use:

tr -d "\r" < file

Take a look here for examples using sed:

# In a Unix environment: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//'               # Assumes that all lines end with CR/LF
sed 's/^M$//'              # In Bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//'            # Works on ssed, gsed 3.02.80 or higher

# In a Unix environment: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/"            # Command line under ksh
sed 's/$'"/`echo \\\r`/"             # Command line under bash
sed "s/$/`echo \\\r`/"               # Command line under zsh
sed 's/$/\r/'                        # gsed 3.02.80 or higher

Use sed -i for in-place conversion, e.g., sed -i 's/..../' file.
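As a hedged illustration of the in-place form: -i is not part of POSIX sed, but attaching a backup suffix directly to the option (-i.bak) is accepted by both GNU and BSD sed. The demo file below is made up, and the carriage return is produced with printf so the command does not depend on sed understanding \r:

```shell
cr=$(printf '\r')

# Hypothetical demo file with DOS line endings.
file=$(mktemp)
printf 'x\r\ny\r\n' > "$file"

# Edit in place, keeping the original as "$file.bak".
sed -i.bak "s/${cr}\$//" "$file"

od -c "$file"        # converted copy
od -c "$file.bak"    # untouched backup
```

Once you have checked the result, the .bak file can be deleted.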

Solution 3

You can use Vim programmatically with the option -c {command}:

DOS to Unix:

vim file.txt -c "set ff=unix" -c ":wq"

Unix to DOS:

vim file.txt -c "set ff=dos" -c ":wq"

"set ff=unix/dos" means change fileformat (ff) of the file to Unix/DOS end of line format.

":wq" means write the file to disk and quit the editor (so the command can be used in a loop).

Solution 4

Install dos2unix, then convert a file in-place with

dos2unix <filename>

To output converted text to a different file use

dos2unix -n <input-file> <output-file>

You can install it on Ubuntu or Debian with

sudo apt install dos2unix

or on macOS using Homebrew

brew install dos2unix
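Since dos2unix is not installed everywhere, a script can check for it and fall back to tr. This is a minimal sketch; the function name to_unix and the .tmp suffix are invented for the example, and the fallback removes every CR rather than only those that precede an LF:

```shell
# Convert the file named by $1 in place (hypothetical helper).
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix "$1"
    else
        # Fallback: strip all CR bytes via a temporary file.
        tr -d '\015' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    fi
}

# Demo on a throwaway file.
demo=$(mktemp)
printf 'one\r\ntwo\r\n' > "$demo"
to_unix "$demo"
od -c "$demo"
```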

Solution 5

Using AWK you can do:

awk '{ sub("\r$", ""); print }' dos.txt > unix.txt

Using Perl you can do:

perl -pe 's/\r$//' < dos.txt > unix.txt
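For the opposite direction, AWK can also be used so that running the conversion twice does not produce a doubled CR. The sketch below strips any existing trailing CR before adding exactly one; the function name to_dos is just a label for the example:

```shell
# Unix to DOS as a filter: drop an existing trailing CR, then emit CRLF.
to_dos() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Applying it once or twice gives the same bytes.
printf 'one\ntwo\n' | to_dos | od -c
printf 'one\ntwo\n' | to_dos | to_dos | od -c
```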
Koran Molovik
Author by

Koran Molovik

Updated on July 16, 2022

Comments

  • Koran Molovik
    Koran Molovik almost 2 years

    How can I programmatically (i.e., not using vi) convert DOS/Windows newlines to Unix?

    The dos2unix and unix2dos commands are not available on certain systems. How can I emulate these with commands like sed, awk, and tr?

    • Brad Koch
      Brad Koch over 8 years
      In general, just install dos2unix using your package manager, it really is much simpler and does exist on most platforms.
    • SmileIT
      SmileIT about 6 years
      Agreed! @BradKoch Simple as 'brew install dos2unix' on Mac OSX
    • bsd
      bsd over 2 years
      Not all users have root access, and thus cannot install packages. Maybe that's why the user asked the very specific question he asked.
  • Matt Todd
    Matt Todd over 13 years
    I used a variant since my file only had \r : tr "\r" "\n" < infile > outfile
  • n611x007
    n611x007 over 10 years
    @MattTodd could you post this as an answer? the -d is featured more frequently and will not help in the "only \r" situation.
  • Buttle Butkus
    Buttle Butkus over 10 years
    using tr -d '\015' <DOS-file >UNIX-file where DOS-file == UNIX-file just results in an empty file. The output file has to be a different file, unfortunately.
  • Jonathan Leffler
    Jonathan Leffler over 10 years
    @ButtleButkus: Well, yes; that's why I used two different names. If you zap the input file before the program reads it all, as you do when you use the same name twice, you end up with an empty file. That is uniform behaviour on Unix-like systems. It requires special code to handle overwriting an input file safely. Follow the instructions and you will be OK.
  • Buttle Butkus
    Buttle Butkus over 10 years
    I seem to remember in-file search-replace functionality somewhere.
  • Jonathan Leffler
    Jonathan Leffler over 10 years
    There are places; you have to know where to find them. Within limits, the GNU sed option -i (for in-place) works; the limits are linked files and symlinks. The sort command has 'always' (since 1979, if not earlier) supported the -o option which can list one of the input files. However, that is in part because sort must read all its input before it can write any of its output. Other programs sporadically support overwriting one of their input files. You can find a general purpose program (script) to avoid problems in 'The UNIX Programming Environment' by Kernighan & Pike.
  • augurar
    augurar over 10 years
    Any way to do this in a streaming fashion, without modifying the original file?
  • Jonathan Leffler
    Jonathan Leffler about 10 years
    Note that the proposed \r to \n mapping has the effect of double-spacing the files; each single CRLF line ending in DOS becomes \n\n in Unix.
  • n611x007
    n611x007 over 9 years
    @augurar you may check "similar packages" packages.debian.org/wheezy/flip
  • Warren Dew
    Warren Dew over 9 years
    The third option worked for me, thanks. I did use the -i option: sed -i $'s/\r$//' filename - to edit in place. I am working on a machine that does not have access to the internet, so software installation is a problem.
  • hlin117
    hlin117 about 9 years
    This answer really doesn't answer the original poster's question.
  • mklement0
    mklement0 about 9 years
    A nice, portable awk solution.
  • mklement0
    mklement0 about 9 years
    That's handy, but just to be clear: this translates Unix -> Windows/DOS, which is the opposite direction of what the OP asked for.
  • nawK
    nawK about 9 years
    It was done on purpose, left as an exercise for the author. eyerolls awk -v RS='\r\n' '1' dos.txt > unix.txt
  • mklement0
    mklement0 about 9 years
    Great (and kudos to you for pedagogic finesse).
  • mklement0
    mklement0 about 9 years
    "b/c awk requires one when given option." - awk always requires a program, whether options are specified or not.
  • mklement0
    mklement0 about 9 years
    The pure bash solution is interesting, but much slower than an equivalent awk or sed solution. Also, you must use while IFS= read -r line to faithfully preserve the input lines, otherwise leading and trailing whitespace is trimmed (alternatively, use no variable name in the read command and work with $REPLY).
  • dannysauer
    dannysauer almost 9 years
    Why clear $IFS? If you read in to one variable (or none, and implicitly $REPLY), read just splits on line endings, and you can just use echo instead of printf (echo is more likely to be a builtin, and it's generally faster). So, using ctrl-v+ctrl-m to type the \r, one can simply do while read -r; do echo "${REPLY%^M}"; done < file > file.fixed and it's about the same speed as sed.
  • jfs
    jfs almost 9 years
    The usage is misleading. The real dos2unix converts all input files by default. Your usage implies -n parameter. And the real dos2unix is a filter that reads from stdin, writes to stdout if the files are not given.
  • Melebius
    Melebius over 8 years
    This will convert every single DOS-newline into two UNIX-newlines.
  • Gordon Davisson
    Gordon Davisson over 8 years
    @LudovicZenohateLagouardette Was it a plain text file (i.e. csv or tab-delimited text), or something else? If it was in some database-ish format, manipulating it as if it was text is very likely to corrupt its internal structure.
  • Ludovic Zenohate Lagouardette
    Ludovic Zenohate Lagouardette over 8 years
    A plain text csv, but I think the encoding was strange. I think it messed up because of that. However don't worry. I am always collecting backups and this wasn't even the real dataset, just a 1gb one. The real one is 26gb.
  • askewchan
    askewchan about 8 years
    OS X users should not use -c mac, which is for converting pre-OS X CR-only newlines. You want to use that mode only for files to and from Mac OS 9 or before.
  • kethinov
    kethinov about 8 years
    Here's how to do it without changing the filename: tr -d '\015' < original_file > t && mv t original_file - basically works by creating temp file, then overwriting the old one with it.
  • t0rst
    t0rst over 7 years
    @JonathanLeffler fyi, for macOS users: sed does not (by default, not sure you can change this?) recognise the escaped versions \r, \015, \x0d for carriage return; sed does recognise CR when entered with ctrl-v ctrl-m as described above (👍😊), which is ok for the command line; for scripts try sed "s/$(printf '\r')$//" (hat tip @twm), or fallback to tr, which recognises \r and \015.
  • Dorian
    Dorian about 7 years
    That's the one that worked for me (MacOS, git diff shows ^M, edited in vim)
  • eush77
    eush77 about 7 years
    @JonathanLeffler The general-purpose program is called sponge and can be found in moreutils: tr -d '\015' < original_file | sponge original_file. I use it daily.
  • Rolf
    Rolf over 6 years
    Thank you! This works, although I'm writing the filename and no --. I chose this solution because it's easy to understand and adapt for me. FYI, this is what the switches do: -p assume a "while input" loop, -i edit input file in place, -e execute following command
  • tripleee
    tripleee over 6 years
    Strictly speaking, PCRE is a reimplementation of Perl's regex engine, not the regex engine from Perl. They both have this capability, though there are also differences, in spite of the implication in the name.
  • A_P
    A_P over 5 years
    I had an experience of breaking half of my OS just by running texxto with a wrong flag. Be careful especially if you want to do it on entire folders.
  • zzxyz
    zzxyz over 5 years
    Hey John Paul--this answer got flagged for deletion so came up in a review queue for me. In general, when you've got a question like this that's 8 years old, with 22 answers, you'll want to explain how your answer is useful in a way that other existing answers are not.
  • Aaron Franke
    Aaron Franke over 5 years
    How do I do this recursively?
  • Aaron Franke
    Aaron Franke over 5 years
    Can I do this recursively?
  • Jonathan Leffler
    Jonathan Leffler over 5 years
    @AaronFranke: it depends on what your scenario looks like. In my book, if you need to modify a whole lot of files the same way, you use a script to encapsulate the processing (even if you throw it away after you've finished), and then use a tool such as find to identify the files that need changing (or otherwise create a list of file names — one hopes they don't have spaces and other unruly punctuation in the names) and then apply the script to the files. Using find … -exec sh script.sh {} + is pretty effective. The alternatives are legion. The find technique works with absurd names.
  • Boris Verkhovskiy
    Boris Verkhovskiy almost 5 years
    I know the question asks for alternatives to dos2unix but it's the first google result.
  • JosephConrad
    JosephConrad almost 5 years
    you can use ":x" instead of ":wq"
  • caram
    caram about 4 years
    Best answer, in my opinion, as it can process entire directories and sub-directories. I'm glad I dug that far down.
  • Davidius
    Davidius over 3 years
    If you accidentally apply sed $'s/$/\r/' twice it will have the CR twice. For scripting solutions I recommend the following: sed 's/^$/\r/;s/\([^\r]\)$/\1\r/g' For simplicity I would state this as the third way to make the original idea point out.
  • Peter Mortensen
    Peter Mortensen about 3 years
    The link seems to be broken (times out - "504 Gateway Time-out").
  • Peter Mortensen
    Peter Mortensen about 3 years
    The hintsforums.macworld.com link is (effectively) broken - it redirects to the main page, "hints.macworld.com"
  • ndim
    ndim almost 3 years
    On a LF type system like GNU/Linux, sed "" will not do the trick, though.
  • user9645
    user9645 over 2 years
    Your command put an extra blank line in between every line when converting a DOS file. Doing this awk 'BEGIN{RS="\r\n";ORS=""}{print}' dosfile > unixfile fixed that issue, but it still does not fix the missing EOL on the last line.
  • user9645
    user9645 over 2 years
    Also, this won't work on some platforms since there is no python -- they apparently can't be bothered with backward compatibility, so it is python2 or python3 or ...
  • Neil C. Obremski
    Neil C. Obremski over 2 years
    I could not get this to work when adding --in-place mydosfile.txt to the end (or piping to a file). The end result was the file still had CRLF. I was testing on a Graviton (AArch64) EC2 instance.
  • John Paul
    John Paul over 2 years
    @NeilC.Obremski I updated with full command line, please try that. It will also make a backup before change.
  • zhenguoli
    zhenguoli over 2 years
    sed 's/\r\n/\n/g' does not match anything. Refer to can-sed-replace-new-line-characters
  • John Paul
    John Paul over 2 years
    It worked for me.
  • mbomb007
    mbomb007 almost 2 years
    The zip file method works really well!
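
The pure-shell loop discussed in the comments above (reading with IFS= and -r so that lines are preserved faithfully) can be sketched as follows. It is shown only to make the idea concrete; the demo file is hypothetical, and as noted in the comments this approach tends to be slower than sed or awk on large files:

```shell
cr=$(printf '\r')

# Hypothetical demo file; the first line has leading spaces to show they survive.
file=$(mktemp)
printf '  indented\r\nplain\r\n' > "$file"

# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal.
# The "|| [ -n "$line" ]" part handles a last line without a final newline.
while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "${line%"$cr"}"    # strip one trailing CR, if present
done < "$file" > "$file.fixed"

od -c "$file.fixed"
```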