How to convert DOS/Windows newline (CRLF) to Unix newline (LF) in a Bash script
Solution 1
You can use tr to convert from DOS to Unix; however, you can only do this safely if CR appears in your file only as the first byte of a CRLF byte pair. This is usually the case. You then use:
tr -d '\015' <DOS-file >UNIX-file
Note that the name DOS-file is different from the name UNIX-file; if you try to use the same name twice, you will end up with no data in the file.
You can't do it the other way round (with standard 'tr').
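The same-name problem is usually worked around by writing to a temporary file and then renaming it over the original. A minimal sketch, assuming mktemp is available; the function name crlf_to_lf_inplace is made up for illustration:

```shell
# Hypothetical helper: strip every CR from a file "in place".
# tr writes to a temporary file first, so the input is never truncated
# while it is still being read.
crlf_to_lf_inplace() {
    file=$1
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if tr -d '\015' < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Call it as crlf_to_lf_inplace somefile.txt. The renamed temporary file keeps the permissions mktemp gave it, not those of the original, so restore the mode afterwards if that matters.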
If you know how to enter carriage return into a script (control-V, control-M to enter control-M), then:
sed 's/^M$//' # DOS to Unix
sed 's/$/^M/' # Unix to DOS
where the '^M' is the control-M character. You can also use the Bash ANSI-C Quoting mechanism to specify the carriage return:
sed $'s/\r$//' # DOS to Unix
sed $'s/$/\r/' # Unix to DOS
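The tr and sed forms are not quite equivalent: tr removes every CR, while the sed form removes a CR only at the end of a line. The sketch below makes the difference visible. It uses printf to obtain the CR, which should behave like $'\r' but also works in shells without ANSI-C quoting; the function names are hypothetical.

```shell
# Portable way to get a literal CR into a variable (same byte as $'\r' in Bash).
cr=$(printf '\r')

# Deletes every CR, wherever it appears (same effect as tr -d '\015').
strip_cr_all() { tr -d '\r'; }

# Deletes a CR only when it is the last character on a line.
strip_cr_eol() { sed "s/${cr}\$//"; }

# One line with a stray CR in the middle plus a CRLF line ending;
# od -c shows which CRs survive each filter.
printf 'one\rtwo\r\n' | strip_cr_all | od -c
printf 'one\rtwo\r\n' | strip_cr_eol | od -c
```

If a file may contain CRs that are real data rather than part of a line ending, the sed form is the safer choice.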
However, if you're going to have to do this very often (more than once, roughly speaking), it is far more sensible to install the conversion programs (e.g. dos2unix and unix2dos, or perhaps dtou and utod) and use them.
If you need to process entire directories and subdirectories, you can use zip:
zip -r -ll zipfile.zip somedir/
unzip zipfile.zip
This will create a zip archive with line endings changed from CRLF to LF. unzip will then put the converted files back in place (it asks before overwriting each file; you can answer Yes-to-all). Credits to @vmsnomad for pointing this out.
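Where zip is not installed, a directory tree can also be walked with find. This is a different technique from the zip approach, sketched under the assumption that mktemp exists. The helper name convert_tree and the '*.txt' pattern are placeholders, and the sketch does no binary-file detection, so the pattern should match text files only.

```shell
# Hypothetical helper: convert CRLF to LF in every regular file under
# directory $1 whose name matches pattern $2.
convert_tree() {
    find "$1" -type f -name "$2" -exec sh -c '
        for f do
            # Write to a temp file next to the original, then rename over it.
            tmp=$(mktemp "$f.XXXXXX") || exit 1
            if tr -d "\015" < "$f" > "$tmp"; then
                mv -- "$tmp" "$f"
            else
                rm -f -- "$tmp"
            fi
        done
    ' sh {} +
}

# Example call (somedir and the pattern are placeholders):
# convert_tree somedir '*.txt'
```

Because file names are passed to the inner shell as arguments, names with spaces or other odd characters are handled. Choose a pattern that the temporary names (original name plus a random suffix) do not match.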
Solution 2
Use:
tr -d "\r" < file
Here are some examples using sed:
# In a Unix environment: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//' # Assumes that all lines end with CR/LF
sed 's/^M$//' # In Bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' # Works on ssed, gsed 3.02.80 or higher
# In a Unix environment: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/" # Command line under ksh
sed 's/$'"/`echo \\\r`/" # Command line under bash
sed "s/$/`echo \\\r`/" # Command line under zsh
sed 's/$/\r/' # gsed 3.02.80 or higher
Use sed -i for in-place conversion, e.g., sed -i 's/..../' file.
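The -i option is not part of POSIX sed, and GNU and BSD sed differ in how they take the backup suffix. Attaching the suffix directly to the option (-i.bak) is, as far as I know, accepted by both. A minimal sketch; strip_cr_inplace is a hypothetical name:

```shell
# Literal CR obtained with printf, so the sed script does not rely on
# sed understanding \r itself.
cr=$(printf '\r')

# Hypothetical helper: strip a trailing CR from every line of each file
# in place, keeping the original as FILE.bak.
strip_cr_inplace() {
    sed -i.bak "s/${cr}\$//" "$@"
}
```

After checking the result, the .bak copies can be deleted.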
Solution 3
You can use Vim programmatically with the option -c {command}:
DOS to Unix:
vim file.txt -c "set ff=unix" -c ":wq"
Unix to DOS:
vim file.txt -c "set ff=dos" -c ":wq"
"set ff=unix/dos" means change the fileformat (ff) of the file to the Unix/DOS end-of-line format.
":wq" means write the file to disk and quit the editor (so the command can be used in a loop).
Solution 4
Install dos2unix, then convert a file in-place with
dos2unix <filename>
To output converted text to a different file use
dos2unix -n <input-file> <output-file>
You can install it on Ubuntu or Debian with
sudo apt install dos2unix
or on macOS using Homebrew
brew install dos2unix
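In a script that has to run on machines with and without dos2unix, one option is to use it when present and fall back to sed otherwise. A sketch under that assumption; to_unix is a hypothetical name, and -q is dos2unix's quiet flag:

```shell
# Hypothetical helper: convert INPUT ($1) to Unix line endings in OUTPUT ($2).
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        # New-file mode: leaves the input untouched.
        dos2unix -q -n "$1" "$2"
    else
        # Fallback: strip a trailing CR from each line.
        cr=$(printf '\r')
        sed "s/${cr}\$//" "$1" > "$2"
    fi
}
```

Usage: to_unix dos.txt unix.txt. The two branches should agree for ordinary text files; they may differ on unusual input, since dos2unix applies its own checks (for example, for binary files).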
Solution 5
Using AWK you can do:
awk '{ sub("\r$", ""); print }' dos.txt > unix.txt
Using Perl you can do:
perl -pe 's/\r$//' < dos.txt > unix.txt
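For the opposite direction (Unix to DOS), the same sub() call can be reused to make the conversion idempotent, so that lines already ending in CRLF do not gain a second CR. A minimal sketch; to_dos is a hypothetical name:

```shell
# Hypothetical helper: emit every input line with exactly one CRLF ending.
# Any existing trailing CR is removed first, so running it twice is harmless.
to_dos() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$@"
}
```

Usage: to_dos unix.txt > dos.txt, or as a filter in a pipeline.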
Koran Molovik
Updated on July 16, 2022
Comments
-
Koran Molovik almost 2 years
How can I programmatically (i.e., not using vi) convert DOS/Windows newlines to Unix? The dos2unix and unix2dos commands are not available on certain systems. How can I emulate these with commands like sed, awk, and tr?
-
Brad Koch over 8 years
In general, just install dos2unix using your package manager; it really is much simpler and does exist on most platforms.
-
SmileIT about 6 years
Agreed! @BradKoch Simple as 'brew install dos2unix' on Mac OSX
-
bsd over 2 years
Not all users have root access, and thus cannot install packages. Maybe that's why the user asked the very specific question he asked.
-
Matt Todd over 13 years
I used a variant since my file only had \r: tr "\r" "\n" < infile > outfile
-
n611x007 over 10 years
@MattTodd could you post this as an answer? The -d is featured more frequently and will not help in the "only \r" situation.
-
Buttle Butkus over 10 years
Using tr -d '\015' <DOS-file >UNIX-file where DOS-file == UNIX-file just results in an empty file. The output file has to be a different file, unfortunately.
-
Jonathan Leffler over 10 years
@ButtleButkus: Well, yes; that's why I used two different names. If you zap the input file before the program reads it all, as you do when you use the same name twice, you end up with an empty file. That is uniform behaviour on Unix-like systems. It requires special code to handle overwriting an input file safely. Follow the instructions and you will be OK.
-
Buttle Butkus over 10 years
I seem to remember in-file search-replace functionality somewhere.
-
Jonathan Leffler over 10 years
There are places; you have to know where to find them. Within limits, the GNU sed option -i (for in-place) works; the limits are linked files and symlinks. The sort command has 'always' (since 1979, if not earlier) supported the -o option which can list one of the input files. However, that is in part because sort must read all its input before it can write any of its output. Other programs sporadically support overwriting one of their input files. You can find a general purpose program (script) to avoid problems in 'The UNIX Programming Environment' by Kernighan & Pike.
-
augurar over 10 years
Any way to do this in a streaming fashion, without modifying the original file?
-
Jonathan Leffler about 10 years
Note that the proposed \r to \n mapping has the effect of double-spacing the files; each single CRLF line ending in DOS becomes \n\n in Unix.
-
n611x007 over 9 years
@augurar you may check "similar packages" packages.debian.org/wheezy/flip
-
Warren Dew over 9 years
The third option worked for me, thanks. I did use the -i option: sed -i $'s/\r$//' filename - to edit in place. I am working on a machine that does not have access to the internet, so software installation is a problem.
-
hlin117 about 9 years
This answer really doesn't answer the original poster's question.
-
mklement0 about 9 years
A nice, portable awk solution.
-
mklement0 about 9 years
That's handy, but just to be clear: this translates Unix -> Windows/DOS, which is the opposite direction of what the OP asked for.
-
nawK about 9 years
It was done on purpose, left as an exercise for the author. eyerolls awk -v RS='\r\n' '1' dos.txt > unix.txt
-
mklement0 about 9 years
Great (and kudos to you for pedagogic finesse).
-
mklement0 about 9 years
"b/c awk requires one when given option." - awk always requires a program, whether options are specified or not.
-
mklement0 about 9 years
The pure bash solution is interesting, but much slower than an equivalent awk or sed solution. Also, you must use while IFS= read -r line to faithfully preserve the input lines, otherwise leading and trailing whitespace is trimmed (alternatively, use no variable name in the read command and work with $REPLY).
-
dannysauer almost 9 years
Why clear $IFS? If you read in to one variable (or none, and implicitly $REPLY), read just splits on line endings, and you can just use echo instead of printf (echo is more likely to be a builtin, and it's generally faster). So, using ctrl-v+ctrl-m to type the \r, one can simply do while read -r; do echo "${REPLY%^M}"; done < file > file.fixed and it's about the same speed as sed.
-
jfs almost 9 years
The usage is misleading. The real dos2unix converts all input files by default. Your usage implies the -n parameter. And the real dos2unix is a filter that reads from stdin and writes to stdout if the files are not given.
-
Melebius over 8 years
This will convert every single DOS-newline into two UNIX-newlines.
-
Gordon Davisson over 8 years
@LudovicZenohateLagouardette Was it a plain text file (i.e. csv or tab-delimited text), or something else? If it was in some database-ish format, manipulating it as if it was text is very likely to corrupt its internal structure.
-
Ludovic Zenohate Lagouardette over 8 years
A plain text csv, but I think the encoding was strange. I think it messed up because of that. However, don't worry. I am always collecting backups and this wasn't even the real dataset, just a 1 GB one. The real one is 26 GB.
-
askewchan about 8 years
OS X users should not use -c mac, which is for converting pre-OS X CR-only newlines. You want to use that mode only for files to and from Mac OS 9 or before.
-
kethinov about 8 years
Here's how to do it without changing the filename: tr -d '\015' < original_file > t && mv t original_file - basically works by creating a temp file, then overwriting the old one with it.
-
t0rst over 7 years
@JonathanLeffler fyi, for macOS users: sed does not (by default, not sure you can change this?) recognise the escaped versions \r, \015, \x0d for carriage return; sed does recognise CR when entered with ctrl-v ctrl-m as described above (👍😊), which is ok for the command line; for scripts try sed "s/$(printf '\r')$//" (hat tip @twm), or fall back to tr, which recognises \r and \015.
-
Dorian about 7 years
That's the one that worked for me (MacOS, git diff shows ^M, edited in vim)
-
eush77 about 7 years
@JonathanLeffler The general-purpose program is called sponge and can be found in moreutils: tr -d '\015' < original_file | sponge original_file. I use it daily.
-
Rolf over 6 years
Thank you! This works, although I'm writing the filename and no --. I chose this solution because it's easy to understand and adapt for me. FYI, this is what the switches do: -p assume a "while input" loop, -i edit input file in place, -e execute following command
-
tripleee over 6 years
Strictly speaking, PCRE is a reimplementation of Perl's regex engine, not the regex engine from Perl. They both have this capability, though there are also differences, in spite of the implication in the name.
-
A_P over 5 years
I had an experience of breaking half of my OS just by running texxto with a wrong flag. Be careful, especially if you want to do it on entire folders.
-
zzxyz over 5 years
Hey John Paul--this answer got flagged for deletion so came up in a review queue for me. In general, when you've got a question like this that's 8 years old, with 22 answers, you'll want to explain how your answer is useful in a way that other existing answers are not.
-
Aaron Franke over 5 years
How do I do this recursively?
-
Aaron Franke over 5 years
Can I do this recursively?
-
Jonathan Leffler over 5 years
@AaronFranke: it depends on what your scenario looks like. In my book, if you need to modify a whole lot of files the same way, you use a script to encapsulate the processing (even if you throw it away after you've finished), and then use a tool such as find to identify the files that need changing (or otherwise create a list of file names; one hopes they don't have spaces and other unruly punctuation in the names) and then apply the script to the files. Using find … -exec sh script.sh {} + is pretty effective. The alternatives are legion. The find technique works with absurd names.
-
Boris Verkhovskiy almost 5 years
I know the question asks for alternatives to dos2unix but it's the first google result.
-
JosephConrad almost 5 years
You can use ":x" instead of ":wq"
-
caram about 4 years
Best answer, according to me, as it can process entire directories and sub-directories. I'm glad I dug that far down.
-
Davidius over 3 years
If you accidentally apply sed $'s/$/\r/' twice it will have the CR twice. For scripting solutions I recommend the following: sed 's/^$/\r/;s/\([^\r]\)$/\1\r/g' For simplicity I would state this as the third way, to make the original idea clear.
-
Peter Mortensen about 3 years
The link seems to be broken (times out - "504 Gateway Time-out").
-
Peter Mortensen about 3 years
The hintsforums.macworld.com link is (effectively) broken - it redirects to the main page, "hints.macworld.com"
-
ndim almost 3 years
On a LF type system like GNU/Linux, sed "" will not do the trick, though.
-
user9645 over 2 years
Your command put an extra blank line in between every line when converting a DOS file. Doing this awk 'BEGIN{RS="\r\n";ORS=""}{print}' dosfile > unixfile fixed that issue, but it still does not fix the missing EOL on the last line.
-
user9645 over 2 years
Also, this won't work on some platforms since there is no python -- they apparently can't be bothered with backward compatibility, so it is python2 or python3 or ...
-
Neil C. Obremski over 2 years
I could not get this to work when adding --in-place mydosfile.txt to the end (or piping to a file). The end result was the file still had CRLF. I was testing on a Graviton (AArch64) EC2 instance.
-
John Paul over 2 years
@NeilC.Obremski I updated with full command line, please try that. It will also make a backup before change.
-
zhenguoli over 2 years
sed 's/\r\n/\n/g' does not match anything. Refer to can-sed-replace-new-line-characters
-
John Paul over 2 years
It worked for me.
-
mbomb007 almost 2 years
The zip file method works really well!