How are \n and \r handled differently on Linux and Windows?

Solution 1

I think \n moves the needle down, and \r moves the needle to the beginning of a line (left align)? I'm not sure, though

This is true, more or less, but mostly a historical curiosity. Originally, linefeed (LF) was used to advance the paper by one line on printers and hardcopy terminals (teleprinters); carriage return (CR) returned the print head to the start of the line.

This probably still works on modern printers when used in "text mode", but is otherwise of little relevance today.

Anyway, I was told that Windows and Linux handle newlines and carriage returns differently.

The difference is simply: OS designers had to choose how to represent the start of a new line in text in computer files. For various historical reasons, in the Unix/Linux world a single LF character was chosen as the newline marker; MS-DOS chose CR+LF, and Windows inherited this. Thus different platforms use different conventions.

In practice, this is becoming less and less of a problem. The newline marker is really only relevant for programs that process "plain text", and there are not that many: it mostly affects program source code, configuration files, and some simple text files with documentation. Nowadays most programs handling these kinds of files (editors, compilers etc.) can handle both newline conventions, so it usually does not matter which one you choose.

There are some cases where tools insist on "their" newline convention (e.g. Unix shell scripts must not use CR+LF), in which case you must use the right one.
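
As a rough illustration of the shell-script case, here is a minimal Python sketch that converts CR+LF line endings to LF. The function name crlf_to_lf is made up for this example. It works on bytes so that no newline translation happens behind the scenes.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CR+LF pair with a single LF.

    A lone CR or a lone LF is left untouched.
    """
    return data.replace(b"\r\n", b"\n")


# A shell script that was saved with Windows line endings.
script = b"#!/bin/sh\r\necho hello\r\n"
print(crlf_to_lf(script))
```

Without such a conversion, a Unix system typically treats the trailing CR as part of the interpreter name on the first line, which is why the script fails to start.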

Solution 2

CR and LF

The American Standard Code for Information Interchange (ASCII) defined control-characters including CARRIAGE-RETURN (CR) and LINE-FEED (LF) that were (and still are) used to control the print-position on printers in a way analogous to the mechanical typewriters that preceded early computer printers.

Platform dependency

In Windows, the traditional line-separator in text files is CR followed by LF.

In old (pre OSX) Apple Macintosh systems, the traditional line-separator in text files was CR.

In Unix and Linux, the traditional line-separator in text files is LF.
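
The three conventions above can be made concrete with a short Python sketch (Python is used here only for illustration; the dictionary name LINE_ENDINGS is invented for the example). It prints the byte values behind each convention and then splits a string that mixes all three.

```python
# The byte sequences behind each convention (CR is 13, LF is 10).
LINE_ENDINGS = {
    "Unix/Linux": b"\n",
    "Windows": b"\r\n",
    "classic Mac OS": b"\r",
}

for name, ending in LINE_ENDINGS.items():
    print(name, list(ending))

# str.splitlines() recognises all three conventions, so mixed input is
# split into the same logical lines whichever ending was used.
text = "unix\nwindows\r\nold mac\rlast"
lines = text.splitlines()
print(lines)
```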

\n and \r

In many programming and scripting languages \n means "new line". Sometimes (but not always) this means the ASCII LINE-FEED character (LF), which, as you say, moves the cursor (or print position) down one line. In a printer or typewriter, this would actually move the paper up one line.

Invariably \r means the ASCII CARRIAGE-RETURN character (CR), whose name comes from mechanical typewriters, where a carriage-return key caused the roller ("carriage") that carried the paper to move to the right, powered by a spring, as far as it would go, thus setting the current typing position to the left margin.

Programming

In some programming languages \n can mean a platform-dependent sequence of characters that end or separate lines in a text file. For example in Perl, print "\n" produces a different sequence of characters on Linux than on Windows.
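
Python, which is not mentioned above, does something similar when writing in text mode. The sketch below is only an illustration: the helper written_bytes is a made-up name, and it writes to an in-memory buffer instead of a real file so that the result does not depend on the platform it runs on. The newline argument stands in for the platform convention.

```python
import io


def written_bytes(text: str, newline: str) -> bytes:
    """Write text through a text-mode wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    # On output, each "\n" in the text is translated to the newline string given here.
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()


print(written_bytes("hello\n", "\n"))    # Unix-style output
print(written_bytes("hello\n", "\r\n"))  # Windows-style output
```

The program writes the same "\n" both times; only the text layer's setting decides which bytes end up in the file.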

In Java, best practise, if you want to use the native line endings for the runtime platform, is not to use \n or \r at all. You should use System.getProperty("line.separator"). You should use \n and \r where you want LF and CR regardless of platform (e.g. as used in HTTP, FTP and other Internet communications protocols).
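
The same distinction can be sketched in Python rather than Java: os.linesep is roughly Python's counterpart to the line.separator property, while explicit "\r\n" is what a protocol message needs on every platform. The function build_request and the host example.com are invented for this illustration, and the request is only built, never sent.

```python
import os


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request with explicit CR+LF line endings."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # The protocol requires CR+LF after every line, plus an empty line at the end,
    # whatever the native convention of the machine running this code.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("example.com")
print(request)

# The native line ending of the current platform, for comparison.
print(repr(os.linesep))
```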

Unix stty

In a Unix shell, the stty command can be used to make the terminal driver translate between these conventions. For example, stty onlcr causes all outgoing LFs to be translated to CR LF on their way to the terminal, and stty -onlcr turns that translation off again.

Linux and OSX follow Unix conventions.

Text files

Text files are still enormously important and widely used. For example, HTML and XML documents are text files. Most of the important Internet protocols, such as HTTP, follow text-file conventions and include specifications for line endings.

Printers

Most printers, other than the very cheapest, still respect CR and LF. In fact they are fundamental to the most widely used page description languages, PCL and PostScript.

Solution 3

In short, CR and LF were needed for printers, but operating systems now use them slightly differently. In most cases it is fine to emit both CR and LF by writing \r\n.

Solution 4

Linux does not ignore \r. Think of what it does. You can carriage return multiple times, you'll still end up at the same place, the beginning of the line.
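
To make that concrete, here is a small Python sketch that imitates how a terminal displays a single line containing CR characters. It is a simplified model written for this answer (the function render_line is hypothetical), not how any real terminal is implemented.

```python
def render_line(chars: str) -> str:
    """Simulate what a terminal shows for one line that contains CR characters."""
    cells = []  # characters currently visible on the line
    col = 0     # current cursor column
    for ch in chars:
        if ch == "\r":
            # Carriage return: move the cursor back to column 0, erase nothing.
            col = 0
            continue
        if col < len(cells):
            cells[col] = ch   # overwrite what was already printed there
        else:
            cells.append(ch)  # extend the line
        col += 1
    return "".join(cells)


print(render_line("hello\r\r\rJ"))     # several CRs behave like one
print(render_line("10%\r50%\r100%"))   # progress-style overwriting
```

This overwriting behaviour is what command-line progress indicators rely on when they redraw the same line.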

Author by

千里ちゃん

Updated on September 18, 2022

Comments

  • 千里ちゃん
    千里ちゃん over 1 year

    I think \n moves the needle down, and \r moves the needle to the beginning of a line (left align)? I'm not sure, though. So, if I'm wrong please correct me....

    Anyway, I was told that Windows and Linux handle newlines and carriage returns differently. I would like to know how they handle them differently and some places where it's important to remember. Thanks for answering.

    • Admin
      Admin over 12 years
      Don't call them \r and \n, since how \n is handled depends on where you're using it. Better to call them CR and LF.
    • Admin
      Admin over 12 years
      Ignacio, those acronyms have no meaning to me. What do you call this :/? OH... LINE FEED and CARRIAGE RETURN. Thanks, sleske.
    • Admin
      Admin over 12 years
      @IgnacioVazquez-Abrams Is \n not identical to LF? On any ASCII chart, isn't character 10=\n=LF ?
    • Admin
      Admin over 12 years
      @barlop: Not in C when outputting in Windows.
    • Admin
      Admin over 12 years
      @IgnacioVazquez-Abrams C doesn't rewrite the ASCII table though. I agree \n may not function as a line feed but that doesn't mean it's not the LF character. (more of a question since I know you know more than me)
    • Admin
      Admin over 12 years
      @IgnacioVazquez-Abrams \n probably stands for new line. i.e. LF. How they act is another matter. In Unix, \n or LF isn't just a line feed anyway, it does a carriage return function too, but it's still called a line feed character.
    • Admin
      Admin over 12 years
      @barlop: To be perfectly fair, \n doesn't have any real meaning outside of C (and other programming languages that interpret it). The character sequence "\n" wouldn't even mean anything at all if not for C.
  • 千里ちゃん
    千里ちゃん over 12 years
    Same line of questioning: do programming languages recognize \n\r and \n as being the same? For example, if I were parsing a text file that was edited on someone else's PC and contained both the Linux and Windows version of line breaks, would performing a preg_match for \n and \n\r give me different results?
  • sleske
    sleske over 12 years
    @千里ちゃん: This totally depends on the programming language, compiler etc. In particular, if you use regexes, it will depend on the regex engine you use: some distinguish different line endings, some do not (most can be configured either way, I believe).
  • sleske
    sleske over 12 years
    @千里ちゃん: If you have a question on how some system/programming language/regular expression engine handles different newline conventions, just ask this as a separate question.
  • barlop
    barlop over 12 years
    You should be writing \r\n, not the wrong way round as you are. As for programming languages, they can read individual characters, so you the programmer can see which line ending the input uses, and you can write whatever you wish to the output, just as you could say "Write ABC followed by \r\r\r\n" with whatever characters you want on the end. Some of those characters may be non-printable, with no graphical representation. Languages may have built-in functions like println, and what they use for their new line would be one convention or the other; it can't be both.
  • barlop
    barlop over 12 years
    @千里ちゃん And some programming languages may let you choose the line ending as a setting in one of their built-in functions, so even with a built-in function you could, in theory anyway. Plus, as mentioned, in practice you can write whatever line ending you want, though you may not be able to do so as conveniently as with a println function.
  • sleske
    sleske over 12 years
    Note on Java: It's not generally true that you should "not use \n or \r at all". It's just that in Java, "\n" is always LF, and "\r" is always CR. This may be just what you want: If you want a specific line ending style, use them; if you explicitly want the native line ending of the computer you are running on, then use line.separator. It really depends on what you want.
  • sleske
    sleske over 12 years
    And BTW, println() automatically uses line.separator, so if you want native line endings, you can use println() (and if you need a certain specific type of line ending, then don't use it, but use "\n" etc. explicitly).
  • user5249203
    user5249203 over 12 years
    @sleske: Good points. I'll update my answer accordingly.
  • jcrawfordor
    jcrawfordor over 12 years
    Barring regexes, most programming environments that I work with (and presumably most sane programming environments) will handle this problem automatically. Always use \n on its own and either LF or CRLF will be output depending on what is correct in the current environment (or, heck, LFCR if you're on some wacky Sun something). Using \r\n in programs is a worse idea, because under the compilers I'm familiar with that would result in CRLF in *nix (bad) and CRCRLF in Windows (bad). Java is the one exception I know of (and I only remembered that by reading another comment re this question).
  • Keith Thompson
    Keith Thompson over 12 years
    Are there any languages or compilers where \n is a control character other than ASCII LF (other than EBCDIC-based systems)? I'm referring to what \n means in a string or character literal, not to the effect of sending it to a file or output device.
  • user5249203
    user5249203 over 12 years
    @KeithThompson: According to Wikipedia, the C standard allows \n to be represented by any single char value.
  • Keith Thompson
    Keith Thompson over 12 years
    @RedGrittyBrick: With some constraints (it has to be unique, for example). But my question was about implementations, not the standard. For example, a pre-OSX MacOS C compiler might have '\n' equal to CR, but then '\r' would have to be something else (LF?). Is '\n' == 10 universally true for non-EBCDIC systems? (Certainly well-written code shouldn't assume it.)
  • sleske
    sleske over 12 years
    @KeithThompson: For Java: Yes, \n is always ASCII (and Unicode) code 10, because the JLS says so explicitly (JLS 3.10.6, "Escape Sequences for Character and String Literals" - I checked :-)). For other languages -- good question.
  • sleske
    sleske over 12 years
    @KeithThompson: Consider asking this as a separate question on SO. It's an interesting problem.
  • seangwright
    seangwright about 9 years
    I have been exploring an issue with Chrome dev tools, debugging & breakpoints ( code.google.com/p/v8/issues/detail?id=2825#c33 ) and I think it is related to the way newlines are handled by chrome dev tools and how linux (Git in this case) is normalizing them.
  • Aaron Franke
    Aaron Franke over 4 years
    Does Linux just ignore the \r or does it cause some kind of behavior change?