Reading linefeed characters in Java


Solution 1

Java internally works with Unicode.

The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029

(Source: http://en.wikipedia.org/wiki/Newline)

LF is on that list, which is why Java interprets \n as a newline even when no CR precedes it.
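
As a rough illustration of the terminators listed above, the following sketch splits a string with the regex \R, which java.util.regex (Java 8 and later) documents as matching any Unicode linebreak sequence, with CR+LF treated as a single unit. The class and method names are made up for this example.

```java
import java.util.Arrays;

public class UnicodeLineBreaks {
    // Hypothetical helper: splits on any Unicode linebreak sequence
    // by using the predefined regex linebreak matcher (Java 8+).
    public static String[] splitOnAnyLineBreak(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        // LF, CR+LF, LS (U+2028) and NEL (U+0085) all act as separators here.
        String text = "a\nb\r\nc\u2028d\u0085e";
        String[] parts = splitOnAnyLineBreak(text);
        System.out.println(parts.length);           // 5
        System.out.println(Arrays.toString(parts)); // [a, b, c, d, e]
    }
}
```

Note that a plain split("\n") does not use this list at all; it simply matches the single LF character.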

Solution 2

The character \n is 0a (line feed); carriage return is 0d (\r). If you split text that uses Windows line separators by \n only, you'll split on the 0a and leave the 0d characters behind at the end of each line.

Notepad shows a lone 0a as a square, but it renders the pair 0d0a as a line break.

Here's an example using Scala (it's Java under the covers) on Windows:

scala> "123\n456".split(System.getProperty("line.separator")).length
res1: Int = 1

scala> "123\n456".split("\r\n").length  // same as the line above on Windows
res2: Int = 1

scala> "123\n456".split("\n").length
res3: Int = 2
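
The same behavior can be sketched in plain Java. This is a minimal example, not taken from the original answers; the class and method names are hypothetical. It shows that splitting on "\n" works for Unix text but leaves a trailing \r on Windows text, and that the regex "\r?\n" is one possible way to handle both.

```java
public class SplitDemo {
    // Splits on a bare line feed (0a) only.
    public static String[] splitOnLf(String text) {
        return text.split("\n");
    }

    // Splits on LF with an optional preceding CR, covering Unix and Windows text.
    public static String[] splitOnCrLfOrLf(String text) {
        return text.split("\r?\n");
    }

    public static void main(String[] args) {
        String unixText = "123\n456";
        String windowsText = "123\r\n456";

        // The Windows separator never occurs in the Unix text, so nothing is split.
        System.out.println(unixText.split("\r\n").length);   // 1

        // Splitting Windows text on LF alone leaves the CR attached to "123".
        System.out.println(splitOnLf(windowsText)[0].length()); // 4

        // The tolerant pattern yields clean lines for both inputs.
        System.out.println(splitOnCrLfOrLf(unixText).length);            // 2
        System.out.println(splitOnCrLfOrLf(windowsText)[0].length());    // 3
    }
}
```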
Author: Ravi.Kumar

Updated on July 25, 2022

Comments

  • Ravi.Kumar
    Ravi.Kumar almost 2 years

    When I open a file in Notepad, I see one continuous line without any carriage return/line feed.

    I wrote a Java program to read the file. When I split the data from the file using \n or System.getProperty("line.separator"), I get lots of lines.

    In a hex editor I found that the file uses '0A' for new lines (as on UNIX), which appears as a rectangle in Notepad.

    My question: if the file doesn't contain '0D' and '0A' (used on Windows for carriage return and line feed), how is my Java program splitting the data into lines? It should not split it.

    Does anyone have any idea?

  • JIV
    JIV almost 12 years
    Exactly. For Windows newlines you would split by "\r\n", not just "\n".
  • Synesso
    Synesso almost 12 years
    I'm not sure this is right. "123\n456".split(System.getProperty("line.separator")) is an array of size 1 on Windows.
  • Marc-Christian Schulze
    Marc-Christian Schulze almost 12 years
    String's split method accepts a regular expression, and \r\n doesn't match \n. In a JTextArea the line breaks should still render fine; the same should apply to System.out.print.
  • Ravi.Kumar
    Ravi.Kumar almost 12 years
    Thank you for the explanation.
  • Ravi.Kumar
    Ravi.Kumar almost 12 years
    This is what is documented for the readLine() method of Java's BufferedReader: "Read a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed."
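
The readLine() behavior quoted above can be checked with a small sketch. It reads from an in-memory StringReader rather than a file, and the class and method names are invented for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ReadLineDemo {
    // Hypothetical helper: collects every line that BufferedReader.readLine() returns.
    public static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // the terminator itself is not part of the line
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // One input mixing LF, CR+LF and a lone CR as terminators.
        List<String> lines = readLines("unix\nwindows\r\nold-mac\rlast");
        System.out.println(lines); // [unix, windows, old-mac, last]
    }
}
```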