Linefeed character reading in Java
Solution 1
Java internally works with Unicode.
The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
(http://en.wikipedia.org/wiki/Newline)
That's why it interprets \n as a newline.
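As a rough illustration of the list above, the sketch below checks each terminator against the `\R` linebreak matcher from `java.util.regex` (available since Java 8). The class name `UnicodeLineTerminators` and the helper `isLineBreak` are made up for this example. Note that this only shows what the regex engine treats as a line break; a plain `split("\n")` still matches U+000A alone.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class UnicodeLineTerminators {
    // \R matches CR+LF as one unit, or any single one of
    // U+000A, U+000B, U+000C, U+000D, U+0085, U+2028, U+2029.
    private static final Pattern LINEBREAK = Pattern.compile("\\R");

    // Returns true if the whole string is exactly one linebreak sequence.
    static boolean isLineBreak(String s) {
        return LINEBREAK.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Code points are built at runtime to avoid Unicode escapes in source.
        int[] codePoints = {0x000A, 0x000B, 0x000C, 0x000D, 0x0085, 0x2028, 0x2029};
        for (int cp : codePoints) {
            String s = new String(Character.toChars(cp));
            System.out.printf("U+%04X matches \\R: %b%n", cp, isLineBreak(s));
        }
        System.out.printf("CR+LF matches \\R: %b%n", isLineBreak("\r\n"));
    }
}
```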
Solution 2
The character \n is 0a (line feed), not a carriage return (which is 0d). If you split text that uses Windows line separators on \n only, you split at each 0a and leave the 0d characters behind at the end of every line.
Notepad shows a lone 0a as a square, but it renders 0d0a as a newline.
Here's an example using Scala (it's Java under the covers) on Windows:
scala> "123\n456".split(System.getProperty("line.separator")).length
res1: Int = 1
scala> "123\n456".split("\r\n").length // same as the line above on Windows
res2: Int = 1
scala> "123\n456".split("\n").length
res3: Int = 2
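The same behaviour can be sketched in plain Java. This is only an illustrative example: the class name `LineSplitDemo` and its two helper methods are invented here, and `\R` (Java 8+) is one possible way to accept both Unix and Windows line endings, not the only one.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
final class LineSplitDemo {
    // Splits on LF only; a CR from a Windows file stays attached to the line.
    static String[] splitOnLf(String text) {
        return text.split("\n");
    }

    // Splits on any linebreak sequence (LF, CR, CR+LF, ...) using \R.
    static String[] splitOnAnyLineBreak(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String unixText = "123\n456";
        String windowsText = "123\r\n456";

        // "\r\n" never occurs in the Unix text, so nothing is split.
        System.out.println(unixText.split("\r\n").length);           // 1
        System.out.println(splitOnLf(unixText).length);              // 2

        // Splitting Windows text on LF leaves a trailing '\r' on "123".
        System.out.println(splitOnLf(windowsText)[0].endsWith("\r")); // true

        // \R handles both styles without leftovers.
        System.out.println(Arrays.toString(splitOnAnyLineBreak(unixText)));    // [123, 456]
        System.out.println(Arrays.toString(splitOnAnyLineBreak(windowsText))); // [123, 456]
    }
}
```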
Ravi.Kumar
Updated on July 25, 2022
Comments
-
Ravi.Kumar almost 2 years
When I open a file in Notepad, I see one continuous line without any carriage return/line feed.
I wrote a Java program to read the file. When I split the file's data using \n or System.getProperty("line.separator"), I get lots of lines. In a hex editor I found that the file uses '0A' for new lines (as used on UNIX), which appears as a rectangle in Notepad.
My question: if the file doesn't contain '0D' and '0A' (used on Windows for carriage return and line feed), how is my Java program splitting the data into lines? It should not split it.
Does anyone have any idea?
-
JIV almost 12 years
Exactly. For a Windows newline, you would split by "\r\n", not just "\n".
-
Synesso almost 12 years
I'm not sure this is right. "123\n456".split(System.getProperty("line.separator")) is an array of size 1 on Windows.
-
Marc-Christian Schulze almost 12 years
String's split method accepts a regular expression, and \r\n doesn't match \n. In a JTextArea, though, the line breaks should render fine; the same should apply to System.out.print.
-
Ravi.Kumar almost 12 years
Thank you for the explanation.
-
Ravi.Kumar almost 12 years
This is what is documented for the readLine() method of Java's BufferedReader: "Read a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed."
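
A minimal sketch of that documented behaviour, assuming the text is already in memory and wrapped in a `StringReader`. The class name `ReadLineDemo` and the helper `readLines` are hypothetical and only serve this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original thread.
final class ReadLineDemo {
    // Collects every line that BufferedReader.readLine() returns for the text.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // readLine() returns null at end of input and strips the terminator.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Mixes all three terminators: LF, CR, and CR+LF.
        System.out.println(readLines("a\nb\rc\r\nd")); // [a, b, c, d]
    }
}
```

This is why a file containing only 0A terminators is still read line by line, regardless of the platform's line.separator.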