Split Java String by New Line

627,321

Solution 1

This should cover you:

String[] lines = string.split("\\r?\\n");

There are really only two newline conventions (UNIX and Windows) that you need to worry about.
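A minimal sketch of how this behaves on mixed input. The class and method names (SplitCrLf, splitLines) are made up for illustration and are not part of any library:

```java
import java.util.Arrays;

class SplitCrLf {
    // Splits on "\n", optionally preceded by "\r" (Unix and Windows endings).
    static String[] splitLines(String text) {
        return text.split("\\r?\\n");
    }

    public static void main(String[] args) {
        // One Unix line ending and one Windows line ending in the same string.
        String text = "first\nsecond\r\nthird";
        System.out.println(Arrays.toString(splitLines(text)));
        // prints [first, second, third]
    }
}
```

Note that a lone "\r" (the old Mac OS 9 convention) is not treated as a separator by this pattern.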

Solution 2

The String#split(String regex) method takes a regular expression. Since Java 8, the regex engine supports \R, which the documentation of the Pattern class describes as:

Linebreak matcher
\R         Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]

So we can use it to match the \r\n pair as well as each of the single-character line separators listed above.

As you can see, \r\n is placed at the start of the regex, which ensures that the regex tries to match this pair first; only if that match fails does it try to match a single-character line separator.


So if you want to split on line separators, use split("\\R").

If you don't want trailing empty strings "" removed from the resulting array, use split(regex, limit) with a negative limit parameter, like split("\\R", -1).

If you want to treat one or more consecutive empty lines as a single delimiter, use split("\\R+").
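The three variants can be compared side by side. This is only a sketch: the class and method names are hypothetical, and the sample string is chosen to include a Windows ending, a blank line, a Unicode line separator, and trailing blank lines:

```java
import java.util.Arrays;

class SplitOnLinebreak {
    // Default split: trailing empty strings are dropped.
    static String[] splitDefault(String text) {
        return text.split("\\R");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepingBlanks(String text) {
        return text.split("\\R", -1);
    }

    // One or more consecutive line breaks act as a single delimiter.
    static String[] splitCollapsing(String text) {
        return text.split("\\R+");
    }

    public static void main(String[] args) {
        String text = "a\r\n\nb\u2028c\n\n";
        System.out.println(Arrays.toString(splitDefault(text)));       // [a, , b, c]
        System.out.println(splitKeepingBlanks(text).length);           // 6
        System.out.println(Arrays.toString(splitCollapsing(text)));    // [a, b, c]
    }
}
```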

Solution 3

If you don’t want empty lines:

String[] lines = string.split("[\\r\\n]+");
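A small sketch of this variant, with a hypothetical helper name (nonEmptyLines). It also shows one caveat: if the text starts with a line break, split still returns an empty string as the first element, because only trailing empty strings are removed.

```java
import java.util.Arrays;

class SplitSkippingBlankLines {
    // Treats any run of '\r' and '\n' characters as one delimiter.
    static String[] nonEmptyLines(String text) {
        return text.split("[\\r\\n]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nonEmptyLines("one\r\n\r\ntwo\n\n\nthree\n")));
        // prints [one, two, three]

        // Leading line breaks produce one leading empty string.
        System.out.println(nonEmptyLines("\n\none").length);
        // prints 2
    }
}
```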

Solution 4

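String[] lines = string.split(System.lineSeparator());

This should be system independent, provided the text was produced with the line separator of the system running the code.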
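The following sketch illustrates when this works and when it may not. The class and method names are invented for the example, and the output for the two "foreign" strings depends on the platform running the code, so no expected output is given for them:

```java
import java.util.Arrays;

class SplitOnSystemSeparator {
    // Splits on whatever separator the running JVM reports.
    static String[] splitOnPlatformSeparator(String text) {
        return text.split(System.lineSeparator());
    }

    // Makes '\r' and '\n' visible when printing results.
    static String show(String[] parts) {
        return Arrays.toString(parts).replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Text built with the local separator always splits cleanly.
        String local = "a" + System.lineSeparator() + "b";
        System.out.println(show(splitOnPlatformSeparator(local))); // [a, b]

        // Text from another system: at least one of these will not split cleanly.
        System.out.println(show(splitOnPlatformSeparator("a\r\nb")));
        System.out.println(show(splitOnPlatformSeparator("a\nb")));
    }
}
```

On a Unix-like system the Windows-style string typically keeps a stray "\r" at the end of the first element; on Windows the Unix-style string is typically not split at all.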

Solution 5

A new method, lines(), was introduced to the String class in Java 11. It returns a Stream<String>:

Returns a stream of substrings extracted from this string partitioned by line terminators.

Line terminators recognized are line feed "\n" (U+000A), carriage return "\r" (U+000D) and a carriage return followed immediately by a line feed "\r\n" (U+000D U+000A).

Here are a few examples:

jshell> "lorem \n ipusm \n sit".lines().forEach(System.out::println)
lorem
 ipusm
 sit

jshell> "lorem \n ipusm \r  sit".lines().forEach(System.out::println)
lorem
 ipusm
  sit

jshell> "lorem \n ipusm \r\n  sit".lines().forEach(System.out::println)
lorem
 ipusm
  sit

Reference: String#lines()
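If a List or an array is needed rather than a Stream, the stream can be collected. A short sketch, assuming Java 11 or later; the class and method names are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;

class LinesToList {
    // Collects the lines of the text into a List (requires Java 11+ for String#lines()).
    static List<String> toLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Mixes "\n", "\r\n" and "\r"; the final "\n" does not produce an extra empty line.
        System.out.println(toLines("lorem\nipsum\r\ndolor\rsit\n"));
        // prints [lorem, ipsum, dolor, sit]

        // Empty lines in the middle are preserved.
        System.out.println(toLines("a\n\nb").size());
        // prints 3
    }
}
```

For an array, text.lines().toArray(String[]::new) works the same way.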

Author by

dr.manhattan

I am a Computer Science major at UCF. Currently, I'm struggling between deciding whether or not I want to pursue game programming or doing something else.

Updated on October 19, 2021

Comments

  • dr.manhattan
    dr.manhattan over 2 years

    I'm trying to split text in a JTextArea using a regex to split the String by \n. However, this does not work, and I also tried \r\n|\r|n and many other combinations of regexes. Code:

    public void insertUpdate(DocumentEvent e) {
        String split[], docStr = null;
        Document textAreaDoc = (Document)e.getDocument();
    
        try {
            docStr = textAreaDoc.getText(textAreaDoc.getStartPosition().getOffset(), textAreaDoc.getEndPosition().getOffset());
        } catch (BadLocationException e1) {
            // TODO Auto-generated catch block
            e1.printStackTrace();
        }
    
        split = docStr.split("\\n");
    }
    
  • Alan Moore
    Alan Moore over 15 years
    insertUpdate() is a DocumentListener method. Assuming the OP is using it right, trying to modify the document from within the listener method will generate an exception. But you're right: the code in that question doesn't actually do anything.
  • Alan Moore
    Alan Moore over 15 years
    A JTextArea document SHOULD use only '\n'; its Views completely ignore '\r'. But if you're going to look for more than one kind of separator, you might as well look for all three: "\r?\n|\r".
  • Alan Moore
    Alan Moore over 15 years
    Not really. When you write a regex in the form of a Java String literal, you can use "\n" to pass the regex compiler a linefeed symbol, or "\\n" to pass it the escape sequence for a linefeed. The same goes for all the other whitespace escapes except \v, which isn't supported in Java literals.
  • angryITguy
    angryITguy over 12 years
    double backslashes are unnecessary, see section "Backslashes, escapes, and quoting" docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/…
  • angryITguy
    angryITguy over 12 years
    @Yuval. Sorry that is incorrect, you don't need it at all "Backslashes, escapes, and quoting" docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/…
  • Gumbo
    Gumbo over 12 years
  • Maarten Bodewes
    Maarten Bodewes almost 12 years
    It's an interesting idea, but you should take care that the text actually uses the system's line separator. I've got many, many text files under Unix (e.g. XML) that use "Windows" separators and quite a few under Windows that use Unix separators.
  • Raekye
    Raekye about 11 years
    Mac 9 uses \r. OSX 10 uses \n
  • ruX
    ruX over 10 years
    Works even on android
  • Makoto
    Makoto about 10 years
    This pales in comparison to the other answers, which are more explanatory and less code-heavy. Could you explain what it is you're accomplishing with this code, and why it would make a suitable answer?
  • Admin
    Admin about 10 years
    ${fn:length(fn:split(data, '\\r?\\n'))} is not working in jstl
  • bvdb
    bvdb almost 10 years
    Files created on a Windows OS and transferred to a Unix OS will still contain \r\n separators. I think it's better to play it safe and take both separators into account.
  • FeinesFabi
    FeinesFabi over 9 years
    Isn't it: 'String[] lines = String.split("\\r?\\n");' ?
  • John
    John over 9 years
    This worked on Mac OSX when the above answer did not.
  • Martin
    Martin over 9 years
    This is a very problematic approach! The file may not originate from the system running the code. I strongly discourage these kinds of "system independent" designs that actually depend on a specific system: the runtime system.
  • Martin
    Martin over 9 years
    This has nothing to do with splitting a file into lines. Consider removing your answer.
  • Shervin Asgari
    Shervin Asgari over 9 years
    @Martin if you have control over the deployed system, this is fine. However, if you are deploying your code to the cloud and have no control, then it's not the best way to do it.
  • Martin
    Martin over 9 years
    @Shervin It is never the best way to do it. It is in fact very bad practice. Consider some other programmer calling System.setProperty("line.separator", "you have no point"); Your code is broken. It might even be called similarly by a dependency you have no knowledge about.
  • logixplayer
    logixplayer almost 9 years
    This also worked for me. Excellent solution. It worked for the following 2 cases: 1) i woke up at 3 o clock.\r\n\r\nI hope 2) this is real life\r\nso I
  • Greg
    Greg almost 9 years
    This did not work as the file originated on Unix, and was being split on Windows.
  • greyseal96
    greyseal96 about 8 years
    This answer is exactly correct. One little suggestion would be that it might be helpful to add why it gets rid of the empty lines for people that might not be as familiar with regex and how it behaves. For anybody that might be wondering, it's because the "+" is a greedy operator and will match at least one but will continue to match the '\r\n' characters until it no longer can match them. See here: regular-expressions.info/repeat.html#greedy
  • Alan Moore
    Alan Moore about 8 years
    Yes, you do. If they need double-escaping anywhere, they need it everywhere. Whitespace escapes like \r and \n can have one or two backslashes; they work either way.
  • Pshemo
    Pshemo almost 8 years
    @antak yes, split by default removes trailing empty strings if they were the result of the split. To turn this mechanism off you need to use the overloaded version split(regex, limit) with a negative limit, like text.split("\\r?\\n", -1). More info: Java String split removed empty values
  • nurchi
    nurchi almost 8 years
    The double backslash '\\' in code becomes a '\' character and is then passed to the RegEx engine, so "[\\r\\n]" in code becomes [\r\n] in memory and RegEx will process that. I don't know how exactly Java handles RegEx, but it is a good practice to pass a "pure" ASCII string pattern to the RegEx engine and let it process rather than passing binary characters. "[\r\n]" becomes (hex) 0D0A in memory and one RegEx engine might accept it while another will choke. So the bottom line is that even if Java's flavour of RegEx doesn't need them, keep double slashes for compatibility
  • ibai
    ibai over 7 years
    String[] lines = string.split(System.getProperty("line.separator")); This will work fine while you use strings generated in your same OS/app, but if for example you are running your java application under linux and you retrieve a text from a database that was stored as a windows text, then it could fail.
  • James McLaughlin
    James McLaughlin about 7 years
    The comment by @stivlo is misinformation, and it is unfortunate that it has so many upvotes. As @Raekye pointed out, OS X (now known as macOS) has used \n as its line separator since it was released in 2001. Mac OS 9 was released in 1999, and I have never seen a Mac OS 9 or below machine used in production. There is not a single modern operating system that uses \r as a line separator. NEVER write code that expects \r to be the line separator on Mac, unless a) you're into retro computing, b) have an OS 9 machine spun up, and c) can reliably determine that the machine is actually OS 9.
  • Rop
    Rop almost 7 years
    @Martin -- "some other programmer calling System.setProperty("line.separator", "you have no point"); " --- Just wondering, wouldn't such idiocy/sabotage break a lot of expected behaviours in the JDK libraries, too?
  • Lealo
    Lealo almost 7 years
    And what does it mean?
  • Martin
    Martin almost 7 years
    @Rop I can't think of any cases right away, but there might exist dependencies to system properties that actually break code. I would strongly encourage configuration without use of system properties whenever possible.
  • Maykel Llanes Garcia
    Maykel Llanes Garcia over 6 years
    This answer did not work for me. I just use "String pieces[] = text.split("\n") " or "String pieces[] = text.split(System.getProperty("line.separator")) "on java 8.
  • Danilo Piazzalunga
    Danilo Piazzalunga over 6 years
    I know this may be an overkill solution.
  • Ted Hopp
    Ted Hopp about 6 years
    Or String[] lines = new BufferedReader(...).lines().toArray(String[]::new); for an array instead of a list. The nice thing about this solution is that BufferedReader knows about all kinds of line terminators, so it can handle text in all sorts of formats. (Most of the regex-based solutions posted here fall short in this regard.)
  • leventov
    leventov almost 6 years
    This solution is obsolete since Java 11 and the introduction of the String.lines() method.
  • john ktejik
    john ktejik over 5 years
    What about Unicode? A next-line character ('\u0085'), a line-separator character ('\u2028'), or a paragraph-separator character ('\u2029').
  • tresf
    tresf over 5 years
    Why not [\\r?\\n]+?
  • Dawood ibn Kareem
    Dawood ibn Kareem over 4 years
    Yes, it's the best answer. Unfortunate that the question was asked six years too early for this answer.
  • Breina
    Breina over 4 years
    @tresf You can't use quantifiers in square brackets.
  • Ubeogesh
    Ubeogesh over 4 years
    how about this: \v+ (one or more vertical whitespace character)
  • SeverityOne
    SeverityOne over 4 years
    I ended up splitting on \\R+, to avoid any end-of-line characters that were not covered by \\R alone.
  • MichaelMoser
    MichaelMoser over 3 years
    this works for java8 and splits the string into a stream of line strings: Arrays.stream(str.split("\\n"))
  • Pshemo
    Pshemo over 3 years
    JAVA 9 PROBLEM with find matches. Java 9 incorrectly allows a regex like \R\R to match the sequence \r\n, which represents a single separator sequence. To solve this problem we can write a regex like (?>\u000D\u000A)|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029], which thanks to the atomic group (?>\u000D\u000A) prevents a regex that has already matched \r\n from backtracking and trying to match \r and \n separately.
  • Leponzo
    Leponzo over 2 years
    Are double backslashes necessary?
  • Diablo
    Diablo about 2 years
    not sure if it worked with previous versions of Java. But it'll not work anymore. the split() method expects a regex and the valid regex would be \\\\n
  • vault
    vault about 2 years
    Why is there no such method on Android...