Java String's split method ignores empty substrings
Solution 1
Use String.split(String regex, int limit) with a negative limit (e.g. -1):
"aa,bb,cc,dd,,,,".split(",", -1)
When String.split(String regex) is called, it is called with limit = 0, which removes all trailing empty strings from the array (in most cases, see below).
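As a minimal sketch of the difference between the limits described above, the class below compares the one-argument form, limit = 0, and limit = -1 on the string from the question. The class name TrailingEmptyDemo and the helper splitKeepingTrailing are hypothetical names used only for illustration.

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Hypothetical helper: split on commas and keep trailing empty strings.
    static String[] splitKeepingTrailing(String input) {
        return input.split(",", -1);
    }

    public static void main(String[] args) {
        String input = "aa,bb,cc,dd,,,,";

        // The one-argument form behaves like limit = 0: trailing empties are dropped.
        System.out.println(input.split(",").length);            // 4
        System.out.println(input.split(",", 0).length);         // 4

        // A negative limit keeps every trailing empty string.
        System.out.println(splitKeepingTrailing(input).length); // 8

        // Empty strings in the middle are kept either way.
        System.out.println(Arrays.toString("aa,bb,cc,dd,,,ee".split(","))); // [aa, bb, cc, dd, , , ee]
    }
}
```

Note that four trailing commas produce four trailing empty strings, so the array has length 8 rather than 7.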
The actual behavior of String.split(String regex) is quite confusing:
- Splitting an empty string always results in an array of length 1 containing the empty string.
- Splitting ";" or ";;;" with regex being ";" results in an empty array. Splitting a non-empty string removes all trailing empty strings from the array.
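The two cases above can be checked with a short sketch like the following. The class name SplitEdgeCases and the helper countParts are hypothetical and exist only to make the comparison easy to read.

```java
public class SplitEdgeCases {
    // Hypothetical helper: number of elements returned by the one-argument split.
    static int countParts(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        // Empty input: the result is a single element, the empty string itself.
        System.out.println(countParts("", ";"));        // 1

        // Input made only of delimiters: every piece is a trailing empty string,
        // so all of them are removed and the array is empty.
        System.out.println(countParts(";", ";"));       // 0
        System.out.println(countParts(";;;", ";"));     // 0

        // With a negative limit the empty pieces are kept.
        System.out.println(";;;".split(";", -1).length); // 4
    }
}
```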
The behavior above can be observed from at least Java 5 to Java 8.
There was an attempt in JDK-6559590 to change the behavior so that splitting an empty string returns an empty array. However, it was soon reverted in JDK-8028321 because it caused regressions in various places. The change never made it into the initial Java 8 release.
Solution 2
You can use public String[] split(String regex, int limit):
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
import java.util.Arrays;
String st = "aa,bb,cc,dd,,,,";
// A negative limit keeps the trailing empty strings.
System.out.println(Arrays.toString(st.split(",", -1)));
Prints:
[aa, bb, cc, dd, , , , ]
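The quoted documentation also covers a positive limit, which the example above does not exercise. A rough sketch of that case follows; the class name PositiveLimitDemo and the helper splitAtMost are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

public class PositiveLimitDemo {
    // Hypothetical helper: split on commas into at most maxParts pieces (maxParts > 0).
    static String[] splitAtMost(String input, int maxParts) {
        return input.split(",", maxParts);
    }

    public static void main(String[] args) {
        String st = "aa,bb,cc,dd,,,,";

        // limit = 3: the pattern is applied at most twice, and the last
        // entry holds everything after the second delimiter.
        System.out.println(Arrays.toString(splitAtMost(st, 3))); // [aa, bb, cc,dd,,,,]

        // A positive limit larger than the number of pieces does not
        // discard trailing empty strings; only limit = 0 does that.
        System.out.println(splitAtMost("a,b,,", 10).length);     // 4
    }
}
```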
Comments
-
Sachin Verma almost 2 years
It occurred to me today that the behavior of Java's String.split() is very strange. I want to split the string "aa,bb,cc,dd,,,ee" into an array with .split(","), which gives me a String array ["aa","bb","cc","dd","","","ee"] of length 7. But when I split the string "aa,bb,cc,dd,,,," I get an array of length 4 containing only ["aa","bb","cc","dd"], with all the trailing blank strings dropped. I want a way to split a string like "aa,bb,cc,dd,,,," into the array ["aa","bb","cc","dd","","",""]. Is this possible with the java.lang.String API? Thanks in advance.
-
AlexC over 6 years
This is not working for me: java -version openjdk version "1.8.0_131" OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)
-
Andy Hayden over 6 years
This is wild...!