java - after splitting a string, what is the first element in the array?
Solution 1
Consider the split expression ",1,2,3,4".split(",");
What would you expect? Right: an empty string to start with. In your case there is a 'nothing' in front of the first 'a' as well as one behind it.
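The comma case can be checked directly. This is only a small sketch (the class name LeadingEmptyDemo and the method parts() are made up for illustration): a delimiter at the very start of the input leaves an empty string before it.

```java
import java.util.Arrays;

class LeadingEmptyDemo {
    // Splits on a comma; note that the input starts with the delimiter.
    static String[] parts() {
        return ",1,2,3,4".split(",");
    }

    public static void main(String[] args) {
        String[] parts = parts();
        System.out.println(parts.length);           // 5
        System.out.println(Arrays.toString(parts)); // [, 1, 2, 3, 4]
    }
}
```

The first element is the empty string that sits before the leading comma.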
Update: comments indicate that this explanation may not be enough, but it really is this simple: the engine starts at the beginning of the string and checks whether what is in front of it matches the pattern. If it does, whatever is behind it (since the previous match) becomes a new item in the split result.
At the first character, it has "" (nothing) behind it, and it checks whether "" (the pattern) is in front of it. It is, so it creates a "" item.
It then moves on, now with 'a' behind it, and again finds "" in front of it. So the second result is the string "a".
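To see where the empty pattern actually matches, one can walk the matches with java.util.regex.Matcher instead of calling split(). This is a sketch for illustration only (EmptyPatternMatches and matchPositions are hypothetical names): it collects the start index of every match of "" in the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyPatternMatches {
    // Collects the start index of every match of the empty pattern.
    static List<Integer> matchPositions(String input) {
        Matcher m = Pattern.compile("").matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The empty pattern matches before every character and at the end.
        System.out.println(matchPositions("abc")); // [0, 1, 2, 3]
    }
}
```

The match at position 0 is the one that produced the empty first element described here. Be aware that this depends on the Java version: since Java 8, split() skips a zero-width match at the very beginning of the input, so the leading empty string from split("") only appears on Java 7 and earlier.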
An interesting observation is that if you use split("", -1) you will also get an empty-string result in the last position of the result array.
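The effect of the negative limit can be sketched as follows (NegativeLimitDemo and its method names are made up for this example). The absolute lengths depend on the Java version, because Java 8 and later drop the leading empty string, so the sketch only compares the two calls with each other.

```java
class NegativeLimitDemo {
    // Default split: trailing empty strings are removed.
    static String[] dropTrailing(String s) {
        return s.split("");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] keepTrailing(String s) {
        return s.split("", -1);
    }

    public static void main(String[] args) {
        String[] dropped = dropTrailing("abc");
        String[] kept = keepTrailing("abc");
        // The negative limit yields exactly one more element ...
        System.out.println(kept.length - dropped.length);   // 1
        // ... and that extra element is an empty string at the end.
        System.out.println(kept[kept.length - 1].isEmpty()); // true
    }
}
```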
Edit 2: If I rack my brains further and treat this as an academic exercise (I would not recommend this in real life...), I can think of only one good way to do a regex split() of a String into a String[] array with one character in each string (as opposed to a char[], for which other people have given great answers):
String[] chars = str.split("(?<=.)", str.length());
This uses a zero-width lookbehind to match the position right after each character, splits there, and then limits the array size to the number of characters (you can leave the str.length() out, but if you put -1 you will get an extra empty string at the end).
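A small sketch of the lookbehind split with the three limit variants (LookbehindSplitDemo and its method names are hypothetical). One assumption to keep in mind: by default `.` does not match line terminators, so input containing newlines would split differently.

```java
import java.util.Arrays;

class LookbehindSplitDemo {
    // Limit equal to the string length, as in the answer.
    static String[] withLengthLimit(String str) {
        return str.split("(?<=.)", str.length());
    }

    // No limit: trailing empty strings are removed by default.
    static String[] withoutLimit(String str) {
        return str.split("(?<=.)");
    }

    // Negative limit: the trailing empty string is kept.
    static String[] withNegativeLimit(String str) {
        return str.split("(?<=.)", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(withLengthLimit("abc")));   // [a, b, c]
        System.out.println(Arrays.toString(withoutLimit("abc")));      // [a, b, c]
        System.out.println(Arrays.toString(withNegativeLimit("abc"))); // [a, b, c, ]
    }
}
```

Because the lookbehind needs a character before the split position, it can never match at index 0, which is why no leading empty string appears.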
Borrowing nitro2k01's alternative (below in the comments) which references the string beginning and end, you can split reliably on:
String[] chars = str.split("(?!(^|$))");
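As a quick check of this pattern, here is a sketch (LookaroundSplitDemo and its method names are made up) that runs it with and without a negative limit. The negative lookahead rejects the positions at the start and the end of the input, so neither a leading nor a trailing empty string should appear for a plain single-line string.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Splits at every position that is neither the start nor the end.
    static String[] splitChars(String str) {
        return str.split("(?!(^|$))");
    }

    // Same pattern with a negative limit, which keeps trailing empty strings.
    static String[] splitCharsKeepTrailing(String str) {
        return str.split("(?!(^|$))", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitChars("abc")));             // [a, b, c]
        System.out.println(Arrays.toString(splitCharsKeepTrailing("abc"))); // [a, b, c]
    }
}
```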
Solution 2
You can just use the built-in method of the String class: myString.toCharArray()
With split(""), the empty string is being stored at index 0.
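A minimal sketch of the regex-free route (CharArrayDemo and its method names are hypothetical). It shows toCharArray() and, for the case where a String[] is really needed, the substring loop mentioned in the comments.

```java
import java.util.Arrays;

class CharArrayDemo {
    // One char per element, no regex involved.
    static char[] toChars(String s) {
        return s.toCharArray();
    }

    // One single-character String per element, built with substring.
    static String[] toSingleCharStrings(String s) {
        String[] out = new String[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = s.substring(i, i + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "abcddadfad";
        System.out.println(toChars(str).length);                          // 10
        System.out.println(Arrays.toString(toSingleCharStrings("abc")));  // [a, b, c]
    }
}
```

Both variants produce exactly str.length() elements, so there is no empty entry to account for.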
user2994814
Updated on September 23, 2022
Comments
-
user2994814 over 1 year
I was trying to split a string into an array of single letters. Here's what I did,
String str = "abcddadfad";
System.out.println(str.length()); // output: 10
String[] strArr = str.split("");
System.out.println(strArr.length); // output: 11
System.out.println(strArr[0]); // output is nothing
The new array did contain all the letters, but it has nothing at index 0, not even a white space, and that entry still increased the size of my array. Can anyone explain why this is happening?
-
Bakuriu
I find it quite counter-intuitive that you can use an empty separator, because you can put any number of empty separators wherever you want, making (almost) all array lengths equally valid. The fact that the implementation somehow chooses the "minimum" length doesn't change the fact that this operation doesn't make much sense. Raising a "NoEmptySeparator" exception would have been more appropriate in my opinion.
-
justhalf over 10 years
You can improve this answer by saying that: "If you just want to split a String into an array of characters, you can just do "myString.toCharArray()", and there will be no empty string in the beginning of the array, and it's also simpler"
-
nitro2k01 over 10 years
While the answer addresses what the OP wants to achieve, it doesn't answer the question that was asked.
-
mangr3n over 10 years
It doesn't explain how "" works as a regular expression, which is the issue here. I've done some regex stuff, but I have never tried any kind of matching with "", so I don't understand exactly how it works. Someone who has, or who knows the Java regex code internally, might be able to explain this better.
-
Floris over 10 years
This seems to be a pretty clear explanation. "Split the string when you encounter nothing, then go to the next character". Note - that second part is important. You don't get an infinite array of empty strings; only the first element returned is nothing, after that the split algorithm increments by at least one. But not the first time. Still a bit odd...
-
user2994814 over 10 years
If I want to stick with the split() function, is there any way that I can modify the code to circumvent the problem?
-
mangr3n over 10 years
No, because the only regex I can think of that works, "", also matches the empty string on the front end. You have to account for it, not "fix" it. The most efficient (performance) way is toCharArray().
-
Floris over 10 years
I wonder if a lookaround expression works in this context. I can't test that unfortunately.
-
rolfl over 10 years
Interesting, yes, it explicitly checks for not-start-of-line but also has the -1 split vulnerability ... 50/50 as to which option is better
-
nitro2k01 over 10 years
Well, if you want to get into the silly territory, you could use "(?!(^|$))". But yeeeah.
-
rolfl over 10 years
Updated my answer again, @nitro2k01's offering will reliably split it as the OP originally intended.
-
ratchet freak over 10 years
It's easier (and more efficient) to use strArr[i] = str.substring(i, i + 1); in a loop over the string.
-
nitro2k01 over 10 years
Well, the OP is better off using str.toCharArray() for performance reasons, but that's not what the question actually asked.