Split a string, at every nth position
Solution 1
For a big performance improvement, an alternative would be to use substring()
in a loop:
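public String[] splitStringEvery(String s, int interval) {
    // Number of chunks, rounded up so a shorter final chunk is counted.
    // Assumes s is non-empty and interval > 0.
    int arrayLength = (int) Math.ceil(s.length() / (double) interval);
    String[] result = new String[arrayLength];
    int j = 0;
    int lastIndex = result.length - 1;
    // Copy every full-length chunk except the last one.
    for (int i = 0; i < lastIndex; i++) {
        result[i] = s.substring(j, j + interval);
        j += interval;
    }
    // Add the last bit, which may be shorter than interval.
    result[lastIndex] = s.substring(j);
    return result;
}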
Example:
Input: String st = "1231241251341351452342352456"
Output: 123 124 125 134 135 145 234 235 245 6.
It's not as short as stevevls' solution, but it's way more efficient (see below) and I think it would be easier to adjust in the future, of course depending on your situation.
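The method above assumes a non-empty string and a positive interval. A minimal sketch of a guarded variant, assuming that an empty string should yield an empty array and that a null string or a non-positive interval should be rejected with an IllegalArgumentException. The class and method names (SafeSplitter, splitStringEverySafe) are made up for illustration, and whether you need these checks depends on your situation:

```java
import java.util.Arrays;

final class SafeSplitter {

    // Hypothetical guarded variant of splitStringEvery(). The handling of
    // null, empty, and non-positive inputs is an assumption for illustration.
    static String[] splitStringEverySafe(String s, int interval) {
        if (s == null) {
            throw new IllegalArgumentException("s must not be null");
        }
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        if (s.isEmpty()) {
            return new String[0];
        }
        // Integer ceiling division for a non-empty string.
        int arrayLength = (s.length() - 1) / interval + 1;
        String[] result = new String[arrayLength];
        for (int i = 0, j = 0; i < arrayLength; i++, j += interval) {
            // Math.min clamps the end index for the final, possibly shorter chunk.
            result[i] = s.substring(j, Math.min(j + interval, s.length()));
        }
        return result;
    }

    public static void main(String[] args) {
        String st = "1231241251341351452342352456";
        // prints [123, 124, 125, 134, 135, 145, 234, 235, 245, 6]
        System.out.println(Arrays.toString(splitStringEverySafe(st, 3)));
    }
}
```

Returning an empty array for an empty string is only one possible choice; throwing or returning a single empty chunk could be just as reasonable.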
Performance tests (Java 7u45)
2,000-character string, interval 3.
split("(?<=\\G.{" + count + "})") performance (in milliseconds):
7, 7, 5, 5, 4, 3, 3, 2, 2, 2
splitStringEvery() (substring()) performance (in milliseconds):
2, 0, 0, 0, 0, 1, 0, 1, 0, 0
2,000,000-character string, interval 3.
split() performance (in milliseconds):
207, 95, 376, 87, 97, 83, 83, 82, 81, 83
splitStringEvery() performance (in milliseconds):
44, 20, 13, 24, 13, 26, 12, 38, 12, 13
2,000,000-character string, interval 30.
split() performance (in milliseconds):
103, 61, 41, 55, 43, 44, 49, 47, 47, 45
splitStringEvery() performance (in milliseconds):
7, 7, 2, 5, 1, 3, 4, 4, 2, 1
Conclusion:
The splitStringEvery() method is a lot faster (even after the changes in Java 7u6), and its advantage grows as the interval gets larger.
Ready-to-use Test Code:
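What follows is a rough sketch of such a comparison, not the original test code. It assumes Java 8+ for the lambdas, a made-up class name (SplitBenchmark), and a synthetic digit string as input. It uses plain System.nanoTime() timing with no JIT warm-up, so treat any numbers it prints as indicative only:

```java
final class SplitBenchmark {

    // The substring() approach from this answer.
    static String[] splitStringEvery(String s, int interval) {
        int arrayLength = (int) Math.ceil(s.length() / (double) interval);
        String[] result = new String[arrayLength];
        int j = 0;
        int lastIndex = result.length - 1;
        for (int i = 0; i < lastIndex; i++) {
            result[i] = s.substring(j, j + interval);
            j += interval;
        }
        result[lastIndex] = s.substring(j);
        return result;
    }

    // The regex approach it is compared against.
    static String[] splitWithRegex(String s, int interval) {
        return s.split("(?<=\\G.{" + interval + "})");
    }

    // Synthetic input (an assumption): the digits 0-9 repeated up to length.
    static String buildInput(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('0' + i % 10));
        }
        return sb.toString();
    }

    // Wall-clock time of a single run, in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        String input = buildInput(2_000_000);
        int interval = 3;
        for (int run = 0; run < 10; run++) {
            long regexMs = timeMillis(() -> splitWithRegex(input, interval));
            long substringMs = timeMillis(() -> splitStringEvery(input, interval));
            System.out.println("split(): " + regexMs
                    + " ms, splitStringEvery(): " + substringMs + " ms");
        }
    }
}
```

For measurements you intend to rely on, a dedicated harness such as JMH would likely be a better fit than hand-rolled timing.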
Solution 2
You can use the brace quantifier ({n}) to specify the number of times a character must occur:
String[] thisCombo2 = thisCombo.split("(?<=\\G.{" + count + "})");
The brace is a handy tool because you can use it to specify either an exact count or ranges.
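As a minimal sketch of that idea, the pattern can be built from a user-supplied count. The class and method names (RegexChunks, splitEvery) and the check for a positive count are assumptions added for illustration:

```java
import java.util.Arrays;

final class RegexChunks {

    // Hypothetical helper: splits s after every count characters using the
    // lookbehind pattern from this answer.
    static String[] splitEvery(String s, int count) {
        if (count <= 0) {
            // Assumed guard: {0} or a negative count would not give a useful pattern.
            throw new IllegalArgumentException("count must be positive");
        }
        // {count} is an exact-count quantifier: "." must match exactly count times
        // between the end of the previous match (\G) and the split position.
        String regex = "(?<=\\G.{" + count + "})";
        return s.split(regex);
    }

    public static void main(String[] args) {
        // prints [123, 124, 125, 134, 135, 145, 234, 235, 245]
        System.out.println(Arrays.toString(splitEvery("123124125134135145234235245", 3)));
        // prints [ab, cd, e]
        System.out.println(Arrays.toString(splitEvery("abcde", 2)));
    }
}
```

Keep in mind that "." does not match line terminators by default, so input containing newlines may need the (?s) flag in front of the pattern.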
Solution 3
Using Google Guava, you can use Splitter.fixedLength(), which returns a splitter that divides strings into pieces of the given length:
Splitter.fixedLength(2).split("abcde");
// returns an iterable containing ["ab", "cd", "e"].
Emile Beukes
Updated on April 05, 2020
Comments
-
Emile Beukes about 4 years
I use this regex to split a string at every, say, 3rd position:
String[] thisCombo2 = thisCombo.split("(?<=\\G...)");
where the 3 dots after the \G indicate every nth position to split. In this case, the 3 dots indicate every 3 positions. An example:
Input: String st = "123124125134135145234235245"
Output: 123 124 125 134 135 145 234 235 245.
My question is, how do I let the user control the number of positions where the string must be split? In other words, how do I turn those 3 dots into n dots controlled by the user?
-
thedayturns over 11 years
Isn't this just premature optimization?
-
Aske B. over 11 years
@thedayturns Why are you posting that statement with a question mark? Don't be unsure of your accusations. It's one of those accusations that should be used against people who waste their time with unnecessary performance improvements. Anyway, this is quickly written, ready-to-use code; easier to understand, to me at least; and on the plus side, it runs e.g. 60 times faster in the last case (it grows exponentially with the interval). My whole performance research act may be unnecessary, but now it's there for generations to come.
-
thedayturns over 11 years
Good response. I thought about it, and I think you're right - the highest voted answer is probably even more confusing than this one. On the other hand, the google guava solution is better than both you're fine with including another library.
-
Aske B. over 11 years
@thedayturns If you mean "if you're fine with including another library" then I agree. It's a very elegant solution, but I don't think it's the majority that wants to include an external library just for one functionality.
-
thedayturns over 11 years
Yep. Caught my typo after the 5 minute deadline, whoops.
-
Dennis Meng over 10 years
With the recent changes to substring's performance, I wonder if this is still fastest. Has anyone tried comparing these using Java 7 instead of Java 6?
-
Aske B. over 10 years
@DennisMeng I just tested it out, using the test code I provided, and it has slightly different results. I'll update the results to the answer. Regardless, I would be surprised if the substring would ever become bad enough to match using regex.
-
Zout about 8 years
I think you should check that the input string is non-empty - otherwise you will access result[-1] and get an ArrayIndexOutOfBoundsException on the "add the last bit" line for an empty string.
-
Aske B. about 8 years
@Zout You are right. You could also get a NullPointerException if the string is null. Probably also some weird behavior if the interval is 0 or negative. I think this is far beyond what the OP asked though. Defensive programming can be good in some circumstances, but it's not necessary in most cases. Hopefully people will figure out what they need in their own case. Or seek the knowledge about how to handle this in respective questions.
-
Zout about 8 years
In my use case, the string was coming from user input, but the interval was predefined (so we can guarantee the string is non-null and that the interval is > 0) so I think it would make sense to check for isEmpty. I can imagine other situations where the interval would also be user defined, so I see your point.