charAt() or substring? Which is faster?
Solution 1
As usual: it doesn't matter. But if you insist on spending time on micro-optimization, or if you really want to optimize for your very special use case, try this:
import org.junit.Assert;
import org.junit.Test;
public class StringCharTest {
// Times:
// 1. Initialization of "s" outside the loop
// 2. Init of "s" inside the loop
// 3. newFunction() actually checks the string length,
// so the function will not be optimized away by the HotSpot compiler
@Test
// Fastest: 237ms / 562ms / 2434ms
public void testCacheStrings() throws Exception {
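// Cache all possible char strings. The array needs MAX_VALUE + 1 slots
// and an int counter so that Character.MAX_VALUE itself is covered.
String[] char2string = new String[Character.MAX_VALUE + 1];
for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
char2string[i] = Character.toString((char) i);
}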
for (int x = 0; x < 10000000; x++) {
char[] s = "abcdefg".toCharArray();
for (int i = 0; i < s.length; i++) {
newFunction(char2string[s[i]]);
}
}
}
@Test
// Fast: 1687ms / 1725ms / 3382ms
public void testCharToString() throws Exception {
for (int x = 0; x < 10000000; x++) {
String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
// Fast: Creates new String objects, but does not copy an array
newFunction(Character.toString(s.charAt(i)));
}
}
}
@Test
// Very fast: 1331 ms/ 1414ms / 3190ms
public void testSubstring() throws Exception {
for (int x = 0; x < 10000000; x++) {
String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
// Reuses the internal char array on JVMs before Java 7u6; later versions copy the characters
newFunction(s.substring(i, i + 1));
}
}
}
@Test
// Slowest: 2525ms / 2961ms / 4703ms
public void testNewString() throws Exception {
char[] value = new char[1];
for (int x = 0; x < 10000000; x++) {
char[] s = "abcdefg".toCharArray();
for (int i = 0; i < s.length; i++) {
value[0] = s[i];
// Slow! Copies the array
newFunction(new String(value));
}
}
}
private void newFunction(String string) {
// Do something with the one-character string
Assert.assertEquals(1, string.length());
}
}
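If the cached variant above is attractive but a table covering every char feels too large, a smaller cache is one option. The following is only a sketch under that assumption: the class name SingleCharStrings, the method of(), and the choice of the ASCII range are all made up for illustration and are not part of the test above.

```java
// Hypothetical helper: caches one-character Strings for ASCII only
// and falls back to a fresh String for every other char.
final class SingleCharStrings {

    // 128 entries instead of one per possible char value.
    private static final String[] ASCII = new String[128];

    static {
        for (int c = 0; c < ASCII.length; c++) {
            ASCII[c] = String.valueOf((char) c);
        }
    }

    private SingleCharStrings() {
    }

    // Returns a cached instance for ASCII chars, a new String otherwise.
    static String of(char c) {
        return c < ASCII.length ? ASCII[c] : String.valueOf(c);
    }
}
```

It would be called as newFunction(SingleCharStrings.of(s.charAt(i))). Whether it beats the full table or plain substring() depends on your data and JVM, so measure before adopting it.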
Solution 2
The answer is: it doesn't matter.
Profile your code. Is this your bottleneck?
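A profiler attached to the real application is the proper tool here. As a rough first look, a hand-rolled timing loop can at least show whether the two variants differ by much. The sketch below is an assumption-laden illustration, not a rigorous benchmark: the class SingleCharTiming, the sink field, and the stand-in newFunction are invented for the example, and a dedicated benchmark harness would give more trustworthy numbers.

```java
import java.util.function.Consumer;

// Hypothetical timing harness; all names here are illustrative only.
final class SingleCharTiming {

    // Accumulated into a field so the JIT is less likely to discard the calls.
    static long sink;

    // Stand-in for the asker's newFunction, whose real body is unknown.
    static void newFunction(String s) {
        sink += s.length();
    }

    static void viaSubstring(String s) {
        for (int i = 0; i < s.length(); i++) {
            newFunction(s.substring(i, i + 1));
        }
    }

    static void viaCharAt(String s) {
        for (int i = 0; i < s.length(); i++) {
            newFunction(Character.toString(s.charAt(i)));
        }
    }

    // Runs the variant once as warm-up, then measures the same number of passes.
    static long timeNanos(String input, int passes, Consumer<String> variant) {
        for (int i = 0; i < passes; i++) {
            variant.accept(input);
        }
        long start = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            variant.accept(input);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String input = "abcdefg";
        int passes = 1_000_000;
        long sub = timeNanos(input, passes, SingleCharTiming::viaSubstring);
        long chr = timeNanos(input, passes, SingleCharTiming::viaCharAt);
        System.out.println("substring: " + sub / 1_000_000 + " ms");
        System.out.println("charAt:    " + chr / 1_000_000 + " ms");
    }
}
```

If the two numbers are close, or the whole loop is a tiny share of your application's runtime, the answer above stands.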
Solution 3
Does newFunction really need to take a String? It would be better if you could make newFunction take a char and call it like this:
newFunction(s.charAt(i));
That way, you avoid creating a temporary String object.
To answer your question: It's hard to say which one is more efficient. In both examples, a String object has to be created which contains only one character. Which is more efficient depends on how exactly String.substring(...) and Character.toString(...) are implemented on your particular Java implementation. The only way to find out is to run your program through a profiler and see which version uses more CPU and/or more memory. Normally, you shouldn't worry about micro-optimizations like this; only spend time on this when you've discovered that this is the cause of a performance and/or memory problem.
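To make the char suggestion concrete, here is one possible shape, assuming the work newFunction does can be expressed per character. The class CharOverloadSketch, the helper handle(), and the letter-counting body are placeholders I made up; the real newFunction is not shown in the question.

```java
// Hypothetical sketch: the String and char versions share one per-character helper.
final class CharOverloadSketch {

    // Placeholder result so the example has something observable.
    static int letters;

    // Shared per-character work (assumed here: count letters).
    private static void handle(char c) {
        if (Character.isLetter(c)) {
            letters++;
        }
    }

    // Overload for single characters: no temporary String is created.
    static void newFunction(char c) {
        handle(c);
    }

    // Overload for longer input: walks the string and reuses the same helper.
    static void newFunction(String s) {
        for (int i = 0; i < s.length(); i++) {
            handle(s.charAt(i));
        }
    }
}
```

Both overloads then do the same thing, which keeps the logic in one place. If newFunction cannot be split up this way, the overload buys nothing and the String version is fine.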
Solution 4
Of the two snippets you've posted, I wouldn't want to say. I'd agree with Will that it almost certainly is irrelevant in the overall performance of your code - and if it's not, you can just make the change and determine for yourself which is fastest for your data with your JVM on your hardware.
That said, it's likely that the second snippet would be better if you converted the String into a char array first, and then performed your iterations over the array. Doing it this way would perform the String overhead once only (converting to the array) instead of every call. Additionally, you could then pass the array directly to the String constructor with some indices, which is more efficient than taking a char out of an array to pass it individually (which then gets turned into a one character array):
String s = "abcdefg";
char[] chars = s.toCharArray();
for(int i = 0; i < chars.length; i++) {
newFunction(String.valueOf(chars, i, 1));
}
But to reinforce my first point, when you look at what you're actually avoiding on each call of String.charAt(), it's two bounds checks, a (lazy) boolean OR, and an addition. This is not going to make any noticeable difference. Neither is the difference in the String constructors.
Essentially, both idioms are fine in terms of performance (neither is immediately obviously inefficient), so you should not spend any more time working on them unless a profiler shows that this takes up a large amount of your application's runtime. And even then you could almost certainly get more performance gains by restructuring your supporting code in this area (e.g. have newFunction take the whole string itself); java.lang.String is pretty well optimised by this point.
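One way the restructuring mentioned above might look is to let newFunction accept the whole string plus a range, so callers never build one-character strings at all. This is only a sketch under that assumption: RangeSketch, the range overload, and the upper-case counting body are hypothetical and stand in for whatever newFunction really does.

```java
// Hypothetical sketch: newFunction works on a range of the original string.
final class RangeSketch {

    // Placeholder result so the example has something observable.
    static int upperCase;

    // Processes the characters of s in [start, end) without creating substrings.
    static void newFunction(String s, int start, int end) {
        for (int i = start; i < end; i++) {
            if (Character.isUpperCase(s.charAt(i))) {
                upperCase++;
            }
        }
    }

    // Convenience overload for callers that want the whole string handled.
    static void newFunction(String s) {
        newFunction(s, 0, s.length());
    }
}
```

A single character is then newFunction(s, i, i + 1), and longer strings go through the same code path.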
estacado
Updated on July 09, 2022
Comments
-
estacado almost 2 years
I want to go through each character in a String and pass each character of the String as a String to another function.
String s = "abcdefg";
for(int i = 0; i < s.length(); i++){
newFunction(s.substring(i, i+1));
}
or
String s = "abcdefg";
for(int i = 0; i < s.length(); i++){
newFunction(Character.toString(s.charAt(i)));
}
The final result needs to be a String. So any idea which will be faster or more efficient?
-
estacado over 14 years
newFunction really needs to take a string. Apart from single characters, newFunction also handles longer strings. And it handles them the same way. I don't want to overload newFunction to take in a char because it does the same thing in both cases.
-
MatBailie over 14 years
I agree completely that micro-optimisation should be avoided in development until it is found to be necessary. I also think that, as a learning exercise, learning about memory allocations and other 'hidden behaviour' is very important. I'm personally tired of naive programmers knocking out short code in the belief that short = performant, and unwittingly using highly inefficient algorithms. People who don't learn this = lazy. People who are fixated by this = slow. There's a balance to be struck. In my opinion :)
-
MatBailie over 14 years
@estacado: If performance is your driver (as implied by your post), optimise in the right places. Overloading the new function to avoid String overheads -may- be the sensible option, depending on what the [char] based version would look like. Contorting your code around the function may be more time-consuming, less effective, and less maintainable.
-
MatBailie over 14 years
As this will be passed a string, you need to change your testing slightly in the first test. {char[] s = "abcdefg".toCharArray();} should be inside the loop, or even better (to prevent clever optimisation by the JVM), put the whole loop and the .toCharArray() inside a separate function. It's important to measure all the initial overheads as well as the loop's costs, especially as performance could realistically tip from one to the other based on string length. So testing various lengths of strings is also important.
-
wds over 14 years
substring in the current jvm actually uses the original character array as a backing store, while you're initiating a copy. So my gut feeling says substring will actually be faster, as a memcpy will likely be more expensive (depending on how large the string is, larger is better).
-
mhaller over 14 years
Moved "s" inside the loop and added an assert() to prevent JVM optimization of newFunction(). Of course it's slower now, but the relative measurements are still the same. My point is merely that there are possibilities for optimization if the problem is known exactly. The point is not to change which function to use for a certain operation, but to see the operation on a higher level to gain improvements, e.g. by caching.
-
Vadzim over 10 years
Note that since Java 7u6 substring became copying. See stackoverflow.com/questions/14161050/…
-
Peter Mortensen over 6 years
Profile in what way? For memory usage?