Multiple nested for loops vs single for loop
Solution 1
The compiler won't "optimize" the loops away when the testX variable is used somewhere later in the code. When I just add one line to the code to output testX, the results are as follows:
single for loop: 1.218 ms
nested for loop: 1.218 ms
This pretty much shows that the compiler converts the nested loop into a single loop, whenever possible. The loop index can be used to prevent that optimisation:
Modifying the code this way
for( int i = 0; i < 27000000; i++ ){
testX += i;
}
and
for( int x = 0; x < 300; x++ ){
testX += x;
for( int y = 0; y < 300; y++ ){
testX += y;
for( int z = 0; z < 300; z++ ){
testX += z;
}
}
}
will add a little overhead to the nested loop but the execution time shows
single for loop: 1.224 ms
nested for loop: 1.226 ms
The times given here are averaged over 30.000 loop runs.
Note: The testX += x;
only contributes 1 in 90000 and the testX += x;
only contributes 1 in 300. Thus the two sections above remain comparable.
Nested loops are not much slower than single loops but your observation that they are faster is not true.
And: The times you show are about 40 times the times I observed. I'd suggest to carefully inspect the compiler settings since I ran the test on a medium speed hardware. Maybe the results of glfwGetTime()
are questionable and this is the main reason for your question. Have you tried to use another timing scheme?
Edit: In order to prevent compiler optimisation the loop limit can be choosen to be non constant:
int lmt = rand() % 1 + 300; // random value 300 or 301
int big_lmt = lmt * lmt * lmt; // random value 27000000 or 27270901
for( int i = 0; i < big_lmt; i++ ){
testX += i;
}
for( int x = 0; x < lmt; x++ ){
testX += x;
for( int y = 0; y < lmt; y++ ){
testX += y;
for( int z = 0; z < lmt; z++ ){
testX += z;
}
}
}
This avoids compiler predictability.
Results (for a lmt = 300
case to be comparable):
single for loop: 1.213 ms
nested for loop: 1.216 ms
Result:
- Nested loops are not faster than single loops.
Solution 2
If you don't use your for
variables (x
,y
,z
) inside your for
loop, a smart compiler could (and should) convert your second form in a single for
loop without nesting. Unless you prevent such compiler optimization removing static predictability by having the user input the x
,y
,z
values at runtime from stdin
, or reading from some stream, etc..
Furthermore, if you don't do something with your testX
variable (say, printing it to stdout
), a smart compiler could (and should) optimize it away, i.e. remove the dead code altogether.
So what I'm saying is that the benchmark, as it is right now, is somehow ill-formed.
Jamie Syme
Updated on June 04, 2022Comments
-
Jamie Syme almost 2 years
I was doing a little speed testing in c++ (MSVS) and got a strange result. I was testing the speed of using one for loop vs multiple nested for loops. Here is the code:
double testX = 0; // Single loop executes in roughly 0.04 seconds for( int i = 0; i < 27000000; i++ ){ testX += 1; } // Nested loop executes in roughly 0.03 seconds for( int x = 0; x < 300; x++ ){ for( int y = 0; y < 300; y++ ){ for( int z = 0; z < 300; z++ ){ testX += 1; } } }
As you can see, the speed difference is fairly obvious. I have run this many times, and those are the average times I am seeing (these are timed using glfwGetTime()).
So my question is: why? Is my test method inadequate? Am I using too few loops? I have tried searching google, and the only similar question I could find related his problem to cache coherency, but since these are empty for loops, I didn't think it would really have an effect.
Any help is appriciated :)
Edit: Thanks to the comments, I realized that using empty for loops probably wasn't the best way of testing things... So I have updated my code to do some (very) simple operations to a double. I also am compiling in release mode. However, though the two methods are a lot more similar in times, the second method is still slightly faster.
And yes, this is all the test code (minus the timing/output functions, but those aren't exactly specific to the question).