Difference between declaring variables before or in loop?

java performance loops variables initialization

142,988

Solution 1

Which is better, a or b?

From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).

From a maintenance perspective, b is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.
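As a minimal sketch of the maintenance argument, here is style b in Java with the question's intermediateResult variable. The class name NarrowScopeDemo and the method squares are hypothetical, made up only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NarrowScopeDemo {
    // Style (b): the temporary is declared and initialized in the same place,
    // inside the loop body, so it cannot leak into the rest of the method.
    static List<Double> squares(int n) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            double intermediateResult = (double) i * i; // visible only inside this block
            result.add(intermediateResult);
        }
        // intermediateResult is out of scope here; referring to it would not compile.
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(4)); // prints [0.0, 1.0, 4.0, 9.0]
    }
}
```

With style a, the variable would still be visible (and assignable) after the loop, which is exactly the namespace pollution described above.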

Solution 2

Well, I ran your A and B examples 20 times each, looping 100 million times (JVM 1.5.0).

A: average execution time: 0.074 sec

B: average execution time: 0.067 sec

To my surprise, B was slightly faster. As fast as computers are now, it's hard to say whether you could accurately measure this. I would code it the A way as well, but I would say it doesn't really matter.
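The original test harness was not posted, so the following is only a rough sketch of how such a comparison might be set up in Java, not the code that produced the numbers above. The class and method names (LoopTimingSketch, variantA, variantB) are hypothetical, the loop bodies sum the values instead of printing them so that I/O does not dominate, and the iteration count is an arbitrary choice. A dedicated microbenchmark harness would give more trustworthy results.

```java
class LoopTimingSketch {
    // Variant A: the temporary is declared before the loop.
    static double variantA(int n) {
        double sum = 0;
        double intermediateResult;
        for (int i = 0; i < n; i++) {
            intermediateResult = i;
            sum += intermediateResult;
        }
        return sum;
    }

    // Variant B: the temporary is declared inside the loop.
    static double variantB(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            double intermediateResult = i;
            sum += intermediateResult;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // assumed iteration count, chosen only to keep the run short

        // Warm-up rounds so the JIT has a chance to compile both methods before timing.
        for (int w = 0; w < 5; w++) {
            variantA(n);
            variantB(n);
        }

        long t0 = System.nanoTime();
        double a = variantA(n);
        long t1 = System.nanoTime();
        double b = variantB(n);
        long t2 = System.nanoTime();

        // Use both results so the loops cannot be discarded as dead code.
        if (a != b) {
            throw new IllegalStateException("variants disagree");
        }
        System.out.printf("A: %.4f s, B: %.4f s%n", (t1 - t0) / 1e9, (t2 - t1) / 1e9);
    }
}
```

Swapping the order of the two timed calls and repeating the run several times is a cheap way to check whether any difference is more than noise.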

Solution 3

It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3), it can make a very significant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.
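Since the question asks about Java, here is a rough Java analogue of the C# example. Java lambdas can only capture effectively final locals, so a plain int declared before the loop and reassigned inside it cannot be captured at all. This sketch therefore uses a one-element array as a shared mutable cell, which is a workaround chosen for illustration rather than a direct translation. The names CaptureDemo and run are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class CaptureDemo {
    static List<String> run(int n) {
        List<Supplier<String>> actions = new ArrayList<>();

        // One shared cell declared before the loop; every lambda sees the same array.
        int[] outer = new int[1];
        for (int i = 0; i < n; i++) {
            outer[0] = i;
            int inner = i; // a fresh, effectively final variable per iteration
            actions.add(() -> "Inner=" + inner + ", Outer=" + outer[0]);
        }

        // The lambdas run only after the loop has finished.
        List<String> lines = new ArrayList<>();
        for (Supplier<String> action : actions) {
            lines.add(action.get());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints Inner=0, Outer=9 through Inner=9, Outer=9, like the C# output.
        run(10).forEach(System.out::println);
    }
}
```

In Java the compiler forces the distinction: the per-iteration variable can be captured directly, while sharing state across iterations has to be made explicit.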

Solution 4

The following is what I wrote and compiled in .NET.

double r0;
for (int i = 0; i < 1000; i++) {
    r0 = i*i;
    Console.WriteLine(r0);
}

for (int j = 0; j < 1000; j++) {
    double r1 = j*j;
    Console.WriteLine(r1);
}

This is what I get from .NET Reflector when CIL is rendered back into code.

for (int i = 0; i < 0x3e8; i++)
{
    double r0 = i * i;
    Console.WriteLine(r0);
}
for (int j = 0; j < 0x3e8; j++)
{
    double r1 = j * j;
    Console.WriteLine(r1);
}

So both look exactly the same after compilation. In managed languages, code is compiled to CIL/bytecode and converted into machine code at execution time. In machine code, the double may not even be created on the stack; it may just live in a register, since the code shows it is only a temporary value for the WriteLine call. There is a whole set of optimization rules just for loops, so the average developer shouldn't worry about it, especially in managed languages.

There are cases where you can optimize managed code yourself, for example when concatenating a large number of strings: using string a; a += anotherstring[i] versus using StringBuilder makes a very big difference in performance. There are many such cases where the compiler cannot optimize your code because it cannot figure out what is intended in a bigger scope. But it can pretty much optimize the basic things for you.
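The string concatenation case carries over to Java, where StringBuilder plays the same role. The sketch below shows both approaches side by side; the class and method names (ConcatDemo, joinWithConcat, joinWithBuilder) are hypothetical, and no timings are claimed here.

```java
class ConcatDemo {
    // Repeated += in a loop: each step typically builds a new String and
    // copies everything accumulated so far.
    static String joinWithConcat(String[] parts) {
        String a = "";
        for (int i = 0; i < parts.length; i++) {
            a += parts[i];
        }
        return a;
    }

    // StringBuilder appends into one growing buffer and creates the String once.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithConcat(parts));  // prints abc
        System.out.println(joinWithBuilder(parts)); // prints abc
    }
}
```

Both methods return the same string; the difference only shows up as the number of pieces grows.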

Solution 5

This is a gotcha in VB.NET. Visual Basic won't reinitialize the variable on each iteration in this example:

For i as Integer = 1 to 100
    Dim j as Integer
    Console.WriteLine(j)
    j = i
Next

' Output: 0 1 2 3 4...

This will print 0 the first time (Visual Basic variables have default values when declared!) but the previous iteration's value of i each time after that.

If you add a = 0, though, you get what you might expect:

For i as Integer = 1 to 100
    Dim j as Integer = 0
    Console.WriteLine(j)
    j = i
Next

'Output: 0 0 0 0 0...
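For contrast with the Java case the question asks about: Java locals have no default value and must be definitely assigned before they are read, so the first VB.NET snippet has no direct Java equivalent (reading j before assigning it would not compile). A declaration with an initializer inside the loop runs on every iteration, as in this sketch. The names ReinitDemo and run are hypothetical.

```java
class ReinitDemo {
    static String run(int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            int j = 0; // the initializer is executed on every iteration
            out.append(j).append(' ');
            j = i;     // this value does not survive into the next iteration
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints 0 0 0 0 0
    }
}
```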

Author by

Rabarberski

Updated on April 16, 2021

Comments

Rabarberski about 3 years
I have always wondered if, in general, declaring a throw-away variable before a loop, as opposed to repeatedly inside the loop, makes any (performance) difference? A (quite pointless) example in Java:

a) declaration before loop:
```
double intermediateResult;
for(int i=0; i < 1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}
```
b) declaration (repeatedly) inside loop:
```
for(int i=0; i < 1000; i++){
    double intermediateResult = i;
    System.out.println(intermediateResult);
}
```
Which one is better, a or b?

I suspect that repeated variable declaration (example b) creates more overhead in theory, but that compilers are smart enough so that it doesn't matter. Example b has the advantage of being more compact and limiting the scope of the variable to where it is used. Still, I tend to code according to example a.

Edit: I am especially interested in the Java case.
Jon Skeet over 15 years

Do you conceptually want the variable to live for the duration of the loop instead of separately per iteration? I rarely do. Write code which reveals your intention as clearly as possible, unless you've got a very, very good reason to do otherwise.
Rabarberski over 15 years

Ah, nice compromise, I never thought of this! (IMO, the code does become a bit less visually 'clear' though.)
Mark Davidson over 15 years

You beat me to it; I was just about to post my profiling results. I got more or less the same and yes, surprisingly, B is faster. I really would have thought A if I had needed to bet on it.
Mark over 15 years

OK cool, yeah, I only looked at execution time. As R. Bemrose pointed out, in A the variable sticks around after the loop has completed. Did your profiling results tell you anything about memory usage?
Admin over 15 years

Not much surprise - when variable is local to the loop, it does not need to be preserved after each iteration, so it can stay in a register.
Kostis over 15 years

I've been using VB.NET for years and hadn't come across this!!
Kenan Banks over 15 years

@Jon - I have no idea what the OP is actually doing with the intermediate value. Just thought it was an option worth considering.
Michael Haren over 15 years

Yes, it's unpleasant to figure this out in practice.
ferventcoder over 15 years

Here is a reference about this from Paul Vick: panopticoncentral.net/archive/2006/03/28/11552.aspx
MGOwen almost 14 years

+1 for actually testing it, not just an opinion/theory the OP could have made up himself.
Ajoy Bhatia over 13 years

A new reference is not allocated for each object, even if the reference is declared within the 'for'-loop. In BOTH cases: 1) 'o' is a local variable and stack space is allocated once for it at the start of the function. 2) There is a new Object created in each iteration. So there is no difference in performance. For code organization, readability and maintainability, declaring the reference within the loop is better.
Jesse C. Slicer over 13 years

While I can't speak for Java, in .NET the reference is not 'allocated' for each object in the first example. There is a single entry on the stack for that local (to the method) variable. For your examples, the IL created is identical.
Mark Hurd almost 13 years

@eschneider @ferventcoder Unfortunately @PaulV has decided to drop his old blog posts, so this is now a dead link.
Admin almost 13 years

yea, just recently ran across this; was looking for some official docs on this...
new123456 over 12 years

sticks around after your loop is finished - although this doesn't matter in a language like Python, where bound names stick around until the function ends.
Powerlord over 12 years

@new123456: The OP asked for Java specifics, even if the question was asked somewhat generically. Many C-derived languages have block-level scoping: C, C++, Perl (with the my keyword), C#, and Java to name 5 I've used.
new123456 over 12 years

I know - it was an observation, not a criticism.
smile.al.d.way over 12 years

Yep, it doesn't dim the variable again. Ran into this yesterday and a series of googling led to this.
Royi Namir almost 12 years

In example B (original question), does it actually create a new variable each time? What is happening from the stack's point of view?
Philip Guin over 11 years

What's the execution time after JIT kicks in?
javatarz almost 11 years

@GoodPerson to be honest, I'd like that to be done. I ran this test around 10 times on my machine for 50,000,000-100,000,000 iterations with almost an identical piece of code (that I would love to share with anyone who wants to run stats). The answers were split almost equally either way usually by a margin of 900ms (over 50M iterations) which isn't really much. Though my first thought is that it's going to be "noise", it might lean one by just a bit. This effort seems purely academic to me though (for most real life applications).. I'd love to see a result anyway ;) Anyone agree?
Anto Varghese about 10 years

If it deals with String instead of Double, is case "b" still better?
Mark almost 10 years

@Wolf You don't understand, I'm not saying one is better than the other. I'm just agreeing with Rabarberski that I also tend to code the example 'A' way.
Wolf almost 10 years

OK, then sorry about the downvote. Do you think it's possible to improve your answer (my vote is locked now)? BTW: really measuring instead of guessing is worth a +1. But from the developer's performance POV, I'd tend to use the B version if there is no evidence of a performance hit (normally loop bodies are not that trivial).
nawfal almost 10 years

@Jon, was it a bug in C# 1.0? Shouldn't ideally Outer be 9?
Jon Skeet almost 10 years

@nawfal: I don't know what you mean. Lambda expressions weren't in 1.0... and Outer is 9. What bug do you mean?
Jon Skeet almost 10 years

@nawfal: My point is that there weren't any language features in C# 1.0 where you could tell the difference between declaring a variable inside a loop and declaring it outside (assuming that both compiled). That changed in C# 2.0. No bug.
nawfal almost 10 years

@JonSkeet Oh yes, I get you now. I completely overlooked the fact that you can't close over variables like that in 1.0, my bad! :)
Daniel Earwicker about 9 years

@Antoops - yes, b is better for reasons that have nothing to do with the data type of the variable being declared. Why would it be different for Strings?
Grantly over 8 years

lol I disagree for so many reasons...However, no down vote... I respect your right to choose
Tomasz Przychodzki about 8 years

Must have been Debug unoptimized compilation then, huh?
user137717 about 8 years

yea, and it's cool that you did this, but this comes back to what people were saying about the language / compiler dependence. I wonder how JIT or interpreted language performance would be affected.
user137717 about 8 years

Yea, but that doesn't amount to much. I ran a simple test with a for loop executing 100 million times and I found that the biggest difference in favor of declaring outside the loop was 8 ms. It was usually more like 3-4 and occasionally declaring outside the loop performed WORSE (up to 4 ms), but that was not typical.
Admin almost 8 years

This doesn't make any sense. It would be better if you provided a statistic. Why on the earth should A be slower than B!?
Leo over 7 years

Bet my finest pound sterling 99% upvoted w/o checking.
Ted Hopp over 7 years

Unless you used a good microprofiling harness, there are all sorts of reasons that A performed better than B that are unrelated to the code differences. Did you try running the tests in reverse order? Did you run warm-up cycles before starting the timed tests? How did you handle the I/O times and did you account for that having its own (sometimes quite substantial) variability?
Holger about 7 years

Showing test results without documenting the setup, is worthless. That’s especially true in this case, where both code fragments produce identical bytecode, so any measured difference is just a sign of insufficient test conditions.
luka over 6 years

1) With int j = 0; for (; j < 0x3e8; j++), both variables are declared only once rather than on each loop cycle. 2) The assignment is faster than all the other options. 3) So the best-practice rule is to put every declaration outside the for loop.
Collin Bell over 5 years

@TedHopp. Your comment stated A performed better than B when in reality B performed better than A.
Ted Hopp over 5 years

@CollinBell - Yes, I got that wrong. However, it makes no difference to the point I was trying to make in my comment.