Is there a performance difference between i++ and ++i in C++?

c++ performance oop post-increment pre-increment

96,350

Solution 1

[Executive Summary: Use ++i if you don't have a specific reason to use i++.]

For C++, the answer is a bit more complicated.

If i is a simple type (not an instance of a C++ class), then the answer given for C ("No there is no performance difference") holds, since the compiler is generating the code.

However, if i is an instance of a C++ class, then i++ and ++i are making calls to one of the operator++ functions. Here's a standard pair of these functions:

Foo& Foo::operator++()   // called for ++i
{
    this->data += 1;
    return *this;
}

Foo Foo::operator++(int ignored_dummy_value)   // called for i++
{
    Foo tmp(*this);   // variable "tmp" cannot be optimized away by the compiler
    ++(*this);
    return tmp;
}

Since the compiler isn't generating code, but just calling an operator++ function, there is no way to optimize away the tmp variable and its associated copy constructor. If the copy constructor is expensive, then this can have a significant performance impact.

Solution 2

Yes. There is.

The ++ operator may or may not be defined as a function. For primitive types (int, double, ...) the operators are built in, so the compiler will probably be able to optimize your code. But in the case of an object that defines the ++ operator things are different.

The operator++(int) function must create a copy. That is because postfix ++ is expected to return a different value than what it holds: it must hold its value in a temp variable, increment its value and return the temp. In the case of operator++(), prefix ++, there is no need to create a copy: the object can increment itself and then simply return itself.

Here is an illustration of the point:

struct C
{
    C& operator++();      // prefix
    C  operator++(int);   // postfix

private:

    int i_;
};

C& C::operator++()
{
    ++i_;
    return *this;   // self, no copy created
}

C C::operator++(int ignored_dummy_value)
{
    C t(*this);
    ++(*this);
    return t;   // return a copy
}

Every time you call operator++(int) you must create a copy, and the compiler can't do anything about it. When given the choice, use operator++(); this way you save a copy. It might be significant in the case of many increments (large loop?) and/or large objects.

Solution 3

Here's a benchmark for the case when the increment operators are defined in a different translation unit. Compiled with g++ 4.5.

Ignore the style issues for now.

// a.cc
// PACKET_SIZE is supplied on the command line via -DPACKET_SIZE=...
#include <array>
#include <ctime>
#include <iostream>   // std::cout

class Something {
public:
    Something& operator++();     // defined in the other translation unit
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};

int main () {
    Something s{};   // value-initialize so the array starts zeroed

    for (int i=0; i<1024*1024*30; ++i) ++s; // warm up
    std::clock_t a = std::clock();
    for (int i=0; i<1024*1024*30; ++i) ++s;
    a = std::clock() - a;

    for (int i=0; i<1024*1024*30; ++i) s++; // warm up
    std::clock_t b = std::clock();
    for (int i=0; i<1024*1024*30; ++i) s++;
    b = std::clock() - b;

    std::cout << "a=" << (a/double(CLOCKS_PER_SEC))
              << ", b=" << (b/double(CLOCKS_PER_SEC)) << '\n';
    return 0;
}

O(n) increment

Test

// b.cc
#include <array>
class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};


Something& Something::operator++()
{
    for (auto it=data.begin(), end=data.end(); it!=end; ++it)
        ++*it;
    return *this;
}

Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}

Results

Results (timings are in seconds) with g++ 4.5 on a virtual machine:

Flags (--std=c++0x)       ++i   i++
-DPACKET_SIZE=50 -O1      1.70  2.39
-DPACKET_SIZE=50 -O3      0.59  1.00
-DPACKET_SIZE=500 -O1    10.51 13.28
-DPACKET_SIZE=500 -O3     4.28  6.82

O(1) increment

Test

Let us now take the following file:

// c.cc
#include <array>
class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};


Something& Something::operator++()
{
    return *this;
}

Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}

Here the prefix operator++ does nothing at all. This simulates the case where incrementing has constant complexity.

Results

The results now diverge dramatically:

Flags (--std=c++0x)       ++i   i++
-DPACKET_SIZE=50 -O1      0.05   0.74
-DPACKET_SIZE=50 -O3      0.08   0.97
-DPACKET_SIZE=500 -O1     0.05   2.79
-DPACKET_SIZE=500 -O3     0.08   2.18
-DPACKET_SIZE=5000 -O3    0.07  21.90

Conclusion

Performance-wise

If you do not need the previous value, make it a habit to use pre-increment. Be consistent even with builtin types; you'll get used to it and won't run the risk of an unnecessary performance loss if you ever replace a builtin type with a custom type.

Semantic-wise

i++ says: "increment i; I am interested in the previous value, though."
++i says: "increment i; I am interested in the current value" or "increment i; no interest in the previous value." Again, you'll get used to it, even if you are not right now.

Knuth.

Premature optimization is the root of all evil. As is premature pessimization.

Solution 4

It's not entirely correct to say that the compiler can't optimize away the temporary variable copy in the postfix case. A quick test with VC shows that it, at least, can do that in certain cases.

In the following example, for instance, the generated code is identical for prefix and postfix:

#include <stdio.h>

class Foo
{
public:

    Foo() { myData=0; }
    Foo(const Foo &rhs) { myData=rhs.myData; }

    const Foo& operator++()
    {
        this->myData++;
        return *this;
    }

    const Foo operator++(int)
    {
        Foo tmp(*this);
        this->myData++;
        return tmp;
    }

    int GetData() { return myData; }

private:

    int myData;
};

int main(int argc, char* argv[])
{
    Foo testFoo;

    int count;
    printf("Enter loop count: ");
    scanf("%d", &count);

    for(int i=0; i<count; i++)
    {
        testFoo++;
    }

    printf("Value: %d\n", testFoo.GetData());
}

Whether you do ++testFoo or testFoo++, you'll still get the same resulting code. In fact, without reading the count in from the user, the optimizer got the whole thing down to a constant. So this:

for(int i=0; i<10; i++)
{
    testFoo++;
}

printf("Value: %d\n", testFoo.GetData());

Resulted in the following:

00401000  push        0Ah  
00401002  push        offset string "Value: %d\n" (402104h) 
00401007  call        dword ptr [__imp__printf (4020A0h)]

So while it's certainly the case that the postfix version could be slower, it may well be that the optimizer will be good enough to get rid of the temporary copy if you're not using it.

Solution 5

The Google C++ Style Guide says:

Preincrement and Predecrement

Use prefix form (++i) of the increment and decrement operators with iterators and other template objects.

Definition: When a variable is incremented (++i or i++) or decremented (--i or i--) and the value of the expression is not used, one must decide whether to preincrement (decrement) or postincrement (decrement).

Pros: When the return value is ignored, the "pre" form (++i) is never less efficient than the "post" form (i++), and is often more efficient. This is because post-increment (or decrement) requires a copy of i to be made, which is the value of the expression. If i is an iterator or other non-scalar type, copying i could be expensive. Since the two types of increment behave the same when the value is ignored, why not just always pre-increment?

Cons: The tradition developed, in C, of using post-increment when the expression value is not used, especially in for loops. Some find post-increment easier to read, since the "subject" (i) precedes the "verb" (++), just like in English.

Decision: For simple scalar (non-object) values there is no reason to prefer one form and we allow either. For iterators and other template types, use pre-increment.


Author by

Mark Harrison


Updated on February 10, 2022

Comments

Mark Harrison about 2 years

We have the question "Is there a performance difference between i++ and ++i in C?"

What's the answer for C++?
Blaisorblade over 15 years

You forgot to note the important point that here everything is inlined. If the definitions of the operators are not available, the copy done in the out-of-line code cannot be avoided; with inlining the optimization is quite obvious, so any compiler will do it.
Blaisorblade over 15 years

That's not relevant. NRVO avoids the need to copy t in "C C::operator++(int)" back to the caller, but i++ will still copy the old value on the stack of the caller. Without NRVO, i++ creates 2 copies, one to t and one back to the caller.
Blaisorblade over 15 years

What the compiler can avoid is the second copy to return tmp, by allocating tmp in the caller, through NRVO, as mentioned by another comment.
Joe Phillips over 15 years

i++ is one processor instruction slower
Eduard - Gabriel Munteanu about 15 years

Can't the compiler avoid this altogether if operator++ is inlined?
Mike Dunlavey over 14 years

I see you're working on a Ph.D. with interest in compiler optimization and things of that sort. That's great, but don't forget academia is an echo chamber, and common sense often gets left outside the door, at least in C.S. You might be interested in this: stackoverflow.com/questions/1303899/…
Mike Dunlavey over 14 years

Sorry, but that bothers me. Who says it's a "good habit", when it almost never matters? If people want to make it part of their discipline, that's fine, but let's distinguish significant reasons from matters of personal taste.
Zan Lynx over 14 years

Yes if operator++ is inlined and tmp is never used it can be removed unless the tmp object's constructor or destructor has side effects.
Nosredna over 14 years

"Decision: For simple scalar (non-object) values there is no reason to prefer one form and we allow either. For iterators and other template types, use pre-increment."
John Zwinck over 14 years

Mark: "this->data" is in one sample and "this->mydata" in the other...can you fix that up?
kriss about 14 years

C++ behavior is not so different from C's. If you think in terms of memory or register assignments, what C++ shows is what a C compiler has to do behind the scenes. Also many C++ compilers will use operator++() if no operator++(int) is defined (and issue a warning).
kriss about 14 years

@Motti: (joking) The C++ name is logical if you recall that Bjarne Stroustrup initially implemented C++ as a pre-compiler generating a C program. Hence C++ returned an old C value. Or it may be to emphasize that C++ is somewhat conceptually flawed from the beginning.
Sebastian Mach about 12 years

I never found ++i more annoying than i++ (in fact, I found it cooler), but the rest of your post gets my full acknowledgement. Maybe add a point "premature optimization is evil, as is premature pessimization"
Blaisorblade about 12 years

@kriss: the difference between C and C++ is that in C you have a guarantee that the operator will be inlined, and at that point a decent optimizer will be able to remove the difference; instead in C++ you can't assume inlining - not always.
phonetagger almost 12 years

I'd +1 IF the answer mentioned something about classes that hold pointers (whether auto, smart, or primitive) to dynamically-allocated (heap) memory, where the copy constructor necessarily performs deep copies. In such cases, there is no argument: ++i is perhaps an order of magnitude more efficient than i++. The key is to get in the habit of using pre-increment whenever post-increment semantics are not actually required by your algorithm, and you'll then be in the habit of writing code that by nature lends itself to greater efficiency, regardless of how well your compiler can optimize.
corazza over 9 years

For anyone who might be confused about the dummy argument in operator++(int), it's just used to distinguish between the two forms. Kind of sloppy if you ask me.
chew socks over 9 years

Interesting test. Now, almost two and a half years later, gcc 4.9 and Clang 3.4 show a similar trend. Clang is a bit faster with both, but the disparity between pre and postfix is worse than gcc.
M.M over 9 years

strncpy served a purpose in the filesystems they were using at the time; the filename was an 8-character buffer and it did not have to be null-terminated. You can't blame them for not seeing 40 years into the future of language evolution.
Blaisorblade over 9 years

@MattMcNabb: weren't 8 characters filename a MS-DOS exclusive? C was invented with Unix. Anyway, even if strncpy had a point, the lack of strlcpy wasn't fully justified: even original C had arrays which you shouldn't overflow, which needed strlcpy; at most, they were only missing attackers intent on exploiting the bugs. But one can't say that forecasting this problem was trivial, so if I rewrote my post I wouldn't use the same tone.
Sebastian Mach over 9 years

Eh, ..., and what is that something?
Wiggler Jtag about 9 years

It's C, while OP asked C++. In C it is the same. In C++ the faster is ++i; due to its object. However some compilers may optimize the post-increment operator.
Keith Thompson about 9 years

@Blaisorblade: As I recall, early UNIX file names were limited to 14 characters. The lack of strlcpy() was justified by the fact that it hadn't been invented yet.
Jakob Schou Jensen almost 9 years

What I would really like to see is a real-world example where ++i / i++ makes a difference. For instance, does it make a difference on any of the std iterators?
Sebastian Mach almost 9 years

@JakobSchouJensen: These were pretty intended to be real world examples. Consider a large application, with complex tree structures (e.g. kd-trees, quad-trees) or large containers used in expression templates (in order to maximise data throughput on SIMD hardware). If it makes a difference there, I am not really sure why one would fallback to post-increment for specific cases if that's not needed semantic-wise.
Jakob Schou Jensen almost 9 years

@phresnel: I don't think operator++ shows up in your everyday expression template - do you have an actual example of this? The typical use of operator++ is on integers and iterators. That's where I think it would be interesting to know if there is any difference (there's no difference on integers of course, but what about iterators?).
Sebastian Mach almost 9 years

@JakobSchouJensen: No actual business example, but some number crunching applications where you count stuff. Wrt iterators, consider a ray tracer that is written in idiomatic C++ style, and you have an iterator for depth-first traversal, such that for (it=nearest(ray.origin); it!=end(); ++it) { if (auto i = intersect(ray, *it)) return i; }, never mind the actual tree structure (BSP, kd, Quadtree, Octree Grid, etc.). Such an iterator would need to maintain some state, e.g. parent node, child node, index and stuff like that. All in all, my stance is, even if only few examples exist, ...
Sebastian Mach almost 9 years

post increment, if you mean it, and use pre increment, if you mean it (but I am sure you are not questioning this :))
Pedro Lamarão almost 8 years

Can anyone comment on this topic if this is std::atomic<int> or similar?
rubenvb almost 7 years

I don't see how in C you can guarantee inlining where in C++ you cannot?
Matthias over 6 years

"The pre increment operator introduces a data dependency in the code: the CPU must wait for the increment operation to be completed before its value can be used in the expression. On a deeply pipelined CPU, this introduces a stall. There is no data dependency for the post increment operator." (Game Engine Architecture (2nd edition)) So if the copy of a post increment is not computationally intensive, it can still beat the pre increment.
rasen58 over 6 years

In the postfix code, how does this work C t(*this); ++(*this); return t; In the second line, you're incrementing the this pointer right, so how does t get updated if you're incrementing this. Weren't the values of this already copied into t?
ShadowRanger about 6 years

@rubenvb: In C, there is no such thing as operator overloading, so ++ (whether pre- or post-) is never a function call; it's not so much that inlining is guaranteed as that there is nothing that is conceivably "out of line" in the first place. Both versions of ++ do a consistent thing: Increment a numeric or pointer type. pre- vs. post- just determines whether it's done before or after the value is used in the expression that contains it, but copying isn't needed; there is always a single consistent value for the variable, with the increment placed before or after point of use.
ShadowRanger about 6 years

C++ can't make that guarantee because operator overloading means function call semantics are in play; the difference between ++x and x++ is which function is called (operator++() or operator++(int)), not when the function is called. If the C++ standard said to define operator++() only, and simply said it would be invoked before or after the expression in which it was found depending on ordering, the C++ would match C's behavior (because you'd only be using a non-copying increment function w/compiler controlled timing), but it doesn't, so you can only avoid the copy when inlined.
ShadowRanger about 6 years

@PedroLamarão: For std::atomic (with integral and pointer types), both increment overloads return by value, not reference (pre- is normally by ref), but they return a non-atomic value, so the "copy" is free. Odds are, both incur identical costs; functionally, they're the same operation (a fetch_add, which returns the original value) where pre-increment returns fetch_add's result plus 1, while post-increment returns fetch_add's result directly. Compared to the overhead of even CPU supported atomic adds, the non-atomic increment of fetch_add's result is going to be lost in the noise.
Puddle over 5 years

@MikeDunlavey ok, so which side do you normally use when it doesn't matter? xD it's either one or the other ain't it! the post++ (if you're using it with the general meaning. update it, return the old) is completely inferior to ++pre (update it, return) there's never any reason you'd want to have less performance. in the case where you'd want to update it after, the programmer won't even do the post++ at all then. no wasting time copying when we already have it. update it after we use it. then the compilers having the common sense you wanted it to have.
Mike Dunlavey over 5 years

@Puddle: When I hear this: "there's never any reason you'd want to have less performance" I know I'm hearing "penny wise - pound foolish". You need to have an appreciation of the magnitudes involved. Only if this accounts for more than 1% of the time involved should you even give it a thought. Usually, if you're thinking about this, there are million-times larger problems you're not considering, and this is what makes software much much slower than it could be.
Puddle over 5 years

@MikeDunlavey regurgitated nonsense to satisfy your ego. you're trying to sound like some all wise monk, yet you're saying nothing. the magnitudes involved... if only over 1% of the time you should care... xD absolute dribble. if it's inefficient, it's worth knowing about and fixing. we're here pondering this for that exact reason! we're not concerned about how much we may gain from this knowledge. and when i said you'd not want less performance, go ahead, explain one damn scenario then. MR WISE!
Severin Pappadeux about 5 years

The operator++(int) function must create a copy. no, it is not. No more copies than operator++()
karol almost 5 years

The mentioned link in the answer is currently broken