How do exceptions work (behind the scenes) in c++

c++ performance exception throw try-catch

33,733

Solution 1

Instead of guessing, I decided to actually look at the generated code with a small piece of C++ code and a somewhat old Linux install.

class MyException
{
public:
    MyException() { }
    ~MyException() { }
};

void my_throwing_function(bool throwit)
{
    if (throwit)
        throw MyException();
}

void another_function();
void log(unsigned count);

void my_catching_function()
{
    log(0);
    try
    {
        log(1);
        another_function();
        log(2);
    }
    catch (const MyException& e)
    {
        log(3);
    }
    log(4);
}

I compiled it with g++ -m32 -W -Wall -O3 -save-temps -c, and looked at the generated assembly file.

    .file   "foo.cpp"
    .section    .text._ZN11MyExceptionD1Ev,"axG",@progbits,_ZN11MyExceptionD1Ev,comdat
    .align 2
    .p2align 4,,15
    .weak   _ZN11MyExceptionD1Ev
    .type   _ZN11MyExceptionD1Ev, @function
_ZN11MyExceptionD1Ev:
.LFB7:
    pushl   %ebp
.LCFI0:
    movl    %esp, %ebp
.LCFI1:
    popl    %ebp
    ret
.LFE7:
    .size   _ZN11MyExceptionD1Ev, .-_ZN11MyExceptionD1Ev

_ZN11MyExceptionD1Ev is MyException::~MyException(), so the compiler decided it needed a non-inline copy of the destructor.

.globl __gxx_personality_v0
.globl _Unwind_Resume
    .text
    .align 2
    .p2align 4,,15
.globl _Z20my_catching_functionv
    .type   _Z20my_catching_functionv, @function
_Z20my_catching_functionv:
.LFB9:
    pushl   %ebp
.LCFI2:
    movl    %esp, %ebp
.LCFI3:
    pushl   %ebx
.LCFI4:
    subl    $20, %esp
.LCFI5:
    movl    $0, (%esp)
.LEHB0:
    call    _Z3logj
.LEHE0:
    movl    $1, (%esp)
.LEHB1:
    call    _Z3logj
    call    _Z16another_functionv
    movl    $2, (%esp)
    call    _Z3logj
.LEHE1:
.L5:
    movl    $4, (%esp)
.LEHB2:
    call    _Z3logj
    addl    $20, %esp
    popl    %ebx
    popl    %ebp
    ret
.L12:
    subl    $1, %edx
    movl    %eax, %ebx
    je  .L16
.L14:
    movl    %ebx, (%esp)
    call    _Unwind_Resume
.LEHE2:
.L16:
.L6:
    movl    %eax, (%esp)
    call    __cxa_begin_catch
    movl    $3, (%esp)
.LEHB3:
    call    _Z3logj
.LEHE3:
    call    __cxa_end_catch
    .p2align 4,,3
    jmp .L5
.L11:
.L8:
    movl    %eax, %ebx
    .p2align 4,,6
    call    __cxa_end_catch
    .p2align 4,,6
    jmp .L14
.LFE9:
    .size   _Z20my_catching_functionv, .-_Z20my_catching_functionv
    .section    .gcc_except_table,"a",@progbits
    .align 4
.LLSDA9:
    .byte   0xff
    .byte   0x0
    .uleb128 .LLSDATT9-.LLSDATTD9
.LLSDATTD9:
    .byte   0x1
    .uleb128 .LLSDACSE9-.LLSDACSB9
.LLSDACSB9:
    .uleb128 .LEHB0-.LFB9
    .uleb128 .LEHE0-.LEHB0
    .uleb128 0x0
    .uleb128 0x0
    .uleb128 .LEHB1-.LFB9
    .uleb128 .LEHE1-.LEHB1
    .uleb128 .L12-.LFB9
    .uleb128 0x1
    .uleb128 .LEHB2-.LFB9
    .uleb128 .LEHE2-.LEHB2
    .uleb128 0x0
    .uleb128 0x0
    .uleb128 .LEHB3-.LFB9
    .uleb128 .LEHE3-.LEHB3
    .uleb128 .L11-.LFB9
    .uleb128 0x0
.LLSDACSE9:
    .byte   0x1
    .byte   0x0
    .align 4
    .long   _ZTI11MyException
.LLSDATT9:

Surprise! There are no extra instructions at all on the normal code path. The compiler instead generated extra out-of-line fixup code blocks, referenced via a table at the end of the function (which is actually put on a separate section of the executable). All the work is done behind the scenes by the standard library, based on these tables (_ZTI11MyException is typeinfo for MyException).

OK, that was not actually a surprise for me, I already knew how this compiler did it. Continuing with the assembly output:

    .text
    .align 2
    .p2align 4,,15
.globl _Z20my_throwing_functionb
    .type   _Z20my_throwing_functionb, @function
_Z20my_throwing_functionb:
.LFB8:
    pushl   %ebp
.LCFI6:
    movl    %esp, %ebp
.LCFI7:
    subl    $24, %esp
.LCFI8:
    cmpb    $0, 8(%ebp)
    jne .L21
    leave
    ret
.L21:
    movl    $1, (%esp)
    call    __cxa_allocate_exception
    movl    $_ZN11MyExceptionD1Ev, 8(%esp)
    movl    $_ZTI11MyException, 4(%esp)
    movl    %eax, (%esp)
    call    __cxa_throw
.LFE8:
    .size   _Z20my_throwing_functionb, .-_Z20my_throwing_functionb

Here we see the code for throwing an exception. While there was no extra overhead simply because an exception might be thrown, there is obviously a lot of overhead in actually throwing and catching an exception. Most of it is hidden within __cxa_throw, which must:

Walk the stack with the help of the exception tables until it finds a handler for that exception.
Unwind the stack until it gets to that handler.
Actually call the handler.

Compare that with the cost of simply returning a value, and you see why exceptions should be used only for exceptional returns.

To finish, the rest of the assembly file:

    .weak   _ZTI11MyException
    .section    .rodata._ZTI11MyException,"aG",@progbits,_ZTI11MyException,comdat
    .align 4
    .type   _ZTI11MyException, @object
    .size   _ZTI11MyException, 8
_ZTI11MyException:
    .long   _ZTVN10__cxxabiv117__class_type_infoE+8
    .long   _ZTS11MyException
    .weak   _ZTS11MyException
    .section    .rodata._ZTS11MyException,"aG",@progbits,_ZTS11MyException,comdat
    .type   _ZTS11MyException, @object
    .size   _ZTS11MyException, 14
_ZTS11MyException:
    .string "11MyException"

The typeinfo data.

    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long   .LECIE1-.LSCIE1
.LSCIE1:
    .long   0x0
    .byte   0x1
    .string "zPL"
    .uleb128 0x1
    .sleb128 -4
    .byte   0x8
    .uleb128 0x6
    .byte   0x0
    .long   __gxx_personality_v0
    .byte   0x0
    .byte   0xc
    .uleb128 0x4
    .uleb128 0x4
    .byte   0x88
    .uleb128 0x1
    .align 4
.LECIE1:
.LSFDE3:
    .long   .LEFDE3-.LASFDE3
.LASFDE3:
    .long   .LASFDE3-.Lframe1
    .long   .LFB9
    .long   .LFE9-.LFB9
    .uleb128 0x4
    .long   .LLSDA9
    .byte   0x4
    .long   .LCFI2-.LFB9
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI3-.LCFI2
    .byte   0xd
    .uleb128 0x5
    .byte   0x4
    .long   .LCFI5-.LCFI3
    .byte   0x83
    .uleb128 0x3
    .align 4
.LEFDE3:
.LSFDE5:
    .long   .LEFDE5-.LASFDE5
.LASFDE5:
    .long   .LASFDE5-.Lframe1
    .long   .LFB8
    .long   .LFE8-.LFB8
    .uleb128 0x4
    .long   0x0
    .byte   0x4
    .long   .LCFI6-.LFB8
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI7-.LCFI6
    .byte   0xd
    .uleb128 0x5
    .align 4
.LEFDE5:
    .ident  "GCC: (GNU) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)"
    .section    .note.GNU-stack,"",@progbits

Even more exception handling tables, and assorted extra information.

So, the conclusion, at least for GCC on Linux: the cost is extra space (for the handlers and tables) whether or not exceptions are thrown, plus the extra cost of parsing the tables and executing the handlers when an exception is thrown. If you use exceptions instead of error codes, and an error is rare, it can be faster, since you do not have the overhead of testing for errors anymore.

In case you want more information, in particular what all the __cxa_ functions do, see the original specification they came from:

Itanium C++ ABI

Solution 2

Exceptions being slow was true in the old days.
In most modern compiler this no longer holds true.

Note: Just because we have exceptions does not mean we do not use error codes as well. When error can be handled locally use error codes. When errors require more context for correction use exceptions: I wrote it much more eloquently here: What are the principles guiding your exception handling policy?

The cost of exception handling code when no exceptions are being used is practically zero.

When an exception is thrown there is some work done.
But you have to compare this against the cost of returning error codes and checking them all the way back to to point where the error can be handled. Both more time consuming to write and maintain.

Also there is one gotcha for novices:
Though Exception objects are supposed to be small some people put lots of stuff inside them. Then you have the cost of copying the exception object. The solution there is two fold:

Don't put extra stuff in your exception.
Catch by const reference.

In my opinion I would bet that the same code with exceptions is either more efficient or at least as comparable as the code without the exceptions (but has all the extra code to check function error results). Remember you are not getting anything for free the compiler is generating the code you should have written in the first place to check error codes (and usually the compiler is much more efficient than a human).

Solution 3

There are a number of ways you could implement exceptions, but typically they will rely on some underlying support from the OS. On Windows this is the structured exception handling mechanism.

There is decent discussion of the details on Code Project: How a C++ compiler implements exception handling

The overhead of exceptions occurs because the compiler has to generate code to keep track of which objects must be destructed in each stack frame (or more precisely scope) if an exception propagates out of that scope. If a function has no local variables on the stack that require destructors to be called then it should not have a performance penalty wrt exception handling.

Using a return code can only unwind a single level of the stack at a time, whereas an exception handling mechanism can jump much further back down the stack in one operation if there is nothing for it to do in the intermediate stack frames.

Solution 4

Matt Pietrek wrote an excellent article on Win32 Structured Exception Handling. While this article was originally written in 1997, it still applies today (but of course only applies to Windows).

Solution 5

This article examines the issue and basically finds that in practice there is a run-time cost to exceptions, although the cost is fairly low if the exception isn't thrown. Good article, recommended.

View more solutions

33,733

Author by

pro-gramer

Updated on July 31, 2022

Comments

pro-gramer almost 2 years

I keep seeing people say that exceptions are slow, but I never see any proof. So, instead of asking if they are, I will ask how do exceptions work behind the scenes, so I can make decisions of when to use them and whether they are slow.

From what I know, exceptions are the same as doing a return bunch of times, except that it also checks after each return whether it needs to do another one or to stop. How does it check when to stop returning? I guess there is a second stack that holds the type of the exception and a stack location, it then does returns until it gets there. I am also guessing that the only time this second stack is touched is on a throw and on each try/catch. AFAICT implementing a similar behaviour with return codes would take the same amount of time. But this is all just a guess, so I want to know what really happens.

How do exceptions really work?
- Mars almost 13 years
  
  Also: stackoverflow.com/questions/1331220/…
Martin York over 15 years

So summary. Not cost if no exceptions are thrown. Some cost when a exception is thrown, but the question is 'Is this cost greater than use and testing error codes all the way back to error handling code'.
Admin over 15 years

"The overhead of exceptions occurs because the compiler has to generate code to keep track of which objects must be destructed in each stack frame (or more precisely scope)" Doesnt the compiler have to do that anyways to destruct objects from a return?
MSalters over 15 years

No. Given a stack with return addresses and a table, the compiler can determine which functions are on the stack. From that, which objects must have been on the stack. This can be done after the exception is thrown. A bit expensive, but only needed when an exception is actually thrown.
MSalters over 15 years

Error costs are indeed likely greater. The exception code is quite possibly still on disk! Since the error handling code is removed from the normal code, the cache behavior in non-error cases improves.
Uhli almost 13 years

I agree in common, but in some cases exceptions may simplify the code. Think about error handling in constructors... - the other ways would be a) return error codes by reference parameters or b) set global variables
supercat over 12 years

On some processors, such as the ARM, returning to an address eight "extra" bytes past a "bl" [branch-and-link, also known as "call"] instruction would cost the same as returning to the address immediately following the "bl". I wonder how the efficiency of simply having each "bl" followed by the address of an "incoming exception" handler would compare with that of a table-based approach, and whether any compilers do such a thing. The biggest danger I can see would be that mismatched calling conventions could cause wacky behavior.
CesarB over 12 years

@supercat: you are polluting your I-cache with exception handling code that way. There is a reason the exception handling code and tables tend to be far away from the normal code, after all.
supercat over 12 years

@CesarB: One instruction word following each call. Doesn't seem too outrageous, especially given that techniques for exception handling using only "outside" code generally require that code maintain a valid frame pointer at all times (which in some cases may require 0 extra instructions, but in others may require more than one).
speedplane over 8 years

I would bet that people hesitate using exceptions, not because of any perceived slowness, but because they don't know how they're implemented and what they're doing to your code. The fact that they seem like magic irks a lot of the close-to-the-metal types.
Martin York over 8 years

@speedplane:I suppose. But the whole point of compilers is so that we don't need to understand the hardware (it provides an abstraction layer). With modern compilers I doubt if you could find a single person who understand every facet of a modern C++ compiler. So why is understanding exceptions different from understanding complex feature X.
speedplane over 8 years

You always need to have some idea of what the hardware is doing, it's a matter of degree. Many that are using C++ (over Java or a scripted language) are often doing so for performance. For them, the abstraction layer should be relatively transparent, so that you have some idea of what's going on in the metal.
Martin York over 8 years

@speedplane: Then they should be using C where the abstraction layer is much thinner by design.
Dmytro about 6 years

hilarious, i was just wondering to myself "wouldn't it be cool if each stack frame kept track of number of objects in it, their types, names, so that my function could dig the stack and see what scopes it inherited during debugging", and in a way, this does something like that, but without manually always declaring a table as the first variable of every scope.
artless noise over 4 years

This is great info. However, I would point out that the eh_frame needs to be stored somewhere. For a PC with demand paging, it is a win to have this out of the normal code path. However, some systems don't have a disk or MMU, so exception tables must be stored in RAM. They are very large. In this simple code sample, they are the about the same order as the code itself! This is not a criticism of the post or C++! However, I have seem many people wonder why exceptions didn't catch on in embedded systems. Tables sizes could be reduced so they aren't restrictive on some systems.
flowit almost 4 years

One thing that is still a bit unclear to me is how it is actually checked weather an exception was thrown. Inside a try block, basically any statement could potentially throw (ignoring noexcept for now). Is is that a CPU flag like overflow is set?