What is the cost of a function call?

21,408

Solution 1

relative timings (shouldn't be off by more than a factor of 100 ;-)

  • memory-access in cache = 1
  • function call/return in cache = 2
  • memory-access out of cache = 10 .. 300
  • disk access = 1000 .. 1e8 (amortized depends upon the number of bytes transferred)
    • depending mostly upon seek times
    • the transfer itself can be pretty fast
    • involves at least a few thousand ops, since the user/system threshold must be crossed at least twice; an I/O request must be scheduled, the result must be written back; possibly buffers are allocated...
  • network calls = 1000 .. 1e9 (amortized depends upon the number of bytes transferred)
    • same argument as with disk i/o
    • the raw transfer speed can be quite high, but some process on the other computer must do the actual work

Solution 2

A function call is simply a shift of the frame pointer in memory onto the stack and addition of a new frame on top of that. The function parameters are shifted into local registers for use and the stack pointer is advanced to the new top of the stack for execution of the function.

In comparison with time

Function call ~ simple memory access
Function call < Disk Access
Function call < memory access on another computer
Function call < disk access on another computer

Solution 3

Compared to a simple memory access - slightly more, negligible really.

Compared to every thing else listed - orders of magnitude less.

This should hold true for just about any language on any OS.

Solution 4

In general, a function call is going to be slightly slower than memory access since it in fact has to do multiple memory accesses to perform the call. For example, multiple pushes and pops of the stack are required for most function calls using __stdcall on x86. But if your memory access is to a page that isn't even in the L2 cache, the function call can be much faster if the destination and the stack are all in the CPU's memory caches.

For everything else, a function call is many (many) magnitudes faster.

Solution 5

Hard to answer because there are a lot of factors involved.

First of all, "Simple Memory Access" isn't simple. Since at modern clock speeds, a CPU can add two numbers faster than it get a number from one side of the chip to the other (The speed of light -- It's not just a good idea, it's the LAW)

So, is the function being called inside the CPU memory cache? Is the memory access you're comparing it too?

Then we have the function call will clear the CPU instruction pipeline, which will affect speed in a non-deterministic way.

Share:
21,408
FlinkmanSV
Author by

FlinkmanSV

https://bitcoinsv.academy/ https://bsvquickstart.com/ https://www.youtube.com/watch?v=3MJSEGnpgB8 https://www.reddit.com/r/bitcoincashSV/

Updated on March 05, 2020

Comments

  • FlinkmanSV
    FlinkmanSV about 4 years

    Compared to

    • Simple memory access
    • Disk access
    • Memory access on another computer(on the same network)
    • Disk access on another computer(on the same network)

    in C++ on windows.

  • Mladen Janković
    Mladen Janković over 15 years
    Pipelines should not be flushed on modern CPU when call is made. CPU flushes its pipelines only when it wrongly predicts conditional jump (or when interrupt or similar unpredictable event occurs).
  • James Curran
    James Curran over 15 years
    "The law" : The electrical impulses which make up a value in memory, moving at the speed of light, will take several CPU cycles to move the three inches from one end of the CPU to the other.
  • Suma
    Suma over 15 years
    Did you really mean "shift of the stack pointer"? "Shift of the instruction pointer" would see more appropriate to me. I would describe it as: IP is pushed onto stack IP is set to the function address SP is advanced to reserve local variables spaces (may include stack frame creation)
  • jW.
    jW. over 15 years
    Thank you for the clarification! I will update my answer for anyone who browses in the future
  • Martin York
    Martin York over 15 years
    It's the LAW! The definition of a law is that it is meant to be interpreted. Rules on the other hand are absolute. That is why Rugby has Laws and other sports have Rules.
  • Asaf R
    Asaf R over 15 years
    All agreed, except from what I know memory access out of cache is about 200~300 times cache access.
  • mfx
    mfx over 15 years
    Changed. The 10... range is to include first-level caches. Also, i don't want to create the impression that a disk access can be as fast as a memory access; the situation is not that bad ;-)
  • Michael Borgwardt
    Michael Borgwardt about 14 years
    @James: three inches?? I'd like to know what kind of monster CPU you are working with...
  • James Curran
    James Curran about 14 years
    @Michael... OkOkOk... A modern Intel CPU is 1.95"x1.95" meaning the corner to corner distance is only 2.75" --- except the signals aren't moving in a straight line-- but regardless, it's still several cycles at the speed of light.
  • Michael Borgwardt
    Michael Borgwardt about 14 years
    @James: the actual processor die is much smaller than that. The Core i7's die is 263mm^2, which is (assuming a square) about 0.64"x0.64"
  • einpoklum
    einpoklum over 8 years
    In your cache figures - aren't you forgetting the argument handling?