How to Calculate Execution Time of a Code Snippet in C++

c++ benchmarking

173,872

Solution 1

You can use this function I wrote. You call GetTimeMs64(), and it returns the number of milliseconds elapsed since the UNIX epoch according to the system clock, just like time(NULL), except in milliseconds.

It works on both Windows and Linux, and it is thread safe.

Note that the granularity is 15 ms on Windows; on Linux it is implementation dependent, but it is usually 15 ms as well.

#ifdef _WIN32
#include <Windows.h>
#else
#include <sys/time.h>
#include <ctime>
#endif

/* Remove if already defined */
typedef long long int64; typedef unsigned long long uint64;

/* Returns the amount of milliseconds elapsed since the UNIX epoch. Works on both
 * windows and linux. */

uint64 GetTimeMs64()
{
#ifdef _WIN32
 /* Windows */
 FILETIME ft;
 LARGE_INTEGER li;

 /* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
  * to a LARGE_INTEGER structure. */
 GetSystemTimeAsFileTime(&ft);
 li.LowPart = ft.dwLowDateTime;
 li.HighPart = ft.dwHighDateTime;

 uint64 ret = li.QuadPart;
 ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
 ret /= 10000; /* From 100 nano seconds (10^-7) to 1 millisecond (10^-3) intervals */

 return ret;
#else
 /* Linux */
 struct timeval tv;

 gettimeofday(&tv, NULL);

 uint64 ret = tv.tv_usec;
 /* Convert from micro seconds (10^-6) to milliseconds (10^-3) */
 ret /= 1000;

 /* Adds the seconds (10^0) after converting them to milliseconds (10^-3) */
 ret += ((uint64)tv.tv_sec * 1000); /* Cast first so the multiplication cannot overflow a 32-bit time_t */

 return ret;
#endif
}
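If a C++11 compiler is available, a similar value can be obtained without the platform #ifdef. The following is only a sketch, not part of the original answer: GetTimeMs64Chrono and ElapsedMs are hypothetical names, and it assumes that std::chrono::system_clock counts from the UNIX epoch, which holds on mainstream implementations but is only guaranteed by the standard from C++20 on.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical C++11 counterpart of GetTimeMs64(): milliseconds since the
// system_clock epoch (assumed here to be the UNIX epoch).
inline std::uint64_t GetTimeMs64Chrono()
{
    using namespace std::chrono;
    const auto since_epoch = system_clock::now().time_since_epoch();
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(since_epoch).count());
}

// Elapsed milliseconds between two readings. Assumes the wall clock was not
// set backwards between the two calls.
inline std::uint64_t ElapsedMs(std::uint64_t start_ms, std::uint64_t end_ms)
{
    assert(end_ms >= start_ms);
    return end_ms - start_ms;
}
```

Take one reading before the code under test and one after it, then pass both to ElapsedMs. Because this reads the wall clock, a clock adjustment during the measurement will distort the result.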

Solution 2

I have another working example that uses microseconds (UNIX, POSIX, etc).

    #include <sys/time.h>
    typedef unsigned long long timestamp_t;

    static timestamp_t
    get_timestamp ()
    {
      struct timeval now;
      gettimeofday (&now, NULL);
      return  now.tv_usec + (timestamp_t)now.tv_sec * 1000000;
    }

    ...
    timestamp_t t0 = get_timestamp();
    // Process
    timestamp_t t1 = get_timestamp();

    double secs = (t1 - t0) / 1000000.0L;

Here's the file where we coded this:

https://github.com/arhuaco/junkcode/blob/master/emqbit-bench/bench.c

Solution 3

Here is a simple solution in C++11 that gives you satisfactory resolution.

#include <iostream>
#include <chrono>

class Timer
{
public:
    Timer() : beg_(clock_::now()) {}
    void reset() { beg_ = clock_::now(); }
    double elapsed() const { 
        return std::chrono::duration_cast<second_>
            (clock_::now() - beg_).count(); }

private:
    typedef std::chrono::high_resolution_clock clock_;
    typedef std::chrono::duration<double, std::ratio<1> > second_;
    std::chrono::time_point<clock_> beg_;
};

Or, on *nix, for C++03:

#include <iostream>
#include <ctime>

class Timer
{
public:
    Timer() { clock_gettime(CLOCK_REALTIME, &beg_); }

    double elapsed() {
        clock_gettime(CLOCK_REALTIME, &end_);
        return end_.tv_sec - beg_.tv_sec +
            (end_.tv_nsec - beg_.tv_nsec) / 1000000000.;
    }

    void reset() { clock_gettime(CLOCK_REALTIME, &beg_); }

private:
    timespec beg_, end_;
};

Here is the example usage:

int main()
{
    Timer tmr;
    double t = tmr.elapsed();
    std::cout << t << std::endl;

    tmr.reset();
    t = tmr.elapsed();
    std::cout << t << std::endl;

    return 0;
}

From https://gist.github.com/gongzhitaao/7062087
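For a snippet that finishes faster than the clock can resolve, a single measurement often reads as zero. One common workaround is to run the snippet many times and divide the total by the repetition count. The sketch below illustrates this with std::chrono::steady_clock; average_seconds is a hypothetical helper, not part of the gist above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `snippet` `iterations` times and returns the
// average wall-clock time per call, in seconds.
template <typename F>
double average_seconds(F snippet, long iterations)
{
    assert(iterations > 0);
    typedef std::chrono::steady_clock clock_type;

    const clock_type::time_point begin = clock_type::now();
    for (long i = 0; i < iterations; ++i)
        snippet();
    // duration<double> holds seconds as a floating-point value.
    const std::chrono::duration<double> total = clock_type::now() - begin;

    return total.count() / iterations;
}
```

It can be called with a lambda, for example average_seconds([&]{ sink = sink + 1; }, 1000000), where sink is a volatile variable so that an optimizing compiler cannot drop the loop body. The average includes the loop and call overhead, so treat it as an upper bound for very small snippets.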

Solution 4

#include <boost/progress.hpp>

using namespace boost;

int main (int argc, const char * argv[])
{
  progress_timer timer;

  // do stuff, preferably in a 100x loop to make it take longer.

  return 0;
}

When progress_timer goes out of scope it will print out the time elapsed since its creation.

UPDATE: Here's a version that works without Boost (tested on macOS/iOS):

#include <chrono>
#include <string>
#include <iostream>
#include <math.h>
#include <unistd.h>

class NLTimerScoped {
private:
    const std::chrono::steady_clock::time_point start;
    const std::string name;

public:
    // Initializers are listed in declaration order (start, then name) to avoid -Wreorder warnings.
    NLTimerScoped( const std::string & name ) : start( std::chrono::steady_clock::now() ), name( name ) {
    }


    ~NLTimerScoped() {
        const auto end(std::chrono::steady_clock::now());
        const auto duration_ms = std::chrono::duration_cast<std::chrono::milliseconds>( end - start ).count();

        std::cout << name << " duration: " << duration_ms << "ms" << std::endl;
    }

};

int main(int argc, const char * argv[]) {

    {
        NLTimerScoped timer( "sin sum" );

        float a = 0.0f;

        for ( int i=0; i < 1000000; i++ ) {
            a += sin( (float) i / 100 );
        }

        std::cout << "sin sum = " << a << std::endl;
    }



    {
        NLTimerScoped timer( "sleep( 4 )" );

        sleep( 4 );
    }



    return 0;
}

Solution 5

Windows provides the QueryPerformanceCounter() function, and Unix has gettimeofday(). Both functions can measure differences of at least 1 microsecond.
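A minimal sketch of how the two calls could be wrapped behind one function follows. The names now_us and seconds_between are hypothetical, and error handling for the Windows calls is omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

// Hypothetical helper: a timestamp in microseconds. The starting point
// differs per platform, so only differences between two readings are meaningful.
inline long long now_us()
{
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);  // counter ticks per second
    QueryPerformanceCounter(&count);   // current tick count
    // Convert whole seconds and the remainder separately to avoid overflow.
    return (count.QuadPart / freq.QuadPart) * 1000000LL
         + (count.QuadPart % freq.QuadPart) * 1000000LL / freq.QuadPart;
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return static_cast<long long>(tv.tv_sec) * 1000000LL + tv.tv_usec;
#endif
}

// Seconds elapsed between two readings taken with now_us().
inline double seconds_between(long long t0_us, long long t1_us)
{
    assert(t1_us >= t0_us);
    return (t1_us - t0_us) / 1000000.0;
}
```

Take one reading before the code under test and one after it, then pass both to seconds_between. Note that gettimeofday() follows the system clock, so a clock change during the measurement affects the result.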


Author by

ahmet alp balkan

I am a software engineer on Twitter compute infrastructure team. Previously I've worked at Google Cloud on Kubernetes, Cloud Run and Knative, and at Microsoft Azure on various parts of the Docker open source ecosystem. Find me on my: (blog | twitter | github)

Updated on March 10, 2021

Comments

ahmet alp balkan about 3 years
I have to compute execution time of a C++ code snippet in seconds. It must be working either on Windows or Unix machines.

I use the following code to do this (with the required headers included beforehand):
```
clock_t startTime = clock();
// some code here
// to compute its execution duration in runtime
cout << double( clock() - startTime ) / (double)CLOCKS_PER_SEC<< " seconds." << endl;
```
However for small inputs or short statements such as a = a + 1, I get "0 seconds" result. I think it must be something like 0.0000001 seconds or something like that.

I remember that System.nanoTime() in Java works pretty well in this case. However I can't get same exact functionality from clock() function of C++.

Do you have a solution?
ahmet alp balkan over 14 years

But using windows.h is restricted. The same compiled source must run on both Windows and Unix. How to handle this problem?
Captain Comic over 14 years

Then look for some wrapper library stackoverflow.com/questions/1487695/…
just somebody over 14 years

the same compiled source sounds like you want to run the same binary on both systems, which doesn't seem to be the case. if you meant the same source then an #ifdef must be ok (and it is judging from the answer you have accepted), and then I don't see the problem: #ifdef WIN32 #include <windows.h> ... #else ... #endif.
Petter over 12 years

It relies on the clock() function from the C++ standard header.
davidA over 11 years

This works, but note that progress_timer is deprecated (sometime before boost 1.50) - auto_cpu_timer may be more appropriate.
Tomas Andrle over 11 years

@meowsqueak hmm, auto_cpu_timer seems to require the Boost system library to be linked, so it's no longer a header-only solution. Too bad... makes the other options more appealing all of a sudden.
davidA over 11 years

yes, that's a good point, if you don't already link Boost then it's more trouble than it's worth. But if you already do, it works quite nicely.
Tomas Andrle over 11 years

@meowsqueak Yeah, or for some quick benchmark tests, just get that older version of Boost.
niekas almost 9 years

You should add #include <sys/time.h> at the beginning of your example.
user9869932 over 8 years

I am getting this error with your c++11 solution : /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.19 not found (required by ../cpu_2d/g500)
gongzhitaao over 8 years

@julianromera What platform are you using? Did you install the libstdc++ library and g++?
user9869932 over 8 years

It's a Slurm grid running Linux (Ubuntu 12). I just got it fixed: I added -static-libstdc++ at the end of the linker command. Thank you for asking @gongzhitaao
Daniel Handojo about 8 years

For future reference: I just throw it into a header file and use it. Glad to have it.
Azmisov about 8 years

I believe the method gettimeofday can give an unintended result if the system clock is changed. If this would be a problem for you, you might want to look at clock_gettime instead.
MicroVirus about 8 years

Does this method for Windows have any advantages over GetTickCount?
Assimilater over 7 years

Does not compile using gcc -std=c99
Andreas Bonini about 7 years

@MicroVirus: yes, GetTickCount is the time elapsed since the system was started, while my function returns the time since the UNIX epoch which means you can use it for dates and times. If you are only interested in time elapsed between two events mine is still a better choice because it's an int64; GetTickCount is an int32 and overflows every 50 days meaning you can get weird results if the two events you registered are in between the overflow.
Peter Cordes about 5 years

Good answer except for disabling optimization. Benchmarking -O0 code is a big waste of time because the overhead of -O0 instead of a normal -O2 or -O3 -march=native varies wildly depending on the code and the workload. e.g. extra named tmp vars costs time at -O0. There are other ways to avoid having things optimize away, like hiding things from the optimizer with volatile, non-inline functions, or empty inline asm statements. -O0 is not even close to usable because code has different bottlenecks at -O0, not the same but worse.
Peter Cordes about 5 years

Ugh, -Og is still not very realistic, depending on the code. At least -O2, preferably -O3 is more realistic. Use asm volatile("" ::: "+r"(var)) or something to make the compiler materialize a value in a register, and defeat constant propagation through it.
Jack G about 5 years

@PeterCordes Thank you again for your insights. I have updated the content with -O3 and the code snippet with asm volatile("" ::: "+r"(var)).
Peter Cordes about 5 years

asm volatile("" ::: "+r"( i )); seems unnecessary. In optimized code, there's no reason to force the compiler to materialize i as well as i<<7 inside the loop. You're stopping it from optimizing to tmp -= 128 instead of shifting every time. Using the result of a function call is good, though, if it's non-void. Like int result = (*function_to_do)( i << 7 );. You could use an asm statement on that result.
Jack G about 5 years

@PeterCordes Thank you very much again for your insights. My post now contains the corrections for the return value from function_to_do so that function_to_do can be inlined without being eliminated. Please let me know if you have any further suggestions.
Peter Cordes about 5 years

Does this actually compile? You don't have a definition for standard_sqrt, and I don't think (int) (reinterpret_cast<char>(result)) is an lvalue. ("+r" is a read-write register operand; you might only need asm volatile("" :: "r"(input_var) ) instead of asm volatile("" : "+r"(compiler_assumes_this_is_modified) ). And BTW, I had a typo: output operands go after the first colon, clobbers after the third. So asm("" ::: "memory") is a memory barrier, but you can't put "+r"(var) there. See stackoverflow.com/tags/inline-assembly/info
Jack G about 5 years

@PeterCordes I have made the edit you suggested, but I do not know whether it compiles or not. I am slightly busy right now, so learning a new programming language like assembly will have to wait for another day. Thank you so much for that link to assembly tutorials: I shall look into it eventually.
Zheng Qu over 4 years

@TomasAndrle The link does not exist anymore.
Dharman almost 4 years

While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.