Measuring execution time of a function in C++

c++ optimization profiling

283,584

Solution 1

This is easy to do in C++11 and later: use std::chrono::high_resolution_clock from the <chrono> header. (The 150ms literal in the example below additionally requires C++14.)

Use it like so:

#include <chrono>

/* Only needed for the sake of this example. */
#include <iostream>
#include <thread>
    
void long_operation()
{
    /* Simulating a long, heavy operation. */

    using namespace std::chrono_literals;
    std::this_thread::sleep_for(150ms);
}

int main()
{
    using std::chrono::high_resolution_clock;
    using std::chrono::duration_cast;
    using std::chrono::duration;
    using std::chrono::milliseconds;

    auto t1 = high_resolution_clock::now();
    long_operation();
    auto t2 = high_resolution_clock::now();

    /* Getting number of milliseconds as an integer. */
    auto ms_int = duration_cast<milliseconds>(t2 - t1);

    /* Getting number of milliseconds as a double. */
    duration<double, std::milli> ms_double = t2 - t1;

    std::cout << ms_int.count() << "ms\n";
    std::cout << ms_double.count() << "ms\n";
    return 0;
}

This will measure the duration of the function long_operation.

Possible output:

150ms
150.068ms

Working example: https://godbolt.org/z/oe5cMd
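As the comments below point out, high_resolution_clock may be an alias for system_clock, which is not guaranteed to be monotonic. A minimal sketch of the same measurement with std::chrono::steady_clock follows. The helper name elapsed_ms is made up for this illustration, and the shorter sleep is only there to keep the example quick.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds, measured with steady_clock.
template <typename Work>
double elapsed_ms(Work&& work)
{
    using clock = std::chrono::steady_clock;  // guaranteed monotonic
    static_assert(clock::is_steady, "steady_clock must be monotonic");

    const auto t1 = clock::now();
    work();
    const auto t2 = clock::now();

    // Convert the difference to a floating-point millisecond count.
    return std::chrono::duration<double, std::milli>(t2 - t1).count();
}

// Simulated workload, in the spirit of long_operation above.
inline void long_operation()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
```

Calling elapsed_ms(long_operation) from main should report a value slightly above the sleep duration.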

Solution 2

Here's a function that measures the execution time, in nanoseconds, of any function passed as an argument:

#include <chrono>
#include <utility>

typedef std::chrono::high_resolution_clock::time_point TimeVar;

#define duration(a) std::chrono::duration_cast<std::chrono::nanoseconds>(a).count()
#define timeNow() std::chrono::high_resolution_clock::now()

template<typename F, typename... Args>
double funcTime(F func, Args&&... args){
    TimeVar t1=timeNow();
    func(std::forward<Args>(args)...);
    return duration(timeNow()-t1);
}

Example usage:

#include <iostream>
#include <algorithm>
#include <string>

typedef std::string String;

//first test function doing something
int countCharInString(String s, char delim){
    int count=0;
    String::size_type pos = s.find_first_of(delim);
    while ((pos = s.find_first_of(delim, pos)) != String::npos){
        count++;pos++;
    }
    return count;
}

//second test function doing the same thing in different way
int countWithAlgorithm(String s, char delim){
    return std::count(s.begin(),s.end(),delim);
}


int main(){
    std::cout<<"norm: "<<funcTime(countCharInString,"precision=10",'=')<<"\n";
    std::cout<<"algo: "<<funcTime(countWithAlgorithm,"precision=10",'=');
    return 0;
}

Output:

norm: 15555
algo: 2976
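The comments on this answer ask for a version without macros and for a way to get the timed function's return value. The following is one possible sketch, assuming C++14 and a callable that returns a non-void value. The names Timed and timeCall are invented for this illustration and are not part of the original answer.

```cpp
#include <chrono>
#include <utility>

// Hypothetical result type: the callable's return value plus the
// elapsed time in nanoseconds.
template <typename R>
struct Timed {
    R value;
    std::chrono::nanoseconds elapsed;
};

// Times a single call of func(args...). Works only for callables
// that return a value; no macros or global typedefs are involved.
template <typename F, typename... Args>
auto timeCall(F&& func, Args&&... args)
{
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    auto value = std::forward<F>(func)(std::forward<Args>(args)...);
    const auto stop = clock::now();

    return Timed<decltype(value)>{
        std::move(value),
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

For example, timeCall(countWithAlgorithm, "precision=10", '=') would give back both the count and the time the call took.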

Solution 3

In Scott Meyers' book I found an example of a generic lambda expression that can be used to measure function execution time (C++14).

#include <chrono>
#include <utility>

auto timeFuncInvocation =
    [](auto&& func, auto&&... params) {
        // get time before function invocation
        const auto start = std::chrono::high_resolution_clock::now();
        // function invocation using perfect forwarding
        std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
        // get time after function invocation
        const auto stop = std::chrono::high_resolution_clock::now();
        return stop - start;
     };

The problem is that you measure only one execution, so the results can vary widely. To get a reliable result, you should measure a large number of executions. According to Andrei Alexandrescu's lecture "Writing Fast Code I" at the code::dive 2015 conference:

Measured time: t_m = t + t_q + t_n + t_o

where:

t_m - measured (observed) time

t - the actual time of interest

t_q - time added by quantization noise

t_n - time added by various sources of noise

t_o - overhead time (measuring, looping, calling functions)

According to what he said later in the lecture, you should take the minimum over this large number of executions as your result. I encourage you to watch the lecture, in which he explains why.
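Assuming the noise and overhead terms in the formula above can only add time, the smallest observed measurement is the one closest to t. A rough sketch of taking that minimum is shown here. The helper name minTimeOf is hypothetical, and steady_clock is chosen only as a reasonable default.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper: runs func `runs` times and returns the
// smallest single-run duration in nanoseconds. If runs <= 0, the
// maximum representable duration is returned unchanged.
template <typename F>
std::chrono::nanoseconds minTimeOf(int runs, F&& func)
{
    using clock = std::chrono::steady_clock;

    auto best = std::chrono::nanoseconds::max();
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        func();
        const auto stop = clock::now();

        // Keep the fastest run seen so far.
        best = std::min(best,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start));
    }
    return best;
}
```

Each run is timed separately here, so the clock's own overhead and resolution are part of every sample. For very short functions it may be better to time a small batch per sample.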

There is also a very good library from Google: https://github.com/google/benchmark. It is simple to use and powerful. You can check out Chandler Carruth's lectures on YouTube, where he uses this library in practice, for example CppCon 2017: Chandler Carruth “Going Nowhere Faster”.

Example usage:

#include <chrono>
#include <iostream>
#include <vector>

auto timeFuncInvocation =
    [](auto&& func, auto&&... params) {
        constexpr int iterations = 100000; /* largeNumber */
        // get time before the loop of invocations
        const auto start = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < iterations; ++i) {
            // arguments are passed as lvalues here: forwarding an rvalue
            // more than once could hand the function a moved-from object
            func(params...);
        }
        // get time after the loop of invocations
        const auto stop = std::chrono::high_resolution_clock::now();
        // average duration of a single invocation
        return (stop - start) / iterations;
     };

void f(std::vector<int>& vec) {
    vec.push_back(1);
}

void f2(std::vector<int>& vec) {
    vec.emplace_back(1);
}
int main()
{
    std::vector<int> vec;
    std::vector<int> vec2;
    std::cout << timeFuncInvocation(f, vec).count() << std::endl;
    std::cout << timeFuncInvocation(f2, vec2).count() << std::endl;
    std::vector<int> vec3;
    vec3.reserve(100000);
    std::vector<int> vec4;
    vec4.reserve(100000);
    std::cout << timeFuncInvocation(f, vec3).count() << std::endl;
    std::cout << timeFuncInvocation(f2, vec4).count() << std::endl;
    return 0;
}
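Several comments below ask how to time a non-static member function. One approach, suggested there, is to wrap the call in a lambda that captures the object. The sketch below assumes C++14. Counter, timeOnce and timeAdd are made-up names for this illustration.

```cpp
#include <chrono>
#include <utility>

// Hypothetical class used only for this illustration.
struct Counter {
    int total = 0;
    void add(int n) { total += n; }
};

// Generic single-call timing helper in the spirit of the lambda above.
auto timeOnce = [](auto&& func, auto&&... params) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
    return std::chrono::steady_clock::now() - start;
};

// Times counter.add(n) by wrapping the member call in a lambda that
// captures the object by reference.
inline std::chrono::steady_clock::duration timeAdd(Counter& counter, int n)
{
    return timeOnce([&counter](int value) { counter.add(value); }, n);
}
```

Capturing by reference means the timed call modifies the original object. Capturing by value would time the call on a copy instead.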

EDIT: Of course, always keep in mind that your compiler may optimize away some or all of the code you are measuring. Tools like perf can be useful in such cases.
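One common way to keep the measured work from being discarded is to make its result observable, for instance by storing it in a volatile variable, as a comment below also suggests. This is only a sketch: sumTo and timeSumNs are hypothetical names, and the compiler may still simplify how the value is computed even though it can no longer drop the result.

```cpp
#include <chrono>

// Sums 0..n-1. The result is written to a volatile variable so the
// compiler cannot treat the computation as unused.
inline long long sumTo(long long n)
{
    long long sum = 0;
    for (long long i = 0; i < n; ++i) {
        sum += i;
    }
    volatile long long sink = sum;  // observable side effect
    return sink;
}

// Returns the elapsed time of one sumTo(n) call in nanoseconds.
inline long long timeSumNs(long long n)
{
    const auto start = std::chrono::steady_clock::now();
    sumTo(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Google Benchmark provides its own facilities for this purpose, which are usually preferable when that library is available.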

Solution 4

A simple program to find the time taken by a function's execution. Note that time() reports whole seconds only.

#include <iostream>
#include <ctime> // time_t
#include <cstdio>

void function()
{
     for(long int i=0;i<1000000000;i++)
     {
        // do nothing (an optimizing compiler may remove this empty loop entirely)
     }
}

int main()
{
    time_t begin, end; // time_t is a datatype to store time values.

    time(&begin); // note time before execution
    function();
    time(&end); // note time after execution

    double difference = difftime(end, begin);
    printf("time taken for function() %.2lf seconds.\n", difference);

    return 0;
}

Solution 5

Easy way for older C++, or C:

#include <time.h> // declares clock(), clock_t and CLOCKS_PER_SEC

int main() {

    clock_t start, end;

    start = clock();
    // ...code to measure...
    end = clock();

    double duration_sec = (double)(end - start) / CLOCKS_PER_SEC; // cast form valid in both C and C++
    return 0;
}

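The timing resolution is at best 1.0/CLOCKS_PER_SEC seconds. Note that clock() is specified to measure processor time used by the program, not wall-clock time.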
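To illustrate the difference between processor time and elapsed time, the sketch below measures the same call with std::clock and with std::chrono::steady_clock. Timings, measureBoth and sleepy are hypothetical names. The expectation that the CPU figure stays near zero assumes a platform such as Linux where clock() reports processor time; according to a comment below, Windows reports elapsed time instead.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Timings {
    double cpu_ms;   // processor time reported by std::clock
    double wall_ms;  // elapsed real time reported by steady_clock
};

// Measures one call of work() with both clocks.
template <typename Work>
Timings measureBoth(Work&& work)
{
    const std::clock_t c1 = std::clock();
    const auto w1 = std::chrono::steady_clock::now();
    work();
    const auto w2 = std::chrono::steady_clock::now();
    const std::clock_t c2 = std::clock();

    return {1000.0 * static_cast<double>(c2 - c1) / CLOCKS_PER_SEC,
            std::chrono::duration<double, std::milli>(w2 - w1).count()};
}

// Sleeping takes wall-clock time but needs almost no processor time.
inline void sleepy()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

Calling measureBoth(sleepy) should give a wall_ms of roughly 50 or more, while cpu_ms depends on how the platform implements clock().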


Xara

Updated on December 22, 2021

Comments

Xara over 2 years
I want to find out how much time a certain function in my C++ program takes to execute on Linux. Afterwards, I want to make a speed comparison. I saw several time functions but ended up with this one from Boost.Chrono:
```
process_user_cpu_clock, captures user-CPU time spent by the current process
```
Now, I am not clear: if I use the above function, will I get only the time the CPU spent on that function?

Secondly, I could not find any example of using the above function. Can anyone please help me with how to use it?

P.S.: Right now, I am using std::chrono::system_clock::now() to get the time in seconds, but this gives me different results every time due to varying CPU load.
- Brandon about 10 years
  
  For Linux use clock_gettime. gcc defines the other clocks as: typedef system_clock steady_clock; typedef system_clock high_resolution_clock;. On Windows, use QueryPerformanceCounter.
- northerner over 5 years
  
  Isn't this question a duplicate of this one or do the scenarios make the solutions different?
- northerner over 5 years
  
  I have two implementations of a function and would like to find which performs better.
- Peter Cordes about 4 years
  
  Very important: make sure you enable optimization. Unoptimized code has different bottlenecks than normal optimized code and does not tell you anything meaningful; see "C loop optimization help for final assignment (with compiler optimization disabled)". In general, microbenchmarking has many pitfalls, especially failing to run a warm-up loop first for CPU frequency and page faults; see "Idiomatic way of performance evaluation?" and this answer.
- Peter Cordes about 4 years
  
  See also "How would you benchmark the performance of a function?" for Google Benchmark, which avoids many of the pitfalls of rolling your own microbenchmark. Also see "Simple for() loop benchmark takes the same time with any loop bound" for more about how optimization interacts with benchmark loops, and what to do about it.
Xara about 10 years

When I use this function, the first run gave me 118440535 microseconds and the second run of the same function gave me 83221031 microseconds. Shouldn't the two measurements be equal when I am measuring the duration of that function only?
Victor about 10 years

No. The processor of your computer can be used less or more. The high_resolution_clock will give you the physical and real time that your function takes to run. So, in your first run, your CPU was being used less than in the next run. By "used" I mean what other application work uses the CPU.
Xara about 10 years

In that case, do I need to take multiple readings and average the times, since I have to do a speed comparison of three different functions?
Victor about 10 years

Yes, if you need the average time, that is a good way to get it. Take three runs and calculate the average.
Victor about 10 years

BTW, on my computer, the example I provided takes 5 thousand microseconds.
Xara about 10 years

OK. Actually, I got those readings when I used the high-resolution clock for my own program's function.
Jahid over 7 years

@RestlessC0bra : It's implementation-defined: high_resolution_clock may be an alias of system_clock (wall clock), steady_clock, or a third independent clock. See details here. For CPU time, std::clock may be used.
MikeMB about 7 years

Two macros and a global typedef, none of which save a single keystroke, are certainly nothing I'd call elegant. Also, passing a function object and perfectly forwarding the arguments separately is a bit of overkill (and in the case of overloaded functions even inconvenient) when you can just require the timed code to be put in a lambda. But well, as long as passing arguments is optional.
Jahid about 7 years

@MikeMB : timeNow() is called twice, now calculate how many keystrokes are saved, and yeah, think about readability too.
MikeMB about 7 years

You are right, my mistake. That one (out of three) does save some keystrokes. Still no reason to use macros or global typedefs for something that is only used within a single function
Jahid about 7 years

@MikeMB : Indeed it's a single function here, but if you broaden your vision a little, you can see how easy it can make measuring the times of different code segments. For example, I use those three to measure execution times of code segments inside a function or at global scope. Those macros and the typedef do come in handy.
MikeMB about 7 years

And this is a justification for violating each and every guideline about the naming of macros? You don't prefix them, you don't use capital letters, you pick a very common name that has a high probability of colliding with some local symbol and most of all: Why are you using a macro at all (instead of a function)? And while we are at it: Why are you returning the duration as a double representing nanoseconds in the first place? We should probably agree that we disagree. My original opinion stands: "This is not what I'd call elegant code".
Jahid about 7 years

@MikeMB : All caps? C++ standard itself violates that so called naming convention (e.g assert), and also where's the prefix in assert? common name? remember that you are the one that will be coding that thing, if you are worried about conflicting a yourself-defined name like fooWhatever, you should stop coding and do something else instead (because in coding you will have to define lots of names, how dangerous!!!, LOL). Your time will be better spent elsewhere. Then again, agreed to disagree.
MikeMB almost 7 years

The problem is that they are unscoped. What I'm worried about is that such macros end up in a header file that gets included (maybe indirectly, as part of a library) in my code. If you want a taste of what happens when common names are used for macros, include windows.h in a non-trivial C++ project. Regarding assert, first of all: "quod licet iovi non licet bovi" ;). Second, not all decisions in the standard library (some dating back decades) are actually considered a good idea by modern standards. There is a reason why the C++ modules designers try very hard not to export macros by default.
MikeMB almost 7 years

... In case this is not clear: My beef is not with you having something that is called e.g. timeNow() or duration, but that it is a macro, when a normal function that would obey normal lookup rules and namespaces would do just fine. And the typedef could be local to the funcTime function, as it isn't used anywhere else - or you could just use auto and not need a typedef at all.
Jahid almost 7 years

@MikeMB : Good point, making this a header would definitely be a bad idea. Though, in the end, it's just an example, if you have complex needs you gotta think about standard practices and adapt the code accordingly. For example, when writing code, I make it convenient for me when it's in the cpp file I am working right now, but when it's time to move it elsewhere I take every necessary steps to make it robust so that I don't have to look at it again. And I think that, every programmer out there who are not complete noobs think broadly when the time is due. Hope, I clarified my point :D.
MikeMB almost 7 years

@Jahid: Thanks. In that case consider my comments void and null.
Casey about 6 years

Personally, the stop() member function isn't needed because the destructor stops the timer for you.
Francis Cugler about 6 years

@Casey The design of the class doesn't necessarily need the stop function, however it is there for a specific reason. The default construct when creating the object before your test code starts the timer. Then after your test code you explicitly use the timer object and call its stop method. You have to invoke it manually when you want to stop the timer. The class doesn't take any parameters. Also if you used this class just as I've shown you will see that there is a minimal elapse of time between the call to obj.stop and its destructor.
Francis Cugler about 6 years

@Casey ... This also allows to have multiple timer objects within the same scope, not that one would really need it, but just another viable option.
user25 about 6 years

it's very inaccurate, shows only seconds, but no milliseconds
user48956 over 5 years

Interesting -- what's the benefit of using a lambda here over a function template?
Krzysztof Sommerfeld over 5 years

Main difference would be that it is a callable object but indeed you can get something very similar with variadic template and std::result_of_t.
Snowman about 5 years

Could you please post code without "using namespace" in general. It makes it easier to see what comes from where.
BugSquasher about 5 years

This is not portable. It measures processor time on Linux, and clock time on Windows.
Celdor about 5 years

This example cannot be compiled in the presented form. The error is related to "no match for operator<< ..."!
Francis Cugler about 5 years

@Celdor do you have the appropriate includes, such as <chrono>?
Celdor about 5 years

It was a missing include, can’t check it right now but I think it was <iomanip>. I was trying adding ostream, iostream etc. It was hard to find the right one!
Gillespie almost 5 years

Shouldn't this be a steady_clock? Isn't it possible high_resolution_clock could be a non-monotonic clock?
Victor over 4 years

@Gillespie std::chrono::high_resolution_clock offers the is_steady constant and you can check whether it is monotonic or not. Indeed steady_clock is always monotonic, but I guess the user can always check if the implementation of high_resolution_clock is actually monotonic or not, and choose between the two
Gunnar Bernstein over 4 years

See also example for thread sleep: en.cppreference.com/w/cpp/thread/sleep_for
RobinAtTech over 4 years

@Jahid I have a class member function with no arguments and a void return type. Could you please let me know how to use this mechanism in my case? Currently it gives a compiler error.
RobinAtTech over 4 years

@KrzysztofSommerfeld How do I do this for member functions? When I pass timing(Object.Method1), it returns the error "non-standard syntax; use '&' to create a pointer to member".
Krzysztof Sommerfeld over 4 years

timeFuncInvocation([&objectName](auto&&... args){ objectName.methodName(std::forward<decltype(args)>(args)...); }, arg1, arg2,...); or omit the & sign before objectName (then you will have a copy of the object)
T.s. Arun over 4 years

@Jahid How do you intend to get the function return value?
Jahid over 4 years

@T.s.Arun : Use a pointer to store the return value inside funcTime()
Maverick almost 4 years

The start and end times are always the same, even though I added an array of 512 elements... under Win64/Visual Studio 17.
Victor over 3 years

You should rather use something like clock_gettime and process the results within a struct timespec result. But this is a C solution rather than a C++ one.
Kai Petzke about 3 years

BTW: I recommend changing long long number to volatile long long number. Otherwise, the optimizer will likely optimize away that loop and you will get a running time of zero.
Victor about 3 years

Or compile with -O0 just for the purpose of this testing :)
Saurav about 3 years

Using class methods gives errors such as invalid usage of a non-static member function. How would you suggest coping with that?
v.chaplin almost 3 years

I'm not sure what would cause that, but if you're using C++ then best to switch over to the standard <chrono> methods.
Victor over 2 years

@BobbieE.Ray all these classes are brought via using X, not via using namespace. An older version of the answer had using namespace, but not the current one.
Bobbie E. Ray over 2 years

Just to clarify if you don't want to use using X as it can be a bit confusing, it's std::chrono::high_resolution_clock, std::chrono::duration<double, std::milli> and std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1). Thanks @Victor!
Antoine Viallon over 2 years

@Snowman using namespace std::chrono_literals; is the only way to use them (the chrono literals) and is much more readable than using std::chrono::duration<double, std::ratio<1,1>> paf {150e-3};