How to measure cpu time and wall clock time?

My manual page for clock says:

POSIX requires that CLOCKS_PER_SEC equals 1000000 independent of the actual resolution.

When I increase the number of iterations on my computer, the measured CPU time only becomes non-zero at around 100000 iterations. From the returned figures, the actual resolution appears to be 10 milliseconds.

Beware that when you optimize your code, the whole loop may disappear because sum is a dead value: nothing ever reads it. There is also nothing to stop the compiler from moving the loop across the clock calls, as there are no real dependencies between the calls and the code in between.

Let me elaborate a bit more on micro-measurements of code performance. The naive and tempting way to measure performance is indeed to add clock calls as you have done. However, since time is not a concept or side effect in C, compilers can often reorder these clock calls relative to the surrounding code. To remedy this, it is tempting to give the clock calls side effects, for example by having them access volatile variables. However, this still doesn't prevent the compiler from moving side-effect-free code across the calls; think, for example, of code that only touches regular local variables. Worse, by making the clock calls look very scary to the compiler, you actually inhibit optimizations around them. As a result, merely measuring the performance changes that performance in a negative and undesirable way.

If you use a profiler, as someone already mentioned, you can get a pretty good assessment of the performance of even optimized code, although the overall run time of course increases.

Another good way to measure performance is simply to ask the compiler to report the number of cycles some code will take. For a lot of architectures the compiler has a very accurate estimate of this. Most notably, however, it doesn't for a Pentium architecture, because the hardware does a lot of scheduling that is hard to predict.

Although it is not standard practice, I think compilers should support a pragma that marks a function to be measured. The compiler could then insert high-precision, non-intrusive measuring points in the prologue and epilogue of the function and prohibit any inlining of it. Depending on the architecture, it could choose a high-precision clock to measure time, preferably with support from the OS so that only the time of the current process is measured.

Author by

Brian Brown

B'ham fellow

Updated on July 09, 2022

Comments

  • Brian Brown
    Brian Brown almost 2 years

    I have seen many topics about this, even on Stack Overflow, for example:

    How can I measure CPU time and wall clock time on both Linux/Windows?

    I want to measure both CPU time and wall time. Although the person who answered the question in the topic I linked recommends using gettimeofday to measure wall time, I read that it's better to use clock_gettime instead. So I wrote the code below. Is it OK? Does it really measure wall time, not CPU time? I'm asking because I found a webpage, http://nadeausoftware.com/articles/2012/03/c_c_tip_how_measure_cpu_time_benchmarking#clockgettme, which says that clock_gettime measures CPU time. What's the truth, and which one should I use to measure wall time?

    My other question is about CPU time. I found an answer saying that clock is good for this, so I wrote sample code for it too. But it's not what I really want: for my code it reports 0 seconds of CPU time. Is it possible to measure CPU time more precisely (in seconds)? Thanks for any help (for now, I'm interested only in Linux solutions).

    Here's my code:

    #include <time.h>
    #include <stdio.h>      /* printf */
    #include <math.h>       /* log */
    #include <stdlib.h>
    
    int main()
    {
        int i;
        double sum = 0.0;
    
        // measure elapsed wall time
        struct timespec now, tmstart;
        clock_gettime(CLOCK_REALTIME, &tmstart);
        for(i=0; i<1024; i++){
            sum += log((double)i);
        }
        clock_gettime(CLOCK_REALTIME, &now);
        double seconds = (double)((now.tv_sec+now.tv_nsec*1e-9) - (double)(tmstart.tv_sec+tmstart.tv_nsec*1e-9));
        printf("wall time %fs\n", seconds);
    
        // measure cpu time
        double start = (double)clock() /(double) CLOCKS_PER_SEC;
        for(i=0; i<1024; i++){
            sum += log((double)i);
        }
        double end = (double)clock() / (double) CLOCKS_PER_SEC;
        printf("cpu time %fs\n", end - start);
    
        return 0;
    }
    

    Compile it like this:

    gcc test.c -o test -lrt -lm

    and it shows me:

    wall time 0.000424s
    cpu time 0.000000s
    

    I know I can run more iterations, but that's not the point here ;)

    IMPORTANT:

    printf("CLOCKS_PER_SEC is %ld\n", (long)CLOCKS_PER_SEC);
    

    shows

    CLOCKS_PER_SEC is 1000000