How to measure cpu time and wall clock time?
According to my manual page on clock, "POSIX requires that CLOCKS_PER_SEC equals 1000000 independent of the actual resolution."
When I increase the number of iterations on my computer, the measured CPU time starts showing up at 100000 iterations. From the returned figures, the actual resolution seems to be 10 milliseconds.
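If clock turns out to be too coarse, one option on Linux is to read the per-process CPU-time clock through clock_gettime. The following is a minimal sketch, assuming the system provides CLOCK_PROCESS_CPUTIME_ID (it is an optional POSIX clock); the helper names timespec_to_seconds, cpu_seconds and cpu_clock_resolution are made up for illustration. Passing CLOCK_REALTIME or CLOCK_MONOTONIC to the same call reports wall time rather than CPU time.

```c
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime under strict -std=c99/c11 */
#include <time.h>

/* Convert a timespec to seconds as a double. */
double timespec_to_seconds(struct timespec ts)
{
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* CPU time consumed so far by the calling process, in seconds.
   Returns -1.0 if the clock cannot be read. */
double cpu_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0)
        return -1.0;
    return timespec_to_seconds(ts);
}

/* Resolution the system reports for that clock, in seconds.
   Returns -1.0 if the resolution cannot be queried. */
double cpu_clock_resolution(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0)
        return -1.0;
    return timespec_to_seconds(res);
}
```

Calling cpu_seconds() before and after the code of interest and subtracting the two values gives the CPU time spent in between. cpu_clock_resolution() shows the granularity the system claims for that clock, which need not match what you observe in practice.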
Beware that when you optimize your code, the whole loop may disappear because sum is a dead value. There is also nothing to stop the compiler from moving the clock statements across the loop, as there are no real dependences with the code in between.
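To illustrate the dead-value problem, here is a small sketch in which the computed value escapes to the caller, so the optimizer cannot discard the loop as unused. The functions harmonic_sum and time_harmonic_sum are hypothetical stand-ins for the work being measured, not part of the question's code. This only addresses dead-code elimination; it does not by itself guarantee that the compiler keeps the clock calls on either side of the work.

```c
#include <time.h>

/* Sums 1/i for i = 1..n; stands in for the work being measured. */
double harmonic_sum(long n)
{
    double sum = 0.0;
    for (long i = 1; i <= n; i++)
        sum += 1.0 / (double)i;
    return sum;
}

/* Runs the work, hands the result back through *result, and returns
   the CPU seconds reported by clock() for the call. */
double time_harmonic_sum(long n, double *result)
{
    clock_t start = clock();
    double sum = harmonic_sum(n);
    clock_t end = clock();

    *result = sum; /* the value is used by the caller, so it is not dead */
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

A caller would typically print or otherwise use the value stored in *result. With a small n the returned time is likely to be 0 because of the coarse resolution of clock.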
Let me elaborate a bit more on micro-measurements of code performance. The naive and tempting way to measure performance is indeed to add clock statements as you have done. However, since time is not a concept or side effect in C, compilers can often move these clock calls at will. To remedy this, it is tempting to give the clock calls side effects, for example by having them access volatile variables. However, this still does not prohibit the compiler from moving side-effect-free code across the calls. Think, for example, of code that only accesses regular local variables. Worse, by making the clock calls look very scary to the compiler, you will actually hamper its optimizations. As a result, merely measuring the performance degrades that performance in an undesirable way.
If you use profiling, as already mentioned by someone, you can get a pretty good assessment of the performance of even optimized code, although the overall time of course is increased.
Another good way to measure performance is just asking the compiler to report the number of cycles some code will take. For a lot of architectures the compiler has a very accurate estimate of this. However most notably for a Pentium architecture it doesn't because the hardware does a lot of scheduling that is hard to predict.
Although it is not standard practice, I think compilers should support a pragma that marks a function to be measured. The compiler can then insert high-precision, non-intrusive measuring points in the prologue and epilogue of the function and prohibit any inlining of it. Depending on the architecture, it can choose a high-precision clock to measure time, preferably with support from the OS so that only the time of the current process is measured.
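No such pragma exists as far as I know, but the idea can be approximated by hand. The sketch below assumes GCC or Clang (the noinline attribute is a compiler extension) and a system that provides CLOCK_PROCESS_CPUTIME_ID; the names measurement, measured_work_stats, process_cpu_now and measured_work are invented for this example. Unlike the proposed pragma, the measuring points here are ordinary calls, so they are neither free nor immune to the reordering concerns described above.

```c
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime under strict -std=c99/c11 */
#include <time.h>

/* Accumulated statistics for one instrumented function. */
struct measurement {
    long   calls;       /* number of completed calls */
    double cpu_seconds; /* total process CPU time spent inside the function */
};

struct measurement measured_work_stats = {0, 0.0};

/* Process CPU time in seconds, or -1.0 if the clock cannot be read. */
double process_cpu_now(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* The function under measurement. noinline keeps it a real call so the
   hand-written "prologue" and "epilogue" stay attached to it. */
__attribute__((noinline))
double measured_work(long n)
{
    double t0 = process_cpu_now();             /* prologue measuring point */

    double sum = 0.0;
    for (long i = 1; i <= n; i++)
        sum += 1.0 / (double)i;

    measured_work_stats.calls++;
    measured_work_stats.cpu_seconds += process_cpu_now() - t0; /* epilogue */
    return sum;
}
```

After a run, measured_work_stats.cpu_seconds divided by measured_work_stats.calls gives a rough average CPU time per call, including the overhead of the two clock reads.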
Brian Brown almost 2 years
I have seen many topics about this, even on Stack Overflow, for example:
How can I measure CPU time and wall clock time on both Linux/Windows?
I want to measure both CPU time and wall time. Although the person who answered the question in the topic I linked recommends using gettimeofday to measure wall time, I read that it is better to use clock_gettime instead. So I wrote the code below. Is it OK? Does it really measure wall time, not CPU time? I'm asking because I found a webpage, http://nadeausoftware.com/articles/2012/03/c_c_tip_how_measure_cpu_time_benchmarking#clockgettme, which says that clock_gettime measures CPU time. What's the truth, and which one should I use to measure wall time?
Another question is about CPU time. I found an answer saying that clock is great for this, so I wrote sample code for it too. But it's not what I really want: for my code it shows 0 seconds of CPU time. Is it possible to measure CPU time more precisely (in seconds)? Thanks for any help (for now, I'm interested only in Linux solutions).
Here's my code:
    #include <time.h>
    #include <stdio.h>   /* printf */
    #include <math.h>    /* log */
    #include <stdlib.h>

    int main()
    {
        int i;
        double sum = 0.0;   /* initialized so the += below is well defined */

        // measure elapsed wall time
        struct timespec now, tmstart;
        clock_gettime(CLOCK_REALTIME, &tmstart);
        for (i = 0; i < 1024; i++) {
            sum += log((double)i);
        }
        clock_gettime(CLOCK_REALTIME, &now);
        double seconds = (double)((now.tv_sec + now.tv_nsec * 1e-9)
                         - (double)(tmstart.tv_sec + tmstart.tv_nsec * 1e-9));
        printf("wall time %fs\n", seconds);

        // measure cpu time
        double start = (double)clock() / (double)CLOCKS_PER_SEC;
        for (i = 0; i < 1024; i++) {
            sum += log((double)i);
        }
        double end = (double)clock() / (double)CLOCKS_PER_SEC;
        printf("cpu time %fs\n", end - start);

        return 0;
    }
Compile it like this:
gcc test.c -o test -lrt -lm
and it shows me:
    wall time 0.000424s
    cpu time 0.000000s
I know I can make more iterations, but that's not the point here ;)
IMPORTANT:
printf("CLOCKS_PER_SEC is %ld\n", (long)CLOCKS_PER_SEC);
shows
CLOCKS_PER_SEC is 1000000