How can I test the performance of a C function?


Solution 1

You need high-resolution timers.

On Linux, gettimeofday() is a decent choice; it gives you microsecond resolution. On Windows, QueryPerformanceCounter() is the typical equivalent. Make sure you run your function many times so you get stable readings.

Quick sample, for Linux:

#include <stdio.h>
#include <sys/time.h>

void function_to_measure(void);   /* the function you want to time */

struct timeval t0, t1;
unsigned int i;

gettimeofday(&t0, NULL);
for (i = 0; i < 100000; i++)
    function_to_measure();
gettimeofday(&t1, NULL);
printf("Did %u calls in %.2g seconds\n", i,
       t1.tv_sec - t0.tv_sec + 1E-6 * (t1.tv_usec - t0.tv_usec));

You would of course adjust the count (100,000) to match the performance of the function. It's best if the function really takes a while to run; otherwise the loop and/or the function-call overhead might dominate the measurement.
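For the Windows side mentioned above, a roughly equivalent sketch using QueryPerformanceCounter() could look like the following (a sketch only; function_to_measure() is again just a stand-in for the function under test):

#include <stdio.h>
#include <windows.h>

void function_to_measure(void);   /* stand-in for the function under test */

int main(void)
{
    LARGE_INTEGER freq, c0, c1;
    unsigned int i;

    QueryPerformanceFrequency(&freq);   /* counter ticks per second */
    QueryPerformanceCounter(&c0);
    for (i = 0; i < 100000; i++)
        function_to_measure();
    QueryPerformanceCounter(&c1);

    printf("Did %u calls in %.2g seconds\n", i,
           (double)(c1.QuadPart - c0.QuadPart) / (double)freq.QuadPart);
    return 0;
}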

Solution 2

Hello, I will give you an example and explain it:

#include <stdio.h>
#include <time.h>

int main(void)
{

    clock_t start_clk = clock();

    /*
        put any code here
    */

    printf("Processor time used by program: %lg sec.\n", \
    (clock() - start_clk) / (long double) CLOCKS_PER_SEC);

    return 0;
}

output (with the comment block left empty there is essentially nothing to measure, so the reported time is zero or very close to it): Processor time used by program: 0 sec.

time.h:

It declares clock_t, which is an arithmetic time type (you can do math on this value, as I do in the example). Basically, put any code where the comment is.

CLOCKS_PER_SEC is a macro declared in time.h; use it as the denominator to convert the value into seconds.

It is essential to cast the result to long double for two reasons:

  1. We don't know what type clock_t actually is, but we want to print it (which conversion would you put in printf?). Casting to long double means %Lg is always correct.
  2. long double is a very precise type that can represent really small values.
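To make that concrete, here is a minimal complete sketch with an arbitrary busy loop in place of the comment; the loop itself is just filler so there is something to measure:

#include <stdio.h>
#include <time.h>

int main(void)
{
    clock_t start_clk = clock();

    /* arbitrary work so the measurement is non-zero;
       volatile keeps the compiler from optimizing the loop away */
    volatile double sum = 0.0;
    long i;
    for (i = 0; i < 10000000L; i++)
        sum += i * 0.5;

    printf("Processor time used by program: %Lg sec.\n",
           (clock() - start_clk) / (long double) CLOCKS_PER_SEC);
    return 0;
}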

Solution 3

The open-source Callgrind profiler (for Linux) is a really awesome way to measure performance. Coupled with KCacheGrind, you get really great visualizations of where your time is spent.

Callgrind is part of Valgrind.
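A typical invocation looks roughly like this (myprog is just a placeholder for your own binary); Callgrind writes a callgrind.out.<pid> file that KCacheGrind can then open:

valgrind --tool=callgrind ./myprog
kcachegrind callgrind.out.*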


Solution 4

Store off the system time before you enter the function, and again after you return from it. The difference between the two is the elapsed time; compare that figure for the two implementations.
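A minimal sketch of that idea, assuming two hypothetical implementations impl_a() and impl_b() and the POSIX monotonic clock (clock_gettime(); older glibc may need -lrt):

#include <stdio.h>
#include <time.h>

void impl_a(void);   /* hypothetical implementation #1 */
void impl_b(void);   /* hypothetical implementation #2 */

/* return elapsed wall-clock seconds spent in fn() */
static double time_one(void (*fn)(void))
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);   /* time before the call */
    fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);   /* time after the call */

    return (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec);
}

int main(void)
{
    printf("impl_a: %.6f s\n", time_one(impl_a));
    printf("impl_b: %.6f s\n", time_one(impl_b));
    return 0;
}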

Solution 5

Run it (them) several million times (each) and measure the time it takes.
The one that completes faster performs better.

gprof can help :)
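Getting a report like the one below usually goes roughly like this (assuming gcc; prog is a placeholder name):

gcc -pg prog.c -o prog     # -pg adds profiling instrumentation
./prog                     # running the program writes gmon.out
gprof prog gmon.out        # prints the report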

Here's the result of gprof when I run a program of mine for 10 seconds (function names changed)

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 60.29      8.68     8.68 115471546     0.00     0.00  workalot
 39.22     14.32     5.64       46   122.70   311.32  work_b
  0.49     14.39     0.07                             inlined
  0.07     14.40     0.01       46     0.22     0.22  work_c
  0.00     14.40     0.00      460     0.00     0.00  find_minimum
  0.00     14.40     0.00      460     0.00     0.00  feedback
  0.00     14.40     0.00       46     0.00     0.00  work_a

Comments

  • Fred
    Fred over 1 year

Are there some good ways to know how a function performs in C? I would like to, for example, compare my own function to a library function.

  • Cascabel
    Cascabel over 14 years
    And of course, loop over it enough times that you don't get a difference of zero.
  • T.E.D.
    T.E.D. over 14 years
    I'd agree with this generally. However, the first iteration is liable to be a lot slower than the rest, due to caching issues. If the routine is typically done only once, rather than in a tight loop, this will give you a skewed picture. OTOH, if the routine is only done once, you shouldn't be wasting valuable time trying to profile or optimizing it either.
  • Fred
    Fred over 14 years
Thanks for the tip and example. I run macOS here, so gettimeofday() is available as well.
  • Fred
    Fred over 14 years
    Thanks pmg, I will check out gprof. I noticed that I even had it installed by default.
  • Adriaan
    Adriaan over 14 years
    This works ok if the function depends only on memory and cpu and does not change states (i.e. runs the same every time). If your function has file access, you may be fooled by the filesystem caching.
  • RLewis
    RLewis over 14 years
T.E.D. makes a couple of excellent points. The CPU cache and caching by the OS will vastly improve the performance of your function on all but the first iteration, giving you an average performance far exceeding what you get if the function is run alone or in between other functions meaty enough to replace the contents of the CPU cache. But this is probably the best simple profiling technique there is, and will still give you a ballpark good/acceptable/terrible performance figure.
  • Fred
    Fred over 14 years
That's interesting, I did not know that there were assembly instructions for this. I might have to try this as well to see how it works.
  • David Thornley
    David Thornley over 14 years
    If you mix up the calls, you'll tend to avoid the effect T.E.D. noticed. On the other hand, caching will affect all functions, and the effect might even out.
  • Fred
    Fred over 14 years
Yes, but wouldn't functions with many subsequent calls be the ones where performance is most important? Comparing a tight loop with a function call in each iteration to one single call, the performance of the single call is less likely to matter, the way I see it.
  • Fred
    Fred over 14 years
Thanks Stephen, excellent! I will try this out.
  • Stephen Canon
    Stephen Canon over 14 years
    If you hit any problems, let me know; I typed all this from memory, so I might have made an error somewhere =)