speed comparison between fgetc/fputc and fread/fwrite in C


Solution 1

fread() is not calling fgetc() to read each byte.

It behaves as if it were calling fgetc() repeatedly, but it has direct access to the buffer that fgetc() reads from, so it can directly copy a larger quantity of data.
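
A conceptual sketch of the difference (hypothetical code illustrating the idea, not any real libc's implementation; the struct and function names are made up):

    #include <stddef.h>
    #include <string.h>

    /* Simplified view of a stdio stream's internal read buffer. */
    struct stream {
        unsigned char *buf;   /* the buffer fgetc() also drains      */
        size_t pos, len;      /* bytes consumed / bytes valid in buf */
    };

    /* fgetc()-style: one function call per byte. */
    static int get_byte(struct stream *s)
    {
        return s->pos < s->len ? s->buf[s->pos++] : -1;  /* -1: refill needed */
    }

    /* fread()-style: one bulk memcpy for all the bytes already buffered. */
    static size_t get_bytes(struct stream *s, void *dst, size_t want)
    {
        size_t avail = s->len - s->pos;
        size_t n = want < avail ? want : avail;
        memcpy(dst, s->buf + s->pos, n);  /* per-chunk, not per-byte */
        s->pos += n;
        return n;  /* caller refills via read(2) when n < want */
    }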

Solution 2

You are forgetting about file buffering (inode, dentry and page caches).

Clear them before you run:

echo 3 > /proc/sys/vm/drop_caches
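
This requires root, and drop_caches discards only clean pages, so flush dirty data to disk first; the usual sequence is:

sync
echo 3 > /proc/sys/vm/drop_caches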

Backgrounder:

Benchmarking is an art. Refer to bonnie++, iozone and phoronix for proper filesystem benchmarking. Notably, bonnie++ won't allow a benchmark with a written volume of less than 2x the available system memory.

Why?

(answer: buffering effects!)

Solution 3

stdio functions fill a read buffer of size BUFSIZ, as defined in stdio.h, and make only one read(2) system call each time that buffer is drained. They do not issue an individual read(2) system call for every byte consumed -- they read large chunks. BUFSIZ is typically something like 1024 or 4096.
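
You can print your platform's value with a one-liner:

    #include <stdio.h>

    int main(void)
    {
        printf("BUFSIZ = %d\n", BUFSIZ);  /* this platform's stdio buffer size */
        return 0;
    }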

You can also adjust that buffer's size, if you wish, to increase it -- see the man pages for setbuf/setvbuf/setbuffer on most systems -- though that is unlikely to make a huge difference in performance.
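
For instance, a minimal sketch of enlarging the buffer with setvbuf (the 64 KiB size is an arbitrary choice, and the file name is reused from the question):

    #include <stdio.h>

    int main(void)
    {
        FILE *fr = fopen("img1.png", "rb");
        if (!fr)
            return 1;

        /* Replace the default BUFSIZ-sized buffer with a 64 KiB one.
           setvbuf must be called before the first I/O on the stream. */
        static char buf[64 * 1024];
        if (setvbuf(fr, buf, _IOFBF, sizeof buf) != 0)
            fprintf(stderr, "setvbuf failed\n");

        /* ... read from fr as usual; each buffer refill is now one
           larger read(2) instead of several smaller ones ... */

        fclose(fr);
        return 0;
    }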

On the other hand, as you note, you can make a read(2) system call of arbitrary size by setting that size in the call, though you get diminishing returns with that at some point.

BTW, you might as well use open(2) and not fopen(3) if you are doing things this way. There is little point in fopen'ing a file you are only going to use for its file descriptor.
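
A sketch of the copy loop done that way, with plain open(2)/read(2)/write(2) and no stdio at all (POSIX-only; file names reused from the question):

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int in  = open("img1.png", O_RDONLY);
        int out = open("img2.png", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in < 0 || out < 0)
            return 1;

        /* One read(2)/write(2) pair per 64 KiB instead of per byte.
           (A robust version would also handle partial writes.) */
        char buf[64 * 1024];
        ssize_t n;
        while ((n = read(in, buf, sizeof buf)) > 0)
            if (write(out, buf, (size_t)n) != n)
                return 1;

        close(in);
        close(out);
        return n < 0;  /* nonzero exit if the last read failed */
    }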

Solution 4

As sehe says, it's partly because of buffering, but there is more to it, and I'll explain why that is, and at the same time why fgetc() incurs more latency.

fgetc() is called once for every byte that is read from the file.

fread() is called once for every n bytes of file data that fill the local buffer.

So for a 10 MiB file:

fgetc() is called 10 485 760 times,

while fread() with a 1 KiB buffer is called only 10 240 times.

Let's say for simplicity that every function call takes 1 ms:

fgetc() would take 10 485 760 ms = 10485.76 seconds ~ 2.9127 hours

fread() would take 10 240 ms = 10.24 seconds
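
As a quick sanity check of that arithmetic:

    #include <stdio.h>

    int main(void)
    {
        long file  = 10L * 1024 * 1024;  /* 10 MiB file                 */
        long block = 1024;               /* 1 KiB fread buffer          */
        double ms  = 1.0;                /* assumed cost per call in ms */

        printf("fgetc calls: %ld -> %.2f s\n", file, file * ms / 1000.0);
        printf("fread calls: %ld -> %.2f s\n", file / block,
               (file / block) * ms / 1000.0);
        return 0;
    }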

On top of that, the OS usually reads and writes on the same device; I suppose your example does it on the same hard disk. When reading your source file, the OS moves the hard disk heads over the spinning platters seeking the file and reads 1 byte into memory, then moves the read/write head again to the place where the OS and the disk controller agreed to locate the destination file, and writes 1 byte from memory. In the above example this happens over 10 million times for each file, totaling over 20 million times; with the buffered version it happens a grand total of only about 20 000 times.

Besides that, when reading from the disk, the OS puts a few extra KiB of disk data into memory for performance purposes, and this can speed up the program even when using the less efficient fgetc(), because the program reads from the OS's memory instead of directly from the hard disk. This is what sehe's answer refers to.

Depending on your machine's configuration, load, OS, etc., your reading and writing results can vary a lot, hence his recommendation to empty the disk caches to get more meaningful results.

When the source and destination files are on different HDDs, things are a lot faster. With SSDs, I'm not really sure whether reads and writes are mutually exclusive.

Summary: every function call has a certain overhead, reading from an HDD has further overheads, and caches/buffers help speed things up.

Other info

http://en.wikipedia.org/wiki/Disk_read-and-write_head

http://en.wikipedia.org/wiki/Hard_disk#Components

Author by pratikm

Updated on July 26, 2022

Comments

  • pratikm
    pratikm almost 2 years

    So (just for fun), I was just trying to write some C code to copy a file. I read around and it seems that all the functions that read from a stream call fgetc() (I hope this is true?), so I used that function:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #define FILEr "img1.png"
    #define FILEw "img2.png"
    main()
    {
        clock_t start,diff;
        int msec;
        FILE *fr,*fw;
        fr=fopen(FILEr,"r");
        fw=fopen(FILEw,"w");
        start=clock();
        while((!feof(fr)))
            fputc(fgetc(fr),fw);
        diff=clock()-start;
        msec=diff*1000/CLOCKS_PER_SEC;
        printf("Time taken %d seconds %d milliseconds\n", msec/1000, msec%1000);
        fclose(fr);
        fclose(fw);
    }
    

    This gave a run time of 140 ms for this file on a 2.10 GHz Core 2 Duo T6500 Dell Inspiron laptop. However, when I try using fread/fwrite, I get a decreasing run time as I keep increasing the number of bytes (i.e. the variable st in the following code) transferred per call, until it levels off at around 10 ms! Here is the code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #define FILEr "img1.png"
    #define FILEw "img2.png"
    main()
    {
        clock_t start,diff;
        // number of bytes copied at each step
        size_t st=10000;
        int msec;
        FILE *fr,*fw;
        // placeholder for value that is read
        char *x;
        x=malloc(st);
        fr=fopen(FILEr,"r");
        fw=fopen(FILEw,"w");
        start=clock();
        while(!feof(fr))
         {
            fread(x,1,st,fr);
            fwrite(x,1,st,fw);
         }
        diff=clock()-start;
        msec=diff*1000/CLOCKS_PER_SEC;
        printf("Time taken %d seconds %d milliseconds\n", msec/1000, msec%1000);
        fclose(fr);
        fclose(fw);
        free(x);
    }
    

    Why is this happening? I.e., if fread is actually multiple calls to fgetc, then why the speed difference? EDIT: specified that "increasing the number of bytes" refers to the variable st in the second code

  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE over 12 years
    +1 for "as if". You might improve the answer by explaining how the "as if rule" applies across everything in the C language.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE over 12 years
    Not related to OP's question.
  • wildplasser
    wildplasser over 12 years
    Yes it is. It is likely that the OP will test both versions in a short timespan. The first run will prime the cache; the second can expect the file to be completely buffered in OS buffers, so no I/O will be needed for the operation.
  • sehe
    sehe over 12 years
    @R.. that depends a lot on how you interpret it. I don't think the question is overly clear, but I think the OP is running an increasingly large-volume benchmark yet seeing increasingly shorter runtimes. That clearly spells cache effects to me. Anyway, since FS benchmarking is hard and the OP doesn't come forward with the steps taken to prevent wrong results, I'm inclined to assume the OP doesn't know about them.
  • wildplasser
    wildplasser over 12 years
    Well, given the silly feof() usage and his preoccupation with performance, it is pretty clear to me that the OP does not know what he is doing. And the fact that R.. gets upvoted for the comment above indicates that the OP is not alone.
  • sehe
    sehe over 12 years
    @R..: Depending on what filesystem implementation there is, I think the behaviour may very well be unbuffered. I have seen some pretty pathetic fgetc benchmarks with FUSE filesystems built on top of the fuse low-level interfaces.
  • sehe
    sehe over 12 years
    @wildplasser: to be fair, the other answer is not without merit. There are many things at play. Filesystem benchmarking is hard :) So both answers are very much related to the OP's question, IMO
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE over 12 years
    The userspace libc stdio library has no relation whatsoever to FUSE or the underlying filesystem/device being accessed. Unless buffering is disabled, it will always operate in fully-buffered mode. Even if buffering is disabled, a non-pathologically-bad fread implementation will never repeatedly call fgetc but will perform a single read operation for the portion of the requested length that can't be obtained from the existing userspace buffer.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE over 12 years
    On any modern system, the file you're testing will be fully cached in memory (by the kernel) regardless of which read method is being used. OP's question is about stdio and fread versus fgetc, not filesystem cache issues.
  • Matthew Fitzpatrick
    Matthew Fitzpatrick about 12 years
    Thanks, this makes a lot of sense, and I suppose it was a little stupid of me to use fopen, haha... I did have a follow-up question to your answer though: you said you can make a read(2) system call of arbitrary size by setting that size in the call, but you would get diminishing returns. What exactly do you mean by that? How would things go wrong or become less efficient?
  • Zan Lynx
    Zan Lynx about 12 years
    @MatthewFitzpatrick: There is always a point where the cost of the I/O and the memory copy is so much greater than the cost of the system call that it makes no difference how many bytes you read.
  • Perry
    Perry about 12 years
    Reducing the number of system calls improves performance by doing things like lowering the number of context switches, but once you have reduced that number by a factor of a thousand or four thousand, the amount of improved performance goes down. Your program is probably not dominated by read system calls at that point. Things won't go wrong; it is just that performance won't improve significantly past some point -- the actual I/O will start to dominate the time taken by read(2), rather than the system call overhead, once you get past a certain size.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE about 12 years
    I think Perry meant that the difference in performance between fread and read should approach zero as the size of the block being read approaches infinity. On a good stdio implementation this is definitely true, but a bad one might always be slower with fread if it first reads the data through the buffer in many small underlying reads and copies them to the caller's buffer...
  • Perry
    Perry about 12 years
    @R.., I was more answering the "why would increasing the size of read(2) calls encounter diminishing returns". Your comment is also correct, of course -- in general, a stdio implementation will use a fixed size buffer, so an fread of a huge chunk much larger than the buffer size will not be as efficient as the comparable read(2) call.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE about 12 years
    Well, I would seriously question the viability of a stdio implementation that forces all data to go through the buffer even when it could be read directly to (or written directly from) the caller's buffer, just like I would question the viability of one that repeatedly called fgetc rather than just behaving "as if" by calling it repeatedly -- but apparently some such bad implementations do exist! :-)
  • Perry
    Perry about 12 years
    It is hard for an implementation to avoid the buffer in the general case, since part of the data needed to satisfy the read may already be in the buffer. I haven't looked lately at the implementations of fread that are out there in BSD and Linux -- you may be correct that, after copying the buffer, they satisfy the next part of the read directly into the user's buffer. Of course, the standard does not require that, but I don't know what the real practice is without looking. But we are now far afield of @MatthewFitzpatrick's original question...