Win32 Overlapped I/O - Completion routines or WaitForMultipleObjects?


Solution 1

You suggest two methods of doing overlapped I/O and ignore the third (or I'm misunderstanding your question).

When you issue an overlapped operation, a WSARecv() for example, you can specify an OVERLAPPED structure which contains an event, and you can wait for that event to be signalled to indicate that the overlapped I/O has completed. This, I assume, is your WaitForMultipleObjects() approach and, as previously mentioned, it doesn't scale well because you're limited by the number of handles that you can pass to a single WaitForMultipleObjects() call (MAXIMUM_WAIT_OBJECTS, which is 64).

Alternatively, you can pass a completion routine which is called when completion occurs. This is known as 'alertable I/O' and requires that the thread that issued the WSARecv() call is in an 'alertable' state for the completion routine to be called. Threads can put themselves in an alertable state in several ways (calling SleepEx() or the various Ex versions of the Wait functions, etc.). The Richter book that I have open in front of me says "I have worked with alertable I/O quite a bit, and I'll be the first to tell you that alertable I/O is horrible and should be avoided". Enough said IMHO.

There's a third way: before issuing the call, you associate the handle that you want to do overlapped I/O on with an I/O completion port (IOCP). You then create a pool of threads which service this completion port by calling GetQueuedCompletionStatus() in a loop. You issue your WSARecv() with an OVERLAPPED structure WITHOUT an event in it, and when the I/O completes, the completion pops out of GetQueuedCompletionStatus() on one of your I/O pool threads and can be handled there.

As previously mentioned, Vista/Server 2008 have cleaned up how IOCPs work a little and removed the problem whereby you had to make sure that the thread that issued the overlapped request continued to run until the request completed. Link to a reference to that can be found here. But this problem is easy to work around anyway; you simply marshal the WSARecv over to one of your I/O pool threads using the same IOCP that you use for completions...

Anyway, IMHO using IOCPs is the best way to do overlapped I/O. Yes, getting your head around the overlapped/async nature of the calls can take a little time at the start but it's well worth it as the system scales very well and offers a simple "fire and forget" method of dealing with overlapped operations.

If you need some sample code to get you going then I have several articles on writing IO completion port systems and a heap of free code that provides a real-world framework for high performance servers; see here.

As an aside: IMHO, you really should read "Windows Via C/C++ (PRO-Developer)" by Jeffrey Richter and Christophe Nasarre, as it deals with all you need to know about overlapped I/O and most other advanced Windows platform techniques and APIs.

Solution 2

WaitForMultipleObjects is limited to 64 handles; in a highly concurrent application this could become a limitation.

Completion ports fit better with a model of having a pool of threads all of which are capable of handling any event, and you can queue your own (non-IO based) events into the port, whereas with waits you would need to code your own mechanism.

However, completion ports, and the event-based programming model that goes with them, are a more difficult concept to work with.

I would not expect any significant performance difference, but in the end only your own measurements will reflect your usage. Note that Vista/Server 2008 changed completion ports so that the originating thread is no longer needed to complete I/O operations; this may make a bigger difference (see this article by Mark Russinovich).

Solution 3

Table 6-3 in the book Network Programming for Microsoft Windows, 2nd Edition compares the scalability of overlapped I/O via completion ports vs. other techniques. Completion ports blow all the other I/O models out of the water when it comes to throughput, while using far fewer threads.

Author by

Admin

Updated on June 15, 2022

Comments

  • Admin
    Admin almost 2 years

    I'm wondering which approach is faster, and why?

    While writing a Win32 server I have read a lot about completion ports and overlapped I/O, but I have not read anything to suggest which set of APIs yields the best results in a server.

    Should I use completion routines, or should I use the WaitForMultipleObjects API, and why?

  • Admin
    Admin about 15 years
    Do you have a reference to the change in Vista/Server2008 so I can read about it?
  • Len Holgate
    Len Holgate about 15 years
    Reference is linked to from my blog here: lenholgate.com/archives/000763.html
  • Admin
    Admin about 15 years
    WaitForMultipleObjects() can handle more than 64 objects. An extension now exists. Not sure when it came in - might be XP, might be Vista.
  • Richard
    Richard about 15 years
    @Blank: That would require a new API because of the use of constants in the API. MAXIMUM_WAIT_OBJECTS is 64, and is the max for both WaitForMultipleObjects and ...Ex (and is derived from WAIT_ABANDONED_0-WAIT_OBJECT_0).
  • kevinthompson
    kevinthompson over 13 years
    Very helpful answer. Solved my problem described at stackoverflow.com/questions/4015220/…. Thanks!
  • DangerMouse
    DangerMouse over 13 years
    Clearly shows that I/O completion has the best performance which makes sense. I'd be interested to see how scatter / gather I/O helps.
  • Len Holgate
    Len Holgate about 13 years
    Show me a WaitForMultipleObjects() based server that can service 10,000 concurrent connections and that is 'cleaner and more thread safe' than the equivalent IOCP server... Then we can profile them and see which is actually faster ;)
  • WhozCraig
    WhozCraig over 11 years
    No doubt. IOCP scales to ridiculous numbers. So much so I used it for tasks unrelated to primary I/O (work-crew thread pools,etc) =)
  • Eugene Ryabtsev
    Eugene Ryabtsev about 10 years
    About that Richter book: no, your quote is not nearly enough. Later in that same book he points out two issues he was concerned about. The first is that callback functions do not have enough contextual info to do anything meaningful to continue processing the requests, and that he had to resort to global variables. Well, this is awful. He should have passed a pointer to the problem context in hEvent of that OVERLAPPED structure; we're not on Win98 anymore, thank you very much (although it might work even then, I have not checked).
  • Eugene Ryabtsev
    Eugene Ryabtsev about 10 years
    And this "problem" is even less of a problem with socket IO where both offset fields are free for use. He also mentions some "threading issues" (you have a single-threaded application this way, with no load balancing). Well, as often as not, you can starve the system with doing IO from just one thread with no need for more (or half a thread, for that matter), you can post tasks to CPU-bound worker threads, etc. Nothing is wrong with alertable IO unless you write a dedicated server (in which case you go IOCP).
  • Lothar
    Lothar over 8 years
    Richter is a good teacher and writer but not a smart programmer. Completion routines are great tools and work well.
  • str14821
    str14821 over 6 years
    This answer is provided by a developer with solid Windows API knowledge. Take care: the third way of performing async I/O is not for beginners; I wouldn't even say it's for mid-level developers.