[libdispatch-dev] Updates regarding the status of libdispatch on Windows

Mon May 9 06:35:40 PDT 2011

> You may be interested to know that on Windows, libkqueue uses an IO
> completion port as the underlying mechanism for kqueue(). These functions are
> roughly equivalent:
> 
>    kqueue() == CreateIoCompletionPort()
>    kevent() == GetQueuedCompletionStatus()
> 
> When a native Windows event is received by one of the filters' callback
> routines, it is translated into a kevent and delivered via
> PostQueuedCompletionStatus() to the IOCP.

Yes, I have taken a look at it to see how xdispatch is implemented. I 
initially thought about taking a similar approach, and emulating the entire 
kqueue()/kevent() mechanism; it seemed over-complicated and I wasn't sure it 
would really be worthwhile. At the very least, I think you would agree that 
your code is quite a lot more complex than my OIO source code [1].

The libkqueue code does resolve one of the conceptual difficulties I had--how 
best to wed the staunchly completion-oriented IOCPs to the readiness 
notification world of select(). I couldn't immediately see any way to provide 
the right (kqueue()-like) level of efficiency whilst still using select(). The 
use of a thread pool, registered waits, and WSAEventSelect() is a neat 
solution that I assume avoids select()'s inefficiency. 

I suspect also that there are some gotchas waiting to bite you. For example, 
I believe that the kevent() timers call SetWaitableTimer() on the user's 
thread. I don't think this is safe to do; if the user's thread exits, the 
timer is automatically cancelled without changing its signalled state (at 
least, that is what the documentation claims). This means that the registered 
wait will never finish and the timer events will be lost. It's possible that 
the documentation is in error and that this is only an issue when using the 
function to deliver an APC directly, rather than using it as a waitable 
object, in which case there would be no issue.

The reason things like that are a concern is that if calls are being made from 
threadpool threads, there's no good way to control their lifetimes. The thread 
might get reaped if the pool thinks that it has too many idle threads, and 
that would ruin any timers. A similar situation exists with I/O; any I/O 
operations that are outstanding might get cancelled if the thread exits.

On the other hand, that may not matter too much, since libdispatch doesn't use 
kqueue() timers anyway.

> In addition to the platform-specific dispatch sources, I think it would be
> helpful to have a platform-agnostic libdispatch I/O subsystem. The basic idea
> would be to enqueue a block onto a dispatch queue when an I/O operation is
> completed. This is similar to IOCP on Windows, but could be implemented using
> readiness notification on Unix.

I would point you in the direction of liboio [2], which is being developed as 
an aid to producing a first-class node.js for Windows. One of its goals is to 
provide this kind of a cross-platform I/O facility, and one of the problems 
being tackled is specifically how best to construct an API that is easy to 
use, efficient, and implementable with semantics as close to identical as 
possible across a wide range of platforms. It might offer useful guidance on 
the best approach to take.

>    1. Call recv() and attempt to read from the socket.
>    2a. If the return value is positive, add the block to the dispatch queue,
> and return.
>    2b. If the return value is negative and errno is EAGAIN, then create a
> oneshot EVFILT_READ kevent for the socket descriptor. Associate this with a
> block that will perform the recv(), and then enqueue the user-supplied block.
> 
> On Windows, dispatch_recv() would be implemented like this:
> 
>    1. Add a oneshot EVFILT_IOCP kevent with the SOCKET as the ident.
>    2. Set the handler to call the user-defined block when the kevent is
> received.
>    3. Call WSARecv() using overlapped IO.

Is this enough to provide a real ability to write cross-platform code? It feels 
to me like the POSIX code could produce truncated reads in non-error 
circumstances, whereas the overlapped I/O will generally give a truncated I/O 
only if there is an error.
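
To make the POSIX half of that concern concrete, here is a minimal sketch of 
the readiness-style receive described in the quoted steps. It is not code from 
libdispatch or libkqueue: recv_when_ready(), recv_full() and demo_short_read() 
are hypothetical names, and poll() stands in for the oneshot EVFILT_READ 
kevent so that the sketch builds on systems without kqueue(). It says nothing 
about what WSARecv() actually does; it only shows that a single recv() can 
come back short without any error, so a caller wanting "whole buffer or 
error" semantics has to loop.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try recv() first and wait for readiness only on
 * EAGAIN. poll() is an assumed stand-in for a oneshot EVFILT_READ kevent. */
static ssize_t recv_when_ready(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0)
            return n;                 /* data (possibly short) or EOF */
        if (errno == EINTR)
            continue;                 /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* genuine error */

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}

/* Hypothetical helper: loop until len bytes have arrived, EOF, or error. */
static ssize_t recv_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv_when_ready(fd, (char *)buf + got, len - got);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                    /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Demo: the receiver asks for 8 bytes while only 4 are available.
 * Returns 0 if the behaviour matches the description above. */
static int demo_short_read(void)
{
    int sv[2];
    char buf[8];
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    fcntl(sv[0], F_SETFL, fcntl(sv[0], F_GETFL) | O_NONBLOCK);

    /* Only 4 bytes are queued, so a request for 8 comes back short. */
    if (send(sv[1], "abcd", 4, 0) == 4) {
        ssize_t first = recv_when_ready(sv[0], buf, sizeof buf);
        ok = (first == 4 && memcmp(buf, "abcd", 4) == 0);
    }

    /* With 8 bytes queued, the looping variant fills the whole buffer. */
    if (ok && send(sv[1], "efgh", 4, 0) == 4 && send(sv[1], "ijkl", 4, 0) == 4) {
        ssize_t full = recv_full(sv[0], buf, sizeof buf);
        ok = (full == 8 && memcmp(buf, "efghijkl", 8) == 0);
    } else {
        ok = 0;
    }

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Whether a cross-platform dispatch_recv() should expose the short read or hide 
it behind something like recv_full() is exactly the kind of semantic choice 
that would need to be pinned down on both platforms.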

> Here's an example of how someone might use dispatch_read() to read some data
> from STDIN.

If memory serves, you can't perform overlapped I/O on Windows' console handles, 
so this might actually be rather hard to achieve--you have to do blocking reads 
in background threads instead.
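
As a rough illustration of that fallback, the sketch below does a blocking 
read on a worker thread and then invokes a completion callback. Everything 
here is an assumption made for illustration: struct bg_read, bg_read_start() 
and the callback signature are hypothetical, and POSIX threads with a pipe 
stand in for a Windows thread reading a console handle. A real libdispatch 
version would presumably enqueue a block on a dispatch queue rather than call 
the callback directly on the worker.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical completion callback: called once the blocking read returns. */
typedef void (*read_done_fn)(void *ctx, const char *data, ssize_t n);

/* Hypothetical request record; not part of any libdispatch API. */
struct bg_read {
    int          fd;
    char         buf[256];
    read_done_fn done;
    void        *ctx;
    pthread_t    thread;
};

/* Worker thread body: only this thread blocks in read(). */
static void *bg_read_main(void *arg)
{
    struct bg_read *r = arg;
    ssize_t n = read(r->fd, r->buf, sizeof r->buf);
    r->done(r->ctx, r->buf, n);       /* deliver the "completion" */
    return NULL;
}

/* Start a background read; returns 0 on success like pthread_create(). */
static int bg_read_start(struct bg_read *r, int fd,
                         read_done_fn done, void *ctx)
{
    r->fd = fd;
    r->done = done;
    r->ctx = ctx;
    return pthread_create(&r->thread, NULL, bg_read_main, r);
}

/* Demo plumbing: the callback copies the result into this record. */
struct result {
    char    data[256];
    ssize_t n;
};

static void on_read(void *ctx, const char *data, ssize_t n)
{
    struct result *res = ctx;
    res->n = n;
    if (n > 0)
        memcpy(res->data, data, (size_t)n);
}

/* Demo: a pipe stands in for the console. Returns the byte count that the
 * callback reported, or -1 on setup failure. */
static ssize_t demo_background_read(void)
{
    int p[2];
    struct bg_read req;
    struct result res = { .n = -1 };

    if (pipe(p) != 0)
        return -1;
    if (bg_read_start(&req, p[0], on_read, &res) != 0)
        return -1;
    if (write(p[1], "hello", 5) != 5)
        return -1;

    pthread_join(req.thread, NULL);   /* wait until the callback has run */
    close(p[0]);
    close(p[1]);
    return res.n;
}
```

The caller never blocks in read() itself; the cost is one thread per 
outstanding read, which is tolerable for a console but would not scale the 
way an IOCP does.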

Peter

[1]: https://github.com/DrPizza/libdispatch/blob/master/libdispatch/src/queue_kevent.c#L100
[2]: https://github.com/joyent/liboio