libdispatch for Win32
So, I've got the basics working tolerably well. Is this something that people care about/want source for?

What I've done so far is as follows:

Get most of the code building properly in VC++ 2010:
* Replace C99 named initializers with old-fashioned aggregate initializers.
* Replace gcc typeof with real type names.
* Replace gcc's fancy macros with less fancy standard C89 ones.
* VC++ has no equivalent (AFAIK) to gcc's transparent_union, so insert casts as necessary.
* Minor bits and pieces like getting rid of the ?: gccism.
* Provide minimal Win32 equivalents to missing UNIX headers that seem necessary.

Blocks:
* Only Microsoft is in a position to produce built-in block support for VC++ and I'm sure as hell not going to write a source-source translator. Instead, I have a C++ lambda wrapper that works in conjunction with the _f function variants. This seems more than enough for most purposes.

Port pthread_workqueues to Win32:
* Built on top of "new-style" (Vista and up) Win32 threadpools.
* Reasonably complete.
* Reasonably inadequately tested.

Rework dispatch_sources:
* The Windows overlapped I/O model is better than the traditional UNIX one, but doesn't readily support:
    DISPATCH_SOURCE_TYPE_READ
    DISPATCH_SOURCE_TYPE_WRITE
* And Windows in general doesn't have any good analogues to:
    DISPATCH_SOURCE_TYPE_VNODE
    DISPATCH_SOURCE_TYPE_SIGNAL
    DISPATCH_SOURCE_TYPE_PROC
    DISPATCH_SOURCE_TYPE_MACH_RECV
    DISPATCH_SOURCE_TYPE_MACH_SEND
* But what I do have instead is initial support for:
    DISPATCH_SOURCE_TYPE_OIO (for "overlapped I/O")
    example: https://gist.github.com/938097
* Overlapped I/O supports files, sockets, named pipes, and more. All of these need testing.
* The loss of READ/WRITE/SIGNAL/MACH_* is no big deal on Windows, as they don't really fit into Win32 anyway. Only one PROC feature (EXIT) translates into Win32, and I'm not seeing any clearly compelling reason to replicate it, as it doesn't seem especially useful. However, the loss of VNODE is unfortunate, as it has interesting features. But this may not be fatal. ReadDirectoryChangesW supports overlapped I/O, so should plumb into my existing DISPATCH_SOURCE_TYPE_OIO with no changes anyway, though the interface will not be quite as tidy.
* I still need to test:
    DISPATCH_SOURCE_TYPE_DATA_ADD
    DISPATCH_SOURCE_TYPE_DATA_OR
* I need to fix:
    DISPATCH_SOURCE_TYPE_TIMER
  I ripped out part of the machinery to make the code clearer temporarily; now I need to add it back.
* I also need to examine the ins and outs of cancellation, suspension, and so on. Win32 doesn't allow a handle to be detached from an IOCP except by closing the handle, so there are some sadnesses there. I'm not sure how much impact they'll have in practice, possibly none.
* I am toying with the idea of something along the lines of DISPATCH_SOURCE_TYPE_WAITABLE, which would perform a wait on any Win32 waitable object (so mutex, event, waitable timer handles, amongst others) and dispatch a message to a queue when that wait is satisfied. This would give us back the one PROC scenario that makes sense in Win32, too, as you can wait on a process handle, and the wait is satisfied when the process terminates. Due to the annoying traits of WaitForMultipleObjects (it's limited to 64 HANDLEs), however, this might be a little awkward to implement without moving to a rather wasteful thread-per-wait model.
* There are almost certainly memory leaks, bugs, etc. My focus has been on validating the general approach more than writing a bunch of test cases.

Main queues:
* From my understanding of Mac OS X, Windows has no real meaningful equivalent of Cocoa's blessed main queue. Any thread can have a message pump and associated windows, which it's then responsible for drawing, etc. Processes have an M:N model (M threads, controlling N windows), with each window being affinitized to its owning thread.
* However, the ability to post a message back into a window's message loop is obviously invaluable, so I want to create as close a workalike as makes sense in Win32. Something that captures the spirit, if not the exact same API. I've not yet written any code for this, but my plan of action is to do something along these lines:
    1) Allow creation of serial queues bound to an HWND or HWNDs. Callbacks posted to these queues will be pumped into the WndProc one-by-one.
    2) Either a WndProc hook or a helper function (or both) to respond to the callbacks posted to the WndProc and execute them.
    3) Possibly some convenience helpers to allow the retrieval of a queue given an HWND and so on.

Distant future:
* There would be certain benefits to ripping out the pthread_workqueues and using the new-style Win32 threadpools directly. The Win32 threadpools directly support timers, so they might allow DISPATCH_SOURCE_TYPE_TIMER to be moved off the dispatch_mgr queue/thread. Likewise, they directly support overlapped I/O, so might allow DISPATCH_SOURCE_TYPE_OIO to be moved off the dispatch_mgr thread too. They also directly support waits on handles, which would greatly simplify DISPATCH_SOURCE_TYPE_WAITABLE (if I do indeed go down that route). So the advantages would be many--but I am wary of diverging too far from the existing source, which is why thus far I've implemented pthread_workqueues instead; it was the easy solution.
* Going in the opposite direction, some might prefer switching to Windows 2000-style thread pools, so as to support Windows XP instead. This would work (and I think someone on the list mentioned that they had implemented pthread_workqueue on that API already), but it also means eliminating the possibility of the streamlined implementations described above.

Source code:
My plan was to dump it into my github, if people find the whole thing interesting, though I was going to wait until I'd fixed timer sources, since they're rather important.

Peter
Hey Peter,

It is great to discover so much interest in porting libdispatch to Windows lately. As you might have read while browsing the archives of this mailing list, I am working on a Win32 port as well, and have already done much the same as you did. Thanks to the help of Brent Fulgham we can build on MSVC as well. The idea of using C++0x lambdas as a workaround for the missing blocks support occurred to me too; it seems the similarities between our two ports are not going to end soon.

As such I would consider it odd if we spent time and energy (as has already happened far too much) on maintaining and developing two Windows variants of libdispatch. I'd love to merge our two source trees; just have a look at mine by going to http://opensource.mlba-team.de/svn/xdispatch/trunk/core/ or opensource.mlba-team.de/xdispatch for more extensive documentation. I, too, have concurrent and serial queues working and I am currently fixing the timers on Windows.

Please note my annotations to your ideas below.

Sincerely,
Marius
On Sat, 23 Apr 2011 03:07:19 +0000, DrPizza wrote:
So, I've got the basics working tolerably well. Is this something that people care about/want source for?
What I've done so far is as follows:
Get most of the code building properly in VC++ 2010:
* Replace C99 named initializers with old-fashioned aggregate initializers.
* Replace gcc typeof with real type names.
* Replace gcc's fancy macros with less fancy standard C89 ones.
* VC++ has no equivalent (AFAIK) to gcc's transparent_union, so insert casts as necessary.
* Minor bits and pieces like getting rid of the ?: gccism.
* Provide minimal Win32 equivalents to missing UNIX headers that seem necessary.
Please see the shims folder within my source tree.
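To make the kind of rewrite in the list above concrete, here is a minimal sketch of three gcc-isms next to portable replacements. The struct, the function names, and the values are hypothetical and are not taken from libdispatch or from either port; the gcc forms appear only in comments.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct for illustration; not a libdispatch type.
struct queue_attr {
    int priority;
    int overcommit;
};

// gcc/C99:  struct queue_attr a = { .priority = 2, .overcommit = 0 };
// Portable: positional aggregate initializer, values in declaration order.
static const queue_attr default_attr = { 2, 0 };

// gcc:      return label ?: "unnamed";
// Portable: write the condition out in full.
inline const char* label_or_default(const char* label) {
    return label ? label : "unnamed";
}

// gcc:      #define ROUND_UP(n, a) \
//               ({ typeof(n) n_ = (n); ((n_ + (a) - 1) / (a)) * (a); })
// Portable: a plain function with the type named explicitly.
inline std::size_t round_up(std::size_t n, std::size_t align) {
    assert(align != 0);
    return ((n + align - 1) / align) * align;
}
```

One detail to watch when removing `?:`: the gcc form evaluates its left operand once, while the spelled-out conditional evaluates it twice, so an operand with side effects needs a temporary.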
Blocks:
* Only Microsoft is in a position to produce built-in block support for VC++ and I'm sure as hell not going to write a source-source translator. Instead, I have a C++ lambda wrapper that works in conjunction with the _f function variants. This seems more than enough for most purposes.
I moved my lambda implementation into xdispatch, in order to keep libdispatch a pure C library. It really seems to work well.
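For readers unfamiliar with the approach, a lambda wrapper over a context-plus-function-pointer API can be sketched roughly as follows. This is not the code from either port: `fake_dispatch_f`, `run_and_free`, and `dispatch_lambda` are hypothetical names, and the stand-in entry point runs the work immediately on the calling thread instead of queueing it, purely so the sketch compiles and runs without libdispatch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

typedef void (*work_fn)(void*);

// Stand-in for a "_f" entry point that takes a context pointer and a
// plain function. ASSUMPTION: no queue argument, and the work runs
// synchronously; a real "_f" call would enqueue it instead.
inline void fake_dispatch_f(void* context, work_fn work) {
    assert(work != nullptr);
    work(context);
}

// Trampoline with the plain C signature. It takes ownership of the
// heap-allocated callable, runs it once, and frees it afterwards.
template <typename F>
void run_and_free(void* context) {
    std::unique_ptr<F> callable(static_cast<F*>(context));
    (*callable)();
}

// Copies the lambda to the heap so it outlives the caller's stack
// frame, then hands it to the "_f" function via the trampoline.
template <typename F>
void dispatch_lambda(F f) {
    fake_dispatch_f(new F(std::move(f)), &run_and_free<F>);
}
```

With this in place, `dispatch_lambda([&] { counter += 2; });` runs the lambda body through the function-pointer interface. Capturing by reference is only safe here because the stand-in is synchronous; with a truly asynchronous queue, captures would normally need to be by value.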
Port pthread_workqueues to Win32:
* Built on top of "new-style" (Vista and up) Win32 threadpools.
* Reasonably complete.
* Reasonably inadequately tested.
Is there a reason for you not using the already ported libpthread_workqueue of Mark Heily? By using the older style threadpool you can achieve broader compatibility, as Windows XP still seems quite common among users.
Rework dispatch_sources:
* The Windows overlapped I/O model is better than the traditional UNIX one, but doesn't readily support:
    DISPATCH_SOURCE_TYPE_READ
    DISPATCH_SOURCE_TYPE_WRITE
* And Windows in general doesn't have any good analogues to:
    DISPATCH_SOURCE_TYPE_VNODE
    DISPATCH_SOURCE_TYPE_SIGNAL
    DISPATCH_SOURCE_TYPE_PROC
    DISPATCH_SOURCE_TYPE_MACH_RECV
    DISPATCH_SOURCE_TYPE_MACH_SEND
* But what I do have instead is initial support for:
    DISPATCH_SOURCE_TYPE_OIO (for "overlapped I/O")
    example: https://gist.github.com/938097 [1]
* Overlapped I/O supports files, sockets, named pipes, and more. All of these need testing.
* The loss of READ/WRITE/SIGNAL/MACH_* is no big deal on Windows, as they don't really fit into Win32 anyway. Only one PROC feature (EXIT) translates into Win32, and I'm not seeing any clearly compelling reason to replicate it, as it doesn't seem especially useful. However, the loss of VNODE is unfortunate, as it has interesting features. But this may not be fatal. ReadDirectoryChangesW supports overlapped I/O, so should plumb into my existing DISPATCH_SOURCE_TYPE_OIO with no changes anyway, though the interface will not be quite as tidy.
* I still need to test:
    DISPATCH_SOURCE_TYPE_DATA_ADD
    DISPATCH_SOURCE_TYPE_DATA_OR
* I need to fix:
    DISPATCH_SOURCE_TYPE_TIMER
  I ripped out part of the machinery to make the code clearer temporarily; now I need to add it back.
* I also need to examine the ins and outs of cancellation, suspension, and so on. Win32 doesn't allow a handle to be detached from an IOCP except by closing the handle, so there are some sadnesses there. I'm not sure how much impact they'll have in practice, possibly none.
* I am toying with the idea of something along the lines of DISPATCH_SOURCE_TYPE_WAITABLE, which would perform a wait on any Win32 waitable object (so mutex, event, waitable timer handles, amongst others) and dispatch a message to a queue when that wait is satisfied. This would give us back the one PROC scenario that makes sense in Win32, too, as you can wait on a process handle, and the wait is satisfied when the process terminates. Due to the annoying traits of WaitForMultipleObjects (it's limited to 64 HANDLEs), however, this might be a little awkward to implement without moving to a rather wasteful thread-per-wait model.
* There are almost certainly memory leaks, bugs, etc. My focus has been on validating the general approach more than writing a bunch of test cases.
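On the 64-handle limit mentioned in the list above: one possible middle ground between a single wait and a thread per wait is to split the handles into batches of at most 64 and give each batch its own waiter thread. The sketch below shows only the batching arithmetic. It is not something either port is stated to do, the function names are hypothetical, and the constant is repeated locally (Win32 defines MAXIMUM_WAIT_OBJECTS as 64) so that the code compiles without windows.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors Win32's MAXIMUM_WAIT_OBJECTS (64); redefined here so the
// sketch builds on any platform.
const std::size_t kMaxWaitObjects = 64;

// Number of waiter threads needed if each thread waits on one batch.
inline std::size_t waiter_threads_needed(std::size_t handle_count) {
    return (handle_count + kMaxWaitObjects - 1) / kMaxWaitObjects;
}

// Splits the handle list into consecutive batches of at most 64
// entries, each small enough for one WaitForMultipleObjects call.
template <typename Handle>
std::vector<std::vector<Handle> > split_into_wait_batches(
        const std::vector<Handle>& handles) {
    std::vector<std::vector<Handle> > batches;
    for (std::size_t begin = 0; begin < handles.size();
         begin += kMaxWaitObjects) {
        const std::size_t end =
            std::min(begin + kMaxWaitObjects, handles.size());
        batches.push_back(std::vector<Handle>(handles.begin() + begin,
                                              handles.begin() + end));
    }
    assert(batches.size() == waiter_threads_needed(handles.size()));
    return batches;
}
```

A real implementation would probably also want to reserve one slot per thread for a wake-up event, so that handles can be added or removed while the thread is blocked; that is ignored here.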
Did you re-implement your own version of kqueues or completely exchange the kevent etc. calls within your source code? DISPATCH_SOURCE_TYPE_OIO sounds interesting, as Mark and I already discussed a similar solution "as the way to go" on Windows.
Main queues:
* From my understanding of Mac OS X, Windows has no real meaningful equivalent of Cocoa's blessed main queue. Any thread can have a message pump and associated windows, which it's then responsible for drawing, etc. Processes have an M:N model (M threads, controlling N windows), with each window being affinitized to its owning thread.
* However, the ability to post a message back into a window's message loop is obviously invaluable, so I want to create as close a workalike as makes sense in Win32. Something that captures the spirit, if not the exact same API. I've not yet written any code for this, but my plan of action is to do something along these lines:
    1) Allow creation of serial queues bound to an HWND or HWNDs. Callbacks posted to these queues will be pumped into the WndProc one-by-one.
    2) Either a WndProc hook or a helper function (or both) to respond to the callbacks posted to the WndProc and execute them.
    3) Possibly some convenience helpers to allow the retrieval of a queue given an HWND and so on.
Distant future:
* There would be certain benefits to ripping out the pthread_workqueues and using the new-style Win32 threadpools directly. The Win32 threadpools directly support timers, so they might allow DISPATCH_SOURCE_TYPE_TIMER to be moved off the dispatch_mgr queue/thread. Likewise, they directly support overlapped I/O, so might allow DISPATCH_SOURCE_TYPE_OIO to be moved off the dispatch_mgr thread too. They also directly support waits on handles, which would greatly simplify DISPATCH_SOURCE_TYPE_WAITABLE (if I do indeed go down that route). So the advantages would be many--but I am wary of diverging too far from the existing source, which is why thus far I've implemented pthread_workqueues instead; it was the easy solution.
* Going in the opposite direction, some might prefer switching to Windows 2000-style thread pools, so as to support Windows XP instead. This would work (and I think someone on the list mentioned that they had implemented pthread_workqueue on that API already), but it also means eliminating the possibility of the streamlined implementations described above.
I have to disagree. By using RegisterWaitForSingleObject on a timer handle you can easily achieve similar behaviour using the "old" thread pool API without needing an additional manager thread.
Source code: My plan was to dump it into my github, if people find the whole thing interesting, though I was going to wait until I'd fixed timer sources, since they're rather important.
Peter
That would be interesting although I hope we can merge our efforts within the near future.
Links:
------
[1] https://gist.github.com/938097
Is there a reason for you not using the already ported libpthread_workqueue of Mark Heily? By using the older style threadpool you can achieve broader compatibility, as Windows XP still seems quite common among users.
My main reasons for using the new threadpool API:
1) I actually did the work a year ago; I then went on holiday and on my return got distracted by other projects. At the time, I could find no reasonable Win32 implementation.
2) It can be used robustly, whereas the old one cannot; the old one provides no way of properly handling out-of-resource situations.
3) It allows multiple pools per process, which allows libdispatch's pools to be relatively isolated from any others that the application might create. This seems to reduce the possibility of surprises.
4) The old threadpool API lacks any effective way of tidying up. In particular, callbacks cannot safely perform such tasks as unloading the DLL they are running from, and there is no way to ensure that every callback is safely executed or deallocated.
5) There did not seem to be any obvious way to implement e.g. pthread_workqueue_removeitem_np using old-style threadpools.
6) Timer queue timers have no leeway facility.
7) The first project that I wish to use GCD in is already incompatible with Windows XP, so compatibility with that decade-old platform was of no value to me.
The new threadpool API seems to me to be a much better basis for development.
That is not to say that the old API has no merits:
1) It's much simpler.
2) BindIoCompletionCallback is a nicer mechanism; new-style threadpools require StartThreadpoolIo to be called prior to every single overlapped I/O operation, which I find unacceptably invasive.
Did you re-implement your own version of kqueues or completely exchange the kevent etc. calls within your source code? DISPATCH_SOURCE_TYPE_OIO sounds interesting, as Mark and I already discussed a similar solution "as the way to go" on Windows.
I have replaced kqueues with an I/O Completion Port mechanism. I actually retain 'struct kevent', because it's fairly thoroughly embedded in the dispatch_source mechanics, and is useful simply as a tuple of information relevant to each dispatch_source; there seemed little value in replacing it with something neutral.
I have to disagree. By using RegisterWaitForSingleObject on a timer handle you can easily achieve similar behaviour using the "old" thread pool API without needing an additional manager thread.
If I am to retain the overall form of current libdispatch--which means abstracting away the true threadpool API and instead calling pthread_workqueue functions--then I am disinclined to write code that directly uses the underlying threadpool through the back door. And if I were to strip out pthread_workqueues entirely then I would not want to use RegisterWaitForSingleObject due to my aforementioned dislike for the old threadpool API.
That would be interesting although I hope we can merge our efforts within the near future.
Certainly, there's no point in duplicating effort if we decide to take similar paths.
FYI, my code is available at: https://github.com/DrPizza/libdispatch
Replying to myself...
Rework dispatch_sources:
* The Windows overlapped I/O model is better than the traditional UNIX one, but doesn't readily support:
    DISPATCH_SOURCE_TYPE_READ
    DISPATCH_SOURCE_TYPE_WRITE
* But what I do have instead is initial support for:
    DISPATCH_SOURCE_TYPE_OIO (for "overlapped I/O")
    example: https://gist.github.com/938097
It was remiss of me to fail to point out that for sockets, specifically, Windows does of course allow use of select(), which could be used to provide READ/WRITE support. I could reinstate this (though it will need to go on a different thread; one to dequeue from the IOCP, one to call select), but select() is a poor API--O(n) performance, amongst other things--so this has not been a priority for me.

Overlapped I/O has been my priority as it provides a uniform approach across both files and sockets, and unlike calling select() on FDs in UNIX, it is actually useful for local disks. select() called against disk FDs will always return immediately, because a file is basically always deemed to be readable and writeable (unless at EOF or similar), even if those reads and writes will, in fact, block. Overlapped I/O allows true non-blocking operations.

Going the other way, one could envisage a DISPATCH_SOURCE_TYPE_AIO, for FreeBSD's AIO. This would work in a manner broadly similar to my DISPATCH_SOURCE_TYPE_OIO; in fact, I think the user interfaces for the two could be made similar, with dispatch_source_get_data returning the OVERLAPPED on Windows and aiocb on FreeBSD. The big difference is that I believe that each and every aiocb used to perform an AIO operation needs to be registered with the kqueue, so there would need to be an API to perform this registration on a call-by-call basis, in contrast to the Windows approach where I can simply register the _HANDLE_ against the IOCP.

Alas, AIO isn't available on Mac OS X. I don't immediately see any effective way to do non-blocking file I/O on that platform, but perhaps I am overlooking something.

Peter
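The point about select() and disk files can be checked with a small POSIX-only sketch (it is not Windows code and is not part of either port). It polls two descriptors with a zero timeout: an empty temporary regular file, which select() reports as readable even though there is nothing to read, and the read end of an empty pipe, which it does not. The helper names are hypothetical.

```cpp
#include <cassert>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

// Polls fd with a zero timeout and reports whether select() marks it
// readable. Never blocks.
inline bool select_says_readable(int fd) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    timeval no_wait = {0, 0};
    const int n = select(fd + 1, &readable, NULL, NULL, &no_wait);
    return n == 1 && FD_ISSET(fd, &readable);
}

// An empty regular file: there is no data, yet select() still says
// "readable", because regular files always select as ready.
inline bool empty_temp_file_is_readable() {
    FILE* f = tmpfile();
    if (f == NULL) return false;  // could not create the temp file
    const bool ready = select_says_readable(fileno(f));
    fclose(f);
    return ready;
}

// An empty pipe with its write end still open: nothing to read, and
// select() correctly reports it as not readable.
inline bool empty_pipe_is_readable() {
    int fds[2];
    if (pipe(fds) != 0) return false;
    const bool ready = select_says_readable(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

The contrast is the useful part: readiness notification works for pipes and sockets but tells you nothing about whether a disk read will block, which is the gap that overlapped I/O (or AIO) is meant to fill.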
participants (2): DrPizza, Marius Zwicker