[libdispatch-dev] Fully functional libdispatch port for Windows and Linux
john.engelhart at gmail.com
Sat Feb 19 16:13:07 PST 2011
On Fri, Feb 11, 2011 at 1:24 PM, Kevin Van Vechten <kvv at apple.com> wrote:
> Hi Marius,
> It's great that you have enough interest in the libdispatch project to work
> on a port to Linux and Win32. If you weren't aware, we already have Linux
> and partial Win32 support in the primary libdispatch source repository. I
> would encourage you to contribute modifications to the primary project
> rather than attempt a complete rewrite.
> I reviewed some of the sources of "libxdispatch" and I can see that this
> does approximate the libdispatch API at a high level, but it's important to
> keep in mind some of the aspects of the libdispatch implementation that set
> it apart — specifically thread safety and wait-free algorithms — which allow
> it to scale well with large numbers of cores and heavy load. I noted several
> places in the libxdispatch/core/queue.c implementation where thread safety
> was questionable and other cases where simple spin locks (waiting) were in
> use.
> I'd be happy to answer any specific questions you may have about the
> libdispatch implementation or scalability goals.
There's one part of this message that caught my eye, and I'm particularly
curious about:
> but it's important to keep in mind some of the aspects of the libdispatch
> implementation that set it apart — specifically thread safety and wait-free
> algorithms — which allow it to scale well with large numbers of cores and
> heavy load.
Specifically, the "wait-free algorithms" part. To be clear, "wait-free" may
mean different things to different people, but in high concurrency circles,
"wait-free" tends to have a very specific meaning. See the wikipedia
article http://en.wikipedia.org/wiki/Non-blocking_algorithm for the
definition(s) I'm using.
Is libdispatch truly "wait-free"? I looked at the code when it first came
out, and nothing in particular struck me as rising to the level of
"wait-free". Obstruction-free, sure. "Practical" lock-free, most likely.
Wait-free? I don't see how that's possible. From the wikipedia article:
> An algorithm is wait-free if every operation has a bound on the number of
> steps the algorithm will take before the operation completes.
This basically eliminates anything that uses a CAS or LL/SC primitive with
"loop and retry on failure" logic: there is no guarantee that the operation
will ever complete. If I recall correctly, libdispatch uses this technique
to implement its lockless data structures. Even ensuring (for some very
strong definition of ensuring, like provably) "lock-free" behavior with this
kind of logic is extremely difficult, hence the informal term "practical"
lock-free. In practice, truly lock-free and wait-free almost requires
access to a Verilog description (or equivalent) of all the components and
doing very low level simulation to ensure all possible permutations are
shown to "work". As you can imagine, this just isn't practical for a
modern, out-of-order, superscalar CPU, let alone everything it connects to.
To the best of my knowledge, the x86 architecture is one of the few out
there that has a CAS operation that ensures "at least one succeeds" semantics
(which I believe is largely due to the requirement to be "mostly" backwards
compatible all the way back to 8086, particularly on bus transactions).
Most other architectures do not. In fact, many implementations of LL/SC can
result in a live-lock condition in extremely rare circumstances (it's
happened on IBM big iron, for example).
Is this just a case where a better choice of words would have been more
appropriate? Because a truly wait-free implementation would be quite a
coup, and I'd appreciate any pointers to the details of how you did this.