[libdispatch-dev] libdispatch's thread model

Thu Jan 28 07:07:28 PST 2010

Hi,

On 27.01.2010 17:43, Daniel A. Steffen wrote:
> On Jan 27, 2010, at 8:01 AM, Mario Schwalbe wrote:
>> There's similar code in _dispatch_root_queues_init() for the implementation
>> using pthread workqueues, but disabled. Why?
>
> that code is in fact not intended for the pthread workqueue implementation
> (which never looks at dgq_thread_pool_size) but for the pthread-pool implementation,
> it was mistakenly included inside the #if HAVE_PTHREAD_WORKQUEUES, it should be
> moved out, c.f. below
>
> The reason it is disabled is explained by the comment... the kernel workqueue mechanism
> has the ability to create new worker threads when all existing threads for a workqueue
> are blocked, ...

This sounds like the main advantage of work queues over thread pools: when a worker
thread blocks and releases the processor, another job from the queue can run in its place.

However, if libdispatch cannot create new worker threads when existing threads block
(I assume on I/O), the only consequence is that the overall completion time of a set
of jobs increases. That isn't too bad for applications that use libdispatch to perform
asynchronous computations and use dispatch sources to respond to I/O.

@Mark Heily: Is the Apache thread pool implementation you suggested capable of
creating new threads? I'm asking because I couldn't find any documentation that
describes its exact semantics. If it is not, I see no benefit in implementing
a work queue API that is no better than the existing thread pool.

> ... the pthread-pool implementation does not, so a non-overcommitting queue with
> a limited pool size can result in deadlocks for client code that expects to be
> able to submit many concurrently executing interdependent workitems that block.

How?

1. Let's assume someone manages to submit n jobs that block each other, causing a
deadlock, on a machine with n processors. The thread pool implementation cannot
execute any other jobs anymore, so the application can be considered erroneous/dead.
The work queue implementation can still execute jobs waiting in the queue, so
the application (as a whole) can still make some progress. It cannot finish either,
though: assuming the results of the blocked jobs matter, it has to wait for them at
some point (dispatch_group_wait()), and that wait will block forever as well.

2. A deadlock requires a cycle in the dependency graph. But submitted jobs that are
still waiting in the queue have not started executing yet and hence cannot have acquired
any resource that would prevent other jobs from executing.
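
To make case 1 concrete, here is a minimal sketch in C. It is a toy model of my
argument, not libdispatch code: simulate(), sim_result and all parameters are
hypothetical names, and the assumption is that the first `deadlocked` jobs in the
queue block on each other forever once started, while every other job finishes
immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one simulated run (hypothetical type, for illustration only). */
typedef struct {
    size_t completed;   /* jobs that ran to completion                    */
    size_t stuck;       /* jobs that started and are blocked forever      */
    bool   group_done;  /* would a wait on the whole set of jobs return?  */
} sim_result;

/*
 * pool_size  - initial number of worker threads
 * deadlocked - number of jobs at the head of the queue that block forever
 * total_jobs - number of jobs submitted in total
 * can_grow   - true models a work queue that adds a worker when all existing
 *              workers are blocked; false models a fixed-size thread pool
 */
sim_result simulate(size_t pool_size, size_t deadlocked,
                    size_t total_jobs, bool can_grow)
{
    assert(pool_size > 0 && deadlocked <= total_jobs);

    sim_result r = { 0, 0, false };
    size_t workers = pool_size;
    size_t blocked = 0;

    for (size_t job = 0; job < total_jobs; job++) {
        if (blocked == workers) {
            if (!can_grow)
                break;          /* no free worker left: the queue stalls */
            workers++;          /* work queue model: add one more worker */
        }
        if (job < deadlocked)
            blocked++;          /* this job occupies its worker forever  */
        else
            r.completed++;      /* job finishes, its worker is free again */
    }

    r.stuck = blocked;
    r.group_done = (r.completed == total_jobs);
    return r;
}
```

With pool_size = 4, deadlocked = 4 and total_jobs = 10, the fixed pool completes
0 jobs, whereas the growing pool completes the 6 independent jobs. In both cases
group_done stays false, which is the point: the work queue makes more progress, but
a wait on all jobs never returns either way.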

ciao,
Mario