[libdispatch-dev] lib dispatch worker threads may loop depending on compiler optimization
Dmitri Shubin
sbn at tbricks.com
Fri Sep 9 00:51:28 PDT 2011
On 09.09.2011 11:39, Paolo Bonzini wrote:
> Yes, the documentation is conservative. However, if you look at the
> code (and this hasn't changed in recent GCC):
>
> * moving references before the builtin is clearly prohibited, and so
> is speculating them;
>
> * depending on the target, memory stores may not be globally visible
> yet, and previous memory loads may not yet be satisfied;
AFAIU this is the critical point here -- the store to tail->do_next
must be globally visible before the exchange.
>
> * however, the compiler will *never* sink references below the
> builtin, which is what I meant by "compiler-wise it is always a full
> optimization barrier" like asm("":::"memory").
>
> So I find it extremely unlikely that this is the cause of the problem.
>
> It is more likely that an optimization barrier like the above no-op
> asm is missing in the source, and clang is getting away without it.
> Remember that while the x86 does not need explicit read or write
> barriers in the assembly (only full barriers), you do need to write
> the barriers in the code and expand them to no-op asms. Otherwise the
> compiler may move references across the barrier.
>
Yes, that was the fix -- adding __asm__ __volatile__("" ::: "memory")
before the call to __sync_lock_test_and_set().
But what about other targets (not x86)? It looks like they need a true
write memory barrier here.
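
A minimal sketch of the pattern discussed above, assuming GCC or clang.
This is not the actual libdispatch code: struct item, queue_tail, push()
and store_barrier() are hypothetical names for illustration, and only
the do_next field name comes from this thread. On x86 the barrier is the
no-op asm from the fix. On other targets the sketch conservatively
assumes a full barrier via __sync_synchronize(), because
__sync_lock_test_and_set() is documented as an acquire barrier only.

```c
#include <stddef.h>

/* Hypothetical queue node; only the field name do_next is from the thread. */
struct item {
    struct item *do_next;
    int value;
};

/* Compiler-only barrier: emits no instructions, but the compiler may not
 * move memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Assumption: on x86 the atomic exchange already orders earlier stores in
 * hardware, so only the compiler needs to be constrained. Elsewhere, fall
 * back to a full hardware barrier to stay on the safe side. */
#if defined(__i386__) || defined(__x86_64__)
#define store_barrier() compiler_barrier()
#else
#define store_barrier() __sync_synchronize()
#endif

/* Hypothetical shared tail pointer of a singly linked queue. */
static struct item *queue_tail;

/* Append one item and return the previous tail (NULL if the queue was empty). */
static struct item *push(struct item *it)
{
    struct item *prev;

    it->do_next = NULL;   /* this store must not sink below the exchange */
    store_barrier();
    prev = __sync_lock_test_and_set(&queue_tail, it);
    if (prev)
        prev->do_next = it;   /* link the old tail to the new item */
    return prev;
}
```

Whether a lighter write-only barrier would be enough on a given non-x86
target depends on that target's memory model, so the full barrier here
is a conservative choice rather than a recommendation.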