Race in dispatch_semaphore_wait(DISPATCH_TIME_NOW)
Hi!

In our application we assume that dispatch_semaphore_wait(sema, DISPATCH_TIME_NOW) never blocks, but we recently found it blocked waiting on the OS semaphore.

Checking the libdispatch source (we use version 215, but the latest available version, 339.90.1, looks similar), we found the following scenario that can lead to blocking.

Thread 1: dispatch_semaphore_wait(sema, DISPATCH_TIME_NOW);
Thread 2: dispatch_semaphore_signal(sema); dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER);

Initial state: dsema_value = 0, internal OS semaphore count = 0

1. Thread 1 enters dispatch_semaphore_wait() and decrements dsema_value from 0 to -1. The result is < 0, so it takes the slow path.
2. Thread 2 signals the semaphore, incrementing dsema_value from -1 to 0. It also takes the slow path and signals the OS semaphore, incrementing its count from 0 to 1.
3. Thread 1 enters the DISPATCH_TIME_NOW case of the switch in dispatch_semaphore_wait_slow(), but since dsema_value == 0 it falls through to the DISPATCH_TIME_FOREVER case.
4. Thread 2 enters dispatch_semaphore_wait(), decrements dsema_value from 0 to -1, takes the slow path, waits on the OS semaphore, decrements its count back to 0, and returns from dispatch_semaphore_wait().
5. Thread 1 waits forever on the OS semaphore.

Interestingly, the simple test we wrote to expose this race reproduces it easily on Linux (1-5 runs). More runs are needed to see it on Solaris (~5-10 runs). But we could not make it hang on OS X (10.9.3). Are there changes that are not included in the open source libdispatch?

Please check whether our analysis is correct. Thanks!
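As a rough illustration, the interleaving in steps 1-5 can be replayed deterministically on a single thread with a small model. This is only a sketch under stated assumptions: it is not libdispatch code, the struct and every model_* function name are hypothetical, and two plain counters stand in for dsema_value and the OS semaphore count. It follows the behaviour described in the steps above, nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not libdispatch code: two plain counters stand in for
 * dsema_value and the count of the underlying OS semaphore. */
struct model_sema {
    long value;    /* models dsema_value */
    long os_count; /* models the OS semaphore count */
};

struct replay_result {
    bool thread1_would_block;
    long final_value;
    long final_os_count;
};

/* Fast path of wait: decrement, and report whether the slow path is needed. */
static bool model_wait_needs_slow_path(struct model_sema *s) {
    s->value -= 1;
    return s->value < 0;
}

/* Signal: increment; if a waiter had registered, also post the OS semaphore. */
static void model_signal(struct model_sema *s) {
    s->value += 1;
    if (s->value <= 0) {
        s->os_count += 1;
    }
}

/* DISPATCH_TIME_NOW case as described in step 3: undo the decrement and time
 * out only while the value is still negative; otherwise fall through. */
static bool model_now_case_times_out(struct model_sema *s) {
    if (s->value < 0) {
        s->value += 1;
        return true;
    }
    return false; /* falls through to the blocking wait */
}

/* Blocking OS wait: returns true if a count was available to consume. */
static bool model_os_wait_returns(struct model_sema *s) {
    if (s->os_count > 0) {
        s->os_count -= 1;
        return true;
    }
    return false; /* the caller would block here */
}

/* Replays steps 1-5 in the reported order. */
struct replay_result replay_reported_interleaving(void) {
    struct model_sema s = { 0, 0 };
    bool slow, timed_out, returned;

    slow = model_wait_needs_slow_path(&s);    /* step 1: Thread 1, value 0 -> -1 */
    assert(slow);
    model_signal(&s);                         /* step 2: Thread 2, value -> 0, count -> 1 */
    timed_out = model_now_case_times_out(&s); /* step 3: Thread 1 sees value == 0 */
    assert(!timed_out);
    slow = model_wait_needs_slow_path(&s);    /* step 4: Thread 2, value 0 -> -1 */
    assert(slow);
    returned = model_os_wait_returns(&s);     /* step 4: Thread 2 consumes the count */
    assert(returned);
    returned = model_os_wait_returns(&s);     /* step 5: Thread 1 finds count == 0 */

    struct replay_result r = { !returned, s.value, s.os_count };
    return r;
}
```

In this model, replay_reported_interleaving() ends with the value at -1, the OS count at 0, and Thread 1 left in the blocking wait, which matches step 5. The model says nothing about how likely the interleaving is on a given platform.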
Hi Dmitri,

This is almost surely a bug in the port to these platforms. Apple’s operating system uses Mach to implement this feature, but Mach is not available on Linux and Solaris.

DaveZ
On May 21, 2014, at 2:14 AM, Dmitri Shubin <sbn@tbricks.com> wrote:
[...]
_______________________________________________
libdispatch-dev mailing list
libdispatch-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/libdispatch-dev
participants (2)
- Dave Zarzycki
- Dmitri Shubin