Bug 13184 - rwlocks can go into deadlock
Summary: rwlocks can go into deadlock
Status: RESOLVED WONTFIX
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl
Version: 2.14
Importance: P2 critical
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-12 21:18 UTC by Frank Denis
Modified: 2014-06-13 14:29 UTC
CC: 4 users

See Also:
Host:
Target: i586
Build:
Last reconfirmed:
fweimer: security-


Attachments
Fix (294 bytes, patch)
2012-01-16 21:47 UTC, Frank Denis
Details | Diff

Description Frank Denis 2011-09-12 21:18:42 UTC
This issue can be reliably reproduced on glibc 2.7, glibc 2.11, glibc 2.14 and glibc-current (stock, without any vendor patches).

A race condition can cause a pthreads rwlock to end up with a negative __nr_readers count, leading to a deadlock.

This issue seems to be hardware dependent. I haven't been able to reproduce it on single-core Opterons or on Intel hardware, but it happens within minutes on several boxes, all powered by dual-core Opteron 270s.

A test program triggering this issue can be found at: http://download.pureftpd.org/misc/nptl-bugs/crazylock.c
Comment 1 Rich Felker 2011-09-12 21:32:43 UTC
Please fix your test case; as is, it's invalid because it invokes undefined behavior as soon as thread_main tries to read automatic storage belonging to spawn_threads, which may have already returned. (I'm getting an immediate assertion failure in cache_get due to n containing nonsense.) You could fix this by passing n by value (casting it to (void *)) rather than by reference.

I'm very interested in race-condition bugs like this, since I'm writing a threads implementation, and bugs in glibc/NPTL often end up either pointing to similar bugs in my code or leading to ideas for superior implementations. But I'm not yet convinced you've found a bug, since your test code is at least mildly buggy...
Comment 2 Frank Denis 2011-09-12 21:42:09 UTC
Hi Rich,

Thanks for your feedback. The bug of passing automatic storage by reference has been fixed.
However, the race condition still happens: 

crazylock-nlibc: crazylock.c:26: cache_get: Assertion `rwlock.__data.__nr_readers <= n' failed.
crazylock-nlibc: crazylock.c:32: cache_get: Assertion `rwlock.__data.__nr_readers < n' failed.
crazylock-nlibc: crazylock.c:32: cache_get: Assertion `rwlock.__data.__nr_readers < n' failed.

The affected boxes all have 2x dual-core Opteron 270 CPUs.
Comment 3 Rich Felker 2011-09-12 22:25:26 UTC
I'm confused how the assertions relate to your original claim that __nr_readers can go negative. Having __nr_readers momentarily exceed the actual number of readers is not necessarily a bug unless it actually overflows and causes a later comparison to evaluate incorrectly, and this test program does not seem to be able to establish that it overflows.

BTW, I tried testing on a dual-core Atom and could not get the assertions to fail.
Comment 4 Frank Denis 2011-09-12 22:41:35 UTC
__nr_readers actually goes negative. When a deadlock occurs, __nr_readers equals 0xFFFFFFFE.
Comment 5 Frank Denis 2011-09-13 02:01:52 UTC
Please note that this bug happens on dual-core Opterons, but only with x86 targets.
The exact same code never deadlocks on x86_64, on the same hardware.
Comment 6 Philippe Mathieu 2012-01-16 20:02:45 UTC
I've observed the same problem on Opteron 270 processors, even with x86_64 targets.

> ./crazylock 4
crazylock: crazylock.c:37: cache_get: Assertion `rwlock.__data.__nr_readers < n' failed.
crazylock: crazylock.c:31: cache_get: Assertion `rwlock.__data.__nr_readers <= n' failed.
crazylock: crazylock.c:31: cache_get: Assertion `rwlock.__data.__nr_readers <= n' failed.
crazylock: crazylock.c:37: cache_get: Assertion `rwlock.__data.__nr_readers < n' failed.

> file crazylock
crazylock: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
Comment 7 Frank Denis 2012-01-16 21:47:31 UTC
Created attachment 6160 [details]
Fix
Comment 8 Siddhesh Poyarekar 2012-10-04 12:31:31 UTC
I don't have an Opteron 270 to test this on, but it looks like something that shouldn't happen, since the instruction you've 'locked' with your patch should already be isolated by rwlock.__data.__lock.  If the patch actually works, then this might possibly even be a CPU bug that we're patching around.

If you can get someone from AMD to comment on this or give a rationale for why the bug appears and why your fix works, then please post the patch on libc-alpha for review.  You may want to follow the checklist here for your patch submission:

http://sourceware.org/glibc/wiki/Contribution%20checklist
Comment 9 Rich Felker 2012-10-04 12:36:49 UTC
I suspect it's this bug: http://timetobleed.com/mysql-doesnt-always-suck-this-time-its-amd/
Comment 10 Siddhesh Poyarekar 2012-10-04 13:13:10 UTC
Thanks for digging that out; it looks like a perfect candidate for this report.  I don't think it makes sense to fix this in any way though.  If cmpxchg does not work, then there are bound to be bigger problems than just atomically decrementing/incrementing nr_readers and the real fix is to fix/change the hardware.  Frank, please go ahead and post your patch anyway, so that we know what the glibc community consensus is on this.
Comment 11 Rich Felker 2012-10-04 17:55:54 UTC
I agree that it's ridiculous to try to fix a hardware bug like this. The kernel really should just refuse to do SMP on broken hardware like this unless it can do a transparent workaround in kernelspace. Then AMD can deal with replacing all the broken CPUs they sold when people start complaining that SMP no longer works. Atomic ops being broken and needing workarounds is not part of the x86 or x86_64 ABI.

With that said, it would be nice to understand the bug. Is it that lock-prefixed read-modify-write operations (like "lock; inc (%reg)") get reordered with respect to xchg or lock-prefixed cmpxchg? Or is it only non-lock-prefixed read-modify-write operations (like "inc (%reg)") that are getting reordered with respect to the latter?

I don't think there's any workaround that would not incur significant costs on non-broken systems, and even if there were, I think "fixing" this is wrong in principle.
Comment 13 Florian Weimer 2014-06-13 14:29:07 UTC
I don't think we can fix such hardware errata, certainly not without support from the CPU vendor.

In this particular case, the affected CPU was released in 2005, so it is unlikely that this part is still in use.