This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/10465] New: Robust futex cleanup issues (kernel).
- From: "kkylheku at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: 30 Jul 2009 22:14:08 -0000
- Subject: [Bug nptl/10465] New: Robust futex cleanup issues (kernel).
- Reply-to: sourceware-bugzilla at sourceware dot org
I've discovered two bugs in the kernel which interfere with correct operation
of robust mutexes. I'm logging this in the glibc bugzilla, since this most
closely affects glibc. It's probably best for the glibc developers to review
and play with this.
Both bugs are in the kernel/futex.c module, in the handle_futex_death function.
Bug 1.
The first bug is a lost wakeup. Suppose the calling thread owns a futex that
is being cleaned up. The handle_futex_death function recognizes this and flips
the state of the futex so that the TID field is zero and the FUTEX_OWNER_DIED
flag is set. It then calls futex_wake to wake up one waiting task.
The problem is that the woken task can die before it is able to acquire the
futex. handle_futex_death is then called in the context of that task. (Why?
Although the task does not yet own the futex, and so has not put it into its
robust list, it has declared that it has an operation pending for that futex.)
This time the TID field of the futex is zero, which does not match the
caller's thread ID, so the function does nothing. The terminating thread has
eaten the wakeup and died.
The correct logic for this function is to wake the futex unconditionally,
whether or not the caller owns it, in case the terminating caller is a
prospective owner that has consumed an exclusive wakeup. The futex wakeup must
occur outside of the body of the if statement which checks for the futex's TID
being that of the caller.
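To make the sequence concrete, here is a small userspace model of the two behaviours. It is only a sketch and not kernel code: struct model, model_wake, death_buggy, death_fixed and scenario are hypothetical names, and model_wake merely counts wakeups instead of calling futex_wake. The bit masks are the ones defined for the futex word in <linux/futex.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout of the futex word, as in <linux/futex.h>. */
#define FUTEX_WAITERS    0x80000000u
#define FUTEX_OWNER_DIED 0x40000000u
#define FUTEX_TID_MASK   0x3fffffffu

/* Hypothetical model: one futex word plus a count of wakeups issued. */
struct model {
    uint32_t word;
    int wakeups;
};

/* Stand-in for futex_wake(); it only counts the call. */
static void model_wake(struct model *m)
{
    m->wakeups++;
}

/* Behaviour described in the report: wake only when the caller is the owner. */
static void death_buggy(struct model *m, uint32_t tid)
{
    if ((m->word & FUTEX_TID_MASK) == tid) {
        uint32_t old = m->word;

        m->word = (old & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
        if (old & FUTEX_WAITERS)
            model_wake(m);
    }
}

/* Proposed behaviour: the wakeup happens outside the ownership check. */
static void death_fixed(struct model *m, uint32_t tid)
{
    if ((m->word & FUTEX_TID_MASK) == tid)
        m->word = (m->word & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
    model_wake(m);
}

/*
 * Owner (TID 100) dies, then the waiter it woke (TID 200) dies before
 * acquiring the futex. Returns the number of wakeups issued.
 */
static int scenario(void (*death)(struct model *, uint32_t))
{
    struct model m = { 100u | FUTEX_WAITERS, 0 };

    assert(death != NULL);
    death(&m, 100); /* owner dies; one waiter is woken */
    death(&m, 200); /* that waiter dies with the futex as its pending op */
    return m.wakeups;
}
```

In this model, scenario(death_buggy) issues a single wakeup, which the dying waiter consumes, while scenario(death_fixed) issues a second one that can reach another waiter.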
Bug 2.
Since 2.6.22, the kernel has supported FUTEX_PRIVATE_FLAG for more efficient
futexes that are assumed not to be shared across address spaces. The GNU C
library has added support for this: mutexes which are not process-shared use
this flag to request faster futex operations.
When this flag is used, it changes how a futex key is computed in the kernel.
It is a bug if a futex wait operation is done with the flag, but a wake
operation on the same futex is done without the flag, or vice versa.
Operations which are mismatched in this way will not rendezvous; they hash the
same futex location to different hash buckets using different keys.
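The failure to rendezvous can be illustrated with a toy wait queue in userspace. This is a sketch under simplifying assumptions, not the kernel's data structure: struct key, struct waitq, key_match, model_wait and model_wake are hypothetical names, and the real futex key carries more information than an address and one bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a futex key; the real key has more fields. */
struct key {
    uintptr_t addr; /* address of the futex word */
    bool shared;    /* false when the operation used the private flag */
};

static bool key_match(struct key a, struct key b)
{
    return a.addr == b.addr && a.shared == b.shared;
}

#define MAX_WAITERS 8

/* Toy wait queue: just the keys of the tasks currently waiting. */
struct waitq {
    struct key keys[MAX_WAITERS];
    size_t n;
};

/* Record a waiter under the key derived from (addr, shared). */
static void model_wait(struct waitq *q, uintptr_t addr, bool shared)
{
    assert(q->n < MAX_WAITERS);
    q->keys[q->n++] = (struct key){ addr, shared };
}

/* Wake up to nr waiters whose key matches; returns the number woken. */
static int model_wake(struct waitq *q, uintptr_t addr, bool shared, int nr)
{
    struct key k = { addr, shared };
    int woken = 0;
    size_t i = 0;

    while (i < q->n && woken < nr) {
        if (key_match(q->keys[i], k)) {
            /* Remove the waiter by moving the last entry into its slot. */
            q->keys[i] = q->keys[--q->n];
            woken++;
        } else {
            i++;
        }
    }
    return woken;
}
```

With a waiter queued by model_wait(&q, 0x1000, false), a wake with shared set to true finds nothing and leaves the waiter queued. Only a wake with shared set to false removes it.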
In the library, the process-shared property of a mutex is independent of the
robust property. A robust mutex may be process-shared (does not use the futex
private flag) or private (uses the private flag).
However, the handle_futex_death function assumes that a futex being cleaned up
is always process-shared. When invoking the futex_wake operation, it supplies
a non-null pointer for the fshared parameter, &curr->mm->mmap_sem:
    futex_wake(uaddr, &curr->mm->mmap_sem, 1, FUTEX_BITSET_MATCH_ANY);
If the futex being cleaned up is process-private, this will not correctly wake
it up; a process-private futex should be woken like this:
    futex_wake(uaddr, NULL, 1, FUTEX_BITSET_MATCH_ANY);
Since there is no nice way to tell whether or not a futex is process shared,
what the function can do is wake it up both ways.
I locally use the following patch against 2.6.26, which addresses both
problems. It moves the wakeup outside of the if and performs up to two
wakeups. First the futex is woken as a private futex. If that doesn't find any
waiting threads, it falls back on waking the futex as if it were
process-shared:
--- kernel/futex.c
+++ kernel/futex.c
@@ -1889,13 +1908,25 @@
 		if (nval != uval)
 			goto retry;
+	}
-		/*
-		 * Wake robust non-PI futexes here. The wakeup of
-		 * PI futexes happens in exit_pi_state():
-		 */
-		if (!pi && (uval & FUTEX_WAITERS))
-			futex_wake(uaddr, &curr->mm->mmap_sem, 1,
+	/*
+	 * Wake robust non-PI futexes here. The wakeup of
+	 * PI futexes happens in exit_pi_state().
+	 * Note that we don't know whether any of these futexes
+	 * are shared or private! Robust mutexes don't have to be
+	 * process shared. So we wake up each one both ways.
+	 */
+	if (!pi) {
+		/* Wake using the cheaper process-private method first. */
+		int nr_woken = futex_wake(uaddr, NULL, 1,
+					  FUTEX_BITSET_MATCH_ANY);
+
+		/* If none were found or the operation didn't work,
+		 * do the more expensive process-shared hash.
+		 */
+		if (nr_woken <= 0)
+			futex_wake(uaddr, &curr->mm->mmap_sem, 1,
 				   FUTEX_BITSET_MATCH_ANY);
 	}
 	return 0;
--
Summary: Robust futex cleanup issues (kernel).
Product: glibc
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: nptl
AssignedTo: drepper at redhat dot com
ReportedBy: kkylheku at gmail dot com
CC: glibc-bugs at sources dot redhat dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=10465