This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] mutex destruction (#13690): problem description and workarounds
- From: Torvald Riegel <triegel at redhat dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Thu, 27 Nov 2014 12:11:09 +0100
- Subject: Re: [RFC] mutex destruction (#13690): problem description and workarounds
- Authentication-results: sourceware.org; auth=none
- References: <1396621230 dot 10643 dot 7191 dot camel at triegel dot csb> <54762514 dot 2030301 at redhat dot com>
On Wed, 2014-11-26 at 14:08 -0500, Carlos O'Donell wrote:
> On 04/04/2014 10:20 AM, Torvald Riegel wrote:
> > The problem at hand is that the glibc implementation assumes that
> > mutexes can be destroyed as soon as no thread is referencing a mutex
> > anymore in the sense of no function calls that use the mutex being
> > executed anymore.
>
> Is the difficulty with the definition the meaning of referenced?
> There is no way to talk about referenced with respect to other
> operations, when does the reference start, when does it end etc.
glibc code basically requires for correctness that no mutex calls are in
flight when the mutex is destroyed or after it has been destroyed (ie,
in particular no unlock is concurrently executing with destruction).
> > This first sub-issue would be straightforward to fix, but it's actually
> > the smaller issue here. The bigger one is that we have a pending
> > FUTEX_WAKE syscall on a memory location that can be reused for something
> > else. "Pending" here means that a thread can unlock the futex var (ie,
> > mutex.__data.__lock) and thus make the mutex available to be locked
> > and/or destructed by other threads, be suspended for whatever reason,
> > and then at any time later, resume and issue the futex syscall.
>
> This is always the problem with a lock that protects access to the
> destruction of itself.
>
> One can quickly grow a lock hierarchy which ends in an unfreeable object
> and use that to control access to freeing and allocating mutexes, but
> that way doesn't scale IMO.
Yes, performance is one issue. The other one is that with
process-shared mutexes, you can't simply use extra memory locations
outside of the mutex itself; so implementing something deferred becomes
impossible, or slow if you need to do it immediately..
> > === Workaround 1: Specify that spurious FUTEX_WAKE calls are possible
> >
> > FUTEX_WAIT is currently specified to return 0 if it got woken by a
> > FUTEX_WAKE call, and EINTR when wake-up happened due to a signal or
> > other spurious wakeup. The memory reuse thus possibly allows for
> > unrelated FUTEX_WAKE calls being injected into the code running a new
> > futex after memory reuse. If this code relies on the FUTEX_WAIT return
> > values to be correct (just regarding 0 vs. EINTR!), the pending
> > unrelated FUTEX_WAKE calls can break it.
> >
> > However, the existence of such code should be fairly unlikely:
> > * All futexes that are used for multiple wait/wake-up cycles (like
> > pthread mutexes, barriers, semaphores, ...) will have to be robust to
> > their *own* pending FUTEX_WAKEs anyway because this is how futexes are
> > designed.
> > * PI futexes do not seem to be woken by FUTEX_WAIT according to some
> > quick testing: FUTEX_LOCK_PI simply doesn't wake up and FUTEX_WAIT
> > returns success, whereas FUTEX_WAIT_REQUEUE_PI isn't woken either but
> > FUTEX_WAIT returns EINVAL (likely due to the other mutex being passed in
> > being NULL (which we'd need to perhaps check in the glibc code)). I
> > wasn't able to find any specification for these cases, however.
> >
> > I've reviewed glibc uses of FUTEX_WAIT (x86_64 and/or generic) and they
> > should all be robust to pending FUTEX_WAKEs. Likewise for GCC's libgomp
> > and libitm. I suppose that cases where there might be pending
> > FUTEX_CMP_REQUEUE calls (e.g., condvars) should be containable by using
> > a lock (and FUTEX_WAKE) at destruction.
> >
> > The case that could be affected by such a specification change is thus
> > single-wake-up futexes in programs that themselves never make those
> > futexes subject to the pending-FUTEX_WAIT-while-memory-reuse problem.
> > (I'm mentioning the latter because it's really not just glibc that's
> > affected, but futex users in general. So, arguably, correct programs
> > that use futexes have to deal with the issue already.)
> >
> > Pros:
> > * Little code change required in glibc to fix the problems.
> > * Typical mutex-using code shouldn't be affected by the specification
> > change (assuming that glibc/libgomp/libitm are representative).
> >
> > Cons:
> > * Change to futex specs.
>
> This is a "relaxed semantic" workaround.
>
> I don't like workaround 1 because it changes the semantics of the
> specification in ways that might adversely affect users. I am hesitant
> to remove the potentially useful and semantically simple non-spurious
> case. Despite the fact that Rich comments that there is a kernel bug,
> it might just be a kernel bug that we get a spurious wake.
I'm not really happy about it either, but it could be the easiest option
we have. It has some risk in that it might break one-shot futex uses in
custom synchronization code -- but I've never seen these in the wild.
> > === Workaround 1a: New FUTEX_WAKE_SPURIOUS operation that avoids the
> > specification change
> >
> > This is like Workaround 1, except that the kernel could add a new futex
> > op that works like FUTEX_WAKE except that:
> > * FUTEX_WAITs woken up by a FUTEX_WAKE_SPURIOUS will always return
> > EINTR. EINTR for spurious wakeups is already part of the spec, so
> > correct futex users are already handling this (e.g., glibc does).
> > * Make sure (and specify) that FUTEX_WAKE_SPURIOUS that hit other
> > futexes (e.g., PI) are ignored and don't cause wake-ups (or just benign
> > spurious wakeups already specified).
> >
> > Users of FUTEX_WAKE_SPURIOUS should have to do very little compared to
> > when using FUTEX_WAKE. The only thing that they don't have anymore is
> > the ability to distinguish between a real wakeup and a spurious one.
> > Single-use FUTEX_WAITs could be affected, but we don't have them in
> > glibc. The only other benefit from being able to distinguish between
> > real and spurious is in combination with a timeout: If the wake-up is
> > real on a single-use futex, there's no need to check timeouts again.
> > But will programs want to use this often, and will they need to have to
> > use FUTEX_WAKE_SPURIOUS in this case? I guess not.
> >
> > Pros:
> > * Correct futex uses will need no changes.
> > Cons:
> > * Needs a new futex operation.
>
> This is a "new semantic" workaround.
>
> I like this option the best. It isolates the change to a new operation
> and that operation allows a specific userspace implementation to have
> the wakeup always return EINTR such that a wakeup that carries over to
> a new futex (reusing the memory) doesn't think it was woken correctly.
One thing that is missing in my description of this workaround is that
we probably need a new FUTEX_WAIT_SPURIOUS syscall as well, so that we
can distinguish not only spurious from non-spurious wake calls, but only
those waiters that can tolerate spurious wake-ups from those that
cannot.