This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 0/6][BZ #11588] pi-condvars: add priority inheritance for pthread_cond_* internal lock
- From: Rich Felker <dalias at libc dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: gratian dot crisan at ni dot com, libc-alpha at sourceware dot org, Darren Hart <dvhart at linux dot intel dot com>, Carlos O'Donell <carlos at redhat dot com>, Joseph Myers <joseph at codesourcery dot com>, Jeff Law <law at redhat dot com>, Scot Salmon <scot dot salmon at ni dot com>, Siddhesh Poyarekar <spoyarek at redhat dot com>, Thomas Gleixner <tglx at linutronix dot de>, Clark Williams <williams at redhat dot com>, "Paul E. McKenney" <paulmck at linux dot vnet dot ibm dot com>, Will Newton <will dot newton at linaro dot org>, gratian at gmail dot com
- Date: Mon, 18 Aug 2014 23:50:24 -0400
- Subject: Re: [PATCH 0/6][BZ #11588] pi-condvars: add priority inheritance for pthread_cond_* internal lock
- References: <OF6ABEE614 dot FAE80AD2-ON86257D0E dot 006B38F4-86257D0E dot 0070034A at ni dot com> <1406680317-20189-1-git-send-email-gratian dot crisan at ni dot com> <1408396459 dot 14301 dot 29 dot camel at triegel dot csb>
On Mon, Aug 18, 2014 at 11:14:19PM +0200, Torvald Riegel wrote:
> On Tue, 2014-07-29 at 19:31 -0500, gratian.crisan@ni.com wrote:
> > Torvald Riegel made us aware of the new POSIX changes related to condvars
> > (http://austingroupbugs.net/view.php?id=609) and C++11 clarification
> > (http://cplusplus.github.com/LWG/lwg-active.html#2190)
> > We believe we can work on these issues in parallel and if they end up
> > colliding we will fix it.
>
> I've asked the Austin Group about the current status of #609. It seems
> they want the stronger ordering guarantees:
> http://austingroupbugs.net/view.php?id=609#c2349
>
> I have an implementation that fulfills these guarantees, but I don't
> think it's possible to fully implement PI with the stronger guarantees
> and the futex operations that we currently have. The possible options
Note that the kernel basically already has all the functionality
needed to do fully-safe cond vars as part of the futex system, but
just doesn't expose it. The problems are:
- There's no way to do a futex wait that doesn't restart if
  interrupted by a signal.
- There's no way to release the mutex atomically with doing the futex
  wait.
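
To make the second problem concrete, here is a rough sketch of a wait path built on today's FUTEX_WAIT. It is a toy, not glibc's code: the names toy_cond_wait_once, in_window and bump_seq are made up for illustration, and the mutex convention (0 = unlocked, 1 = locked) is an assumption. The in_window hook stands in for another thread running in the gap between the unlock and the wait. It says nothing about the signal-restart problem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper: glibc exposes no futex() function, so use syscall(). */
static long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* Hypothetical toy wait path (not glibc's). `seq` is the cond var futex
   word; `lock` is a mutex word with 0 = unlocked, 1 = locked (assumed).
   `in_window`, if non-NULL, runs in the gap between the two steps.
   Returns 0 if woken, otherwise the errno from the futex call. */
static int toy_cond_wait_once(uint32_t *seq, uint32_t *lock,
                              void (*in_window)(uint32_t *seq))
{
    /* The caller must hold the mutex on entry. */
    assert(__atomic_load_n(lock, __ATOMIC_RELAXED) == 1);
    uint32_t seen = __atomic_load_n(seq, __ATOMIC_ACQUIRE);

    /* Step 1: release the mutex in userspace. */
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);

    /* Gap: the mutex is free, but we are not yet queued on `seq`. */
    if (in_window)
        in_window(seq);

    /* Step 2: block only if `seq` still holds the value we saw. */
    if (futex_wait(seq, seen) == -1)
        return errno;
    return 0;
}

/* Stand-in for a signaler that slips into the gap. */
static void bump_seq(uint32_t *seq)
{
    __atomic_fetch_add(seq, 1, __ATOMIC_RELEASE);
}
```

Calling toy_cond_wait_once(&seq, &lock, bump_seq) with seq = 0 and lock = 1 fails with EAGAIN rather than blocking, because the word changed between the unlock and the wait. The value check in FUTEX_WAIT is what papers over the gap today, and it only works because the cond var word itself gets read and written.
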
Note that the kernel has a FUTEX_WAKE_OP command that lets you perform
writes after acquiring the futex key and hash bucket, but there's no
similar command for doing writes after getting ready to wait. What
would be ideal would be a FUTEX_WAIT_OP command that:
1. Acquires futex key and hash buckets for uaddr and uaddr2.
2. Performs an atomic operation on uaddr2.
3. Performs the comparison on the old value from uaddr2 and possibly
   performs a futex wake on the already-acquired uaddr2 futex.
4. Waits on the already-acquired uaddr futex, but without
   restartability of the syscall.
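
For comparison, this is roughly what steps 2 and 3 look like with the FUTEX_WAKE_OP command that exists today, using the FUTEX_OP() encoding from <linux/futex.h>. The helper name wake_cond_and_unlock and the mutex convention (0 = unlocked, 1 = locked, 2 = locked with waiters) are assumptions for the sketch. FUTEX_WAIT_OP itself does not exist, so nothing here calls it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed mutex word convention (not from the message):
   0 = unlocked, 1 = locked, 2 = locked with waiters. */

/* Existing FUTEX_WAKE_OP. In one syscall the kernel:
   - stores 0 into *mutex_word (FUTEX_OP_SET with oparg 0),
   - wakes up to `nr_cond` waiters on cond_word,
   - wakes up to one waiter on mutex_word if its old value was > 1.
   Returns the total number of waiters woken, or -1 with errno set. */
static long wake_cond_and_unlock(uint32_t *cond_word, uint32_t *mutex_word,
                                 int nr_cond)
{
    assert(nr_cond >= 0);
    long op = FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1);
    return syscall(SYS_futex, cond_word, FUTEX_WAKE_OP, (long)nr_cond,
                   (long)1 /* val2, passed in the timeout slot */,
                   mutex_word, op);
}
```

With no waiters on either word, a call with *mutex_word == 2 returns 0 and leaves *mutex_word == 0. The proposed command would presumably take the same (uaddr, val, val2, uaddr2, op) shape, but end by queueing the caller on uaddr instead of waking waiters there; that argument layout is a guess, not something the kernel defines.
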
This is basically a full WAIT analogue for the current FUTEX_WAKE_OP
command. It would allow the cond var implementation to unlock the
mutex atomically with waiting on the cond var futex, and would make it
possible to achieve signal and broadcast with trivial use of
FUTEX_WAKE. Most importantly, it would solve both the sequence number
issue AND the self-synchronized destruction issue (destroying or
unmapping the cond var immediately after the last waiter is unblocked)
since the associated implementation of process-shared cond vars would
never access the cond var object at all (except to read the pshared
flag and other attributes, before performing the operation); the
entire wait operation takes place simply using the address as a futex
key, without ever reading from or writing to it.
I'm not sure if this proposed FUTEX_WAIT_OP would be useful for
private cond vars. In some ways it's certainly attractive, but I'm not
sure how easy it would be to use with requeue, and in general private
gives you a lot more opportunities to optimize in userspace. But for
process-shared, it seems like the ideal solution to all the open
issues.
Rich