This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: [PATCH] New condvar implementation that provides stronger ordering guarantees.
- From: Rich Felker <dalias at libc dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>, Marcus Shawcroft <marcus dot shawcroft at gmail dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Richard Henderson <rth at redhat dot com>, Carlos O'Donell <codonell at redhat dot com>, Mike Frysinger <vapier at gentoo dot org>, Chung-Lin Tang <chunglin_tang at mentor dot com>, Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, Andreas Krebbel <krebbel at linux dot ibm dot com>, Kaz Kojima <kkojima at rr dot iij4u dot or dot jp>, Chris Metcalf <cmetcalf at tilera dot com>, David Miller <davem at davemloft dot net>, Darren Hart <dvhart at infradead dot org>
- Date: Mon, 23 Feb 2015 13:34:47 -0500
- Subject: Re: [PATCH] New condvar implementation that provides stronger ordering guarantees.
- Authentication-results: sourceware.org; auth=none
- References: <1424456307 dot 20941 dot 122 dot camel at triegel dot csb> <20150222223722 dot GA23507 at brightrain dot aerifal dot cx> <1424690809 dot 22790 dot 32 dot camel at triegel dot csb> <20150223175939 dot GE23507 at brightrain dot aerifal dot cx> <1424714997 dot 22790 dot 40 dot camel at triegel dot csb>
On Mon, Feb 23, 2015 at 07:09:57PM +0100, Torvald Riegel wrote:
> On Mon, 2015-02-23 at 12:59 -0500, Rich Felker wrote:
> > On Mon, Feb 23, 2015 at 12:26:49PM +0100, Torvald Riegel wrote:
> > > On Sun, 2015-02-22 at 17:37 -0500, Rich Felker wrote:
> > > > On Fri, Feb 20, 2015 at 07:18:27PM +0100, Torvald Riegel wrote:
> > > > > + Limitations:
> > > > > + * This condvar isn't designed to allow for more than
> > > > > + WSEQ_THRESHOLD * (1 << (sizeof(GENERATION) * 8 - 1)) calls to
> > > > > + __pthread_cond_wait. It probably only suffers from potential ABA issues
> > > > > + afterwards, but this hasn't been checked nor tested.
> > > > > + * More than (1 << (sizeof(QUIESCENCE_WAITERS) * 8)) - 1 concurrent waiters
> > > > > + are not supported.
> > > > > + * Beyond what is allowed as errors by POSIX or documented, we can also
> > > > > + return the following errors:
> > > > > + * EPERM if MUTEX is a recursive mutex and the caller doesn't own it.
> > > >
> > > > This is not beyond POSIX; it's explicitly specified as a "shall fail".
> > >
> > > http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_wait.html
> > >
> > > [EPERM]
> > > The mutex type is PTHREAD_MUTEX_ERRORCHECK or the mutex is a
> > > robust mutex, and the current thread does not own the mutex.
> > >
> > > POSIX does not seem to allow EPERM for *recursive mutexes*. Is there an
> > > update that I'm missing?
> > Well, it doesn't specifically require it for recursive mutexes (I
> > missed that), but it also doesn't disallow it.
> Yes, it doesn't disallow it explicitly, but for it to be allowed, it
> would have to be listed at least in the "may fail", right?
Arbitrary implementation-defined errors are allowed by POSIX, with
some restrictions. It's not permitted to reuse one of the standard
"shall fail" or "may fail" errors to diagnose a semantically different
condition that could prevent accurately diagnosing the standard error,
but I think you can argue that this doesn't apply here, since the
non-standard use of EPERM would only happen in a different usage case
(a different type of mutex) from the standard-specified one. Still, I
think adding an explicit "may fail" for recursive mutexes would be nicer.
> > > > > + * EOWNERDEAD or ENOTRECOVERABLE when using robust mutexes. Unlike
> > > > > + for other errors, this can happen when we re-acquire the mutex; this
> > > > > + isn't allowed by POSIX (which requires all errors to virtually happen
> > > > > + before we release the mutex or change the condvar state), but there's
> > > > > + nothing we can do really.
> > > >
> > > > Likewise these are "shall fail" errors specified by POSIX, and while
> > > > it's not clearly written in the specification, it's clear that they
> > > > only happen on re-locking.
> > >
> > > Yes, they are "shall fail". I also agree that POSIX *should* make it
> > > clear that they can happen after releasing and when acquiring the mutex
> > > again -- but that's not what the spec says:
> > >
> > > "Except in the case of [ETIMEDOUT], all these error checks shall act as
> > > if they were performed immediately at the beginning of processing for
> > > the function and shall cause an error return, in effect, prior to
> > > modifying the state of the mutex[...]"
> > OK, then I think that text is a bug. There's no way that mutex locking
> > errors could be meaningful before the mutex is unlocked.
> > > Until these two get clarified in the spec, I consider the comments
> > > correct. We can certainly extend them and document why we think this
> > > behavior is The Right Thing To Do. But we need to document where we
> > > deviate from what the spec states literally.
> > Yes. Would you like to submit the bug report or should I?
> If you have some time, please do.