This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Futex error handling


On Tue, 2014-09-16 at 14:54 -0400, Rich Felker wrote:
> On Tue, Sep 16, 2014 at 08:12:38PM +0200, Torvald Riegel wrote:
> > > > FUTEX_WAKE, FUTEX_WAKE_OP:
> > > > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this
> > > > case.  This is due to how futexes work when combined with certain rules
> > > > for destruction of the underlying synchronization data structure; see my
> > > > description of the mutex destruction issue (but this can happen with
> > > > other data structures such as semaphores or cond vars too):
> > > > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html
> > > 
> > > Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT
> > > is reserved for BL/BP (and not NF). I don't see any point in
> > > having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm
> > > missing something.
> > 
> > I agree that I was a bit sloppy in the categorization.  You're right
> > that depending on how it's used, EFAULT can be just BL/BP.  This applies
> > to both FUTEX_WAKE and FUTEX_WAKE_OP, I think; the latter has just a
> > finite number of bits, so you can't avoid an ABA issue entirely.  You
> 
> I'm not sure what ABA issue you have in mind.

Sorry, I must have been thinking about the _BITSET variants when I wrote
this.  Not sure why... :)

> The EFAULT case with
> FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic
> operation on the futex int that's associated with the wake allows
> another thread to synchronize and determine that it may legally
> destroy the object before the actual wake is sent. FUTEX_WAKE_OP can
> fully avoid this by performing the atomic operation after looking up
> and locking the futex hash bucket, so that there's no further access
> after the atomic and thus no opportunity for fault.

Agreed; that's like what UNLOCK_PI does.  However, and that's something
I've only thought about recently, it would be good to know which
guarantees the kernel gives in this case; in particular, what happens
(and which error code results) if there is destruction and potential
unmapping etc. of the futex variable concurrently with WAKE_OP or
UNLOCK_PI being in flight.
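
To make the WAKE_OP variant concrete, here is a minimal sketch of an unlock
that lets the kernel do the store.  The helper name, the lock encoding (word
is nonzero while held, 0 when free), and passing the same address as both
uaddr and uaddr2 are assumptions for illustration; this is not glibc code,
and it does not answer the question of which error the kernel reports under
concurrent unmapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper, not glibc code.  Assumes a lock word that is
   nonzero while the lock is held and 0 when it is free.  */
static long
unlock_with_wake_op (int *word)
{
  /* Operation: *word = 0, performed by the kernel as part of the syscall
     rather than by userspace before it.
     Comparison: (old value == 0).  This is false for a held lock, so the
     additional conditional wake on uaddr2 is not triggered.  */
  int op = FUTEX_OP (FUTEX_OP_SET, 0, FUTEX_OP_CMP_EQ, 0);

  return syscall (SYS_futex, word, FUTEX_WAKE_OP | FUTEX_PRIVATE_FLAG,
                  1,                 /* val: wake at most one waiter on uaddr */
                  (unsigned long) 0, /* val2: count for the conditional wake */
                  word,              /* uaddr2: word the operation is applied to */
                  op);               /* val3: encoded operation and comparison */
}
```

Userspace never touches the word after the syscall is issued, which is the
property Rich describes above.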

> > So, to summarize, my categories kind of assume a "typical" use of those
> > operations in glibc.  What I was trying to point out is that we can't
> > abort in the generic futex syscall code when we see EFAULT, because
> > that's wrong for typical uses of FUTEX_WAKE.
> 
> Yes, I agree with this. I'm not clear yet on whether it would be an
> advantage to use FUTEX_WAKE_OP to avoid this, but I think it's
> plausible that it might be.

It should work to avoid the issue, but then wake-up latency will be
higher (because the kernel has to perform the atomic operation, instead
of userspace doing it before the syscall).  Not a good trade-off to make,
I think.
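
For the typical FUTEX_WAKE use, a minimal sketch of a wake path that does
not abort on EFAULT could look like the following.  The function names and
the lock encoding are hypothetical, the atomic builtin is GCC/Clang
specific, and treating every other error as fatal is an assumption of this
sketch, not a statement about what glibc does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake up to NR_WAKE waiters on UADDR.  Returns the
   number of woken waiters, or 0 if the kernel reported EFAULT.  */
static int
futex_wake_tolerant (int *uaddr, int nr_wake)
{
  long ret = syscall (SYS_futex, uaddr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
                      nr_wake, NULL, NULL, 0);
  if (ret >= 0)
    return (int) ret;

  /* EFAULT can be benign here: after the release below, another thread
     may have acquired the lock, destroyed the object, and unmapped the
     memory before this wake reached the kernel.  */
  if (errno == EFAULT)
    return 0;

  /* Assumption of this sketch: anything else is treated as fatal.  */
  abort ();
}

/* Hypothetical unlock: WORD is nonzero while held, 0 when free.  */
static void
unlock_then_wake (int *word)
{
  /* After this release, WORD may be destroyed by another thread at any
     time, so the wake below must not assume it is still mapped.  */
  __atomic_exchange_n (word, 0, __ATOMIC_RELEASE);
  futex_wake_tolerant (word, 1);
}
```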

