Re: Futex error handling

On Tue, Sep 16, 2014 at 08:12:38PM +0200, Torvald Riegel wrote:
> > > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this
> > > case.  This is due to how futexes work when combined with certain rules
> > > for destruction of the underlying synchronization data structure; see my
> > > description of the mutex destruction issue (but this can happen with
> > > other data structures such as semaphores or cond vars too):
> > >
> > 
> > Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT
> > is reserved for BL/BP (and not NF). I don't see any point in
> > having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm
> > missing something.
> I agree that I was a bit sloppy in the categorization.  You're right
> that depending on how it's used, EFAULT can be just BL/BP.  This applies
> to both FUTEX_WAKE and FUTEX_WAKE_OP, I think; the latter has just a
> finite number of bits, so you can't avoid an ABA issue entirely.  You

I'm not sure what ABA issue you have in mind. The EFAULT case that
arises with FUTEX_WAKE, and that I claim FUTEX_WAKE_OP avoids, is the
one where the atomic operation on the futex int associated with the
wake lets another thread synchronize and determine that it may legally
destroy the object before the actual wake is sent. FUTEX_WAKE_OP can
fully avoid this because the kernel performs the atomic operation
after looking up and locking the futex hash bucket, so there's no
further access to the futex int after the atomic and thus no
opportunity for a fault.
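
A rough sketch of the kind of FUTEX_WAKE_OP usage I mean, for a
hypothetical lock whose word is 0 (unlocked), 1 (locked) or 2 (locked
with waiters). The function name and the state encoding are assumptions
made for illustration, not code from any existing libc, and only the
contended slow path is shown:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock word states: 0 = unlocked, 1 = locked,
   2 = locked with waiters. */

/* Slow-path unlock. The kernel stores 0 into *lock as part of the
   FUTEX_WAKE_OP call, so userspace never touches *lock after the
   store that publishes the unlock. Returns the number of waiters
   woken, or -1 with errno set. */
static long unlock_contended(int *lock)
{
    assert(lock != NULL);

    /* Operation: *lock = 0. The comparison (old value < 0) is never
       true for states 0..2, so the conditional second wake that
       FUTEX_WAKE_OP offers is effectively disabled. */
    int op = FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_LT, 0);

    return syscall(SYS_futex, lock,
                   FUTEX_WAKE_OP | FUTEX_PRIVATE_FLAG,
                   1,     /* wake at most one waiter on uaddr */
                   0UL,   /* val2: count for the conditional wake */
                   lock,  /* uaddr2: the word the operation modifies */
                   op);
}
```

With plain FUTEX_WAKE the store of 0 happens in userspace before the
syscall, and that gap is the window in which another thread can take
the lock, release it, and free or unmap the object.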

> So, to summarize, my categories kind of assume a "typical" use of those
> operations in glibc.  What I was trying to point out is that we can't
> abort in the generic futex syscall code when we see EFAULT, because
> that's wrong for typical uses of FUTEX_WAKE.

Yes, I agree with this. I'm not clear yet on whether it would be an
advantage to use FUTEX_WAKE_OP to avoid this, but I think it's
plausible that it might be.
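
For the generic wake path, a minimal sketch of what "don't abort on
EFAULT" could look like. The name wake_no_abort and the return
convention are hypothetical, and reporting a faulting wake as "zero
waiters woken" is my assumption about a reasonable policy, not
something glibc does today:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: wake up to n waiters on *addr.
   Returns the number of waiters woken, 0 if the wake failed with
   EFAULT, or a negative errno value for any other failure. */
static long wake_no_abort(int *addr, int n)
{
    assert(n >= 0);

    long r = syscall(SYS_futex, addr,
                     FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
                     n, NULL, NULL, 0);
    if (r >= 0)
        return r;

    /* The object may already have been legally destroyed and its
       memory unmapped by the time the wake is issued, so a fault
       here is not necessarily a bug in the caller. */
    if (errno == EFAULT)
        return 0;

    /* Leave the policy for other errors to the caller. */
    return -errno;
}
```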


