This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Futex error handling


On Thu, Oct 23, 2014 at 03:34:46PM -0400, Carlos O'Donell wrote:
> On 10/20/2014 12:14 PM, Torvald Riegel wrote:
> > On Tue, 2014-09-16 at 17:36 +0200, Torvald Riegel wrote:
> >> We got complains from the kernel side that glibc wouldn't react properly
> >> to futex errors being returned.  Thus, I'm looking at what we'd need to
> >> actually improve.  I'm using this here as a documentation for futex
> >> error codes: https://lkml.org/lkml/2014/5/15/356
> >>
> >> Generally, we have three categories of faults (ie, the cause for an
> >> error/failure):
> >> * Bug in glibc ("BL")
> >> * Bug in the client program ("BP")
> >> * Failures that are neither a bug in glibc nor the program ("F")
> >>
> >> Also, there are cases where it's not a "real" failure, but just
> >> something that is expected behavior that needs to be handled ("NF").
> >>
> >> I'm not aware of a general policy about whether glibc should abort or
> >> assert (ie, abort only with assertion checks enabled) when the fault is
> >> in the BL or BP categories.  I'd say we don't, because there's no way to
> >> handle it anyway, and other things will likely go wrong; but I don't
> >> have a strong opinion.  Thoughts?
> >>
> >> For every futex op, here's a list of how I'd categorize the possible
> >> error codes (I'm ignoring ENOSYS, which is NF when feature testing (or
> >> BL)):
> >>
> >> FUTEX_WAIT:
> >> * EFAULT is either BL or BP.  Nothing we can do.  Should have failed
> >> earlier when we accessed the futex variable.
> 
> Abort.

Yes.

> >> * EINVAL (alignment and timeout normalization) is BL/BP.
> 
> Abort.

For alignment, yes. For timeout normalization, no, this is a specified
error that must be passed back to the application. Of course if glibc
has already checked it in userspace, then further EINVAL from the
kernel should not be possible and aborting is in principle valid.

> >> * EWOULDBLOCK, ETIMEDOUT are NF.
> 
> Don't know.
> 
> WARNING: Aborting on EWOULDBLOCK is dangerous since we might just
> wish to deadlock the single thread, so a developer can attach with
> the debugger to inspect thread state while other threads keep running.

NF is never a case you would abort on. It means a condition that
occurs regularly in a valid program.

> >> FUTEX_WAKE, FUTEX_WAKE_OP:
> >> * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this
> >> case.  This is due to how futexes work when combined with certain rules
> >> for destruction of the underlying synchronization data structure; see my
> >> description of the mutex destruction issue (but this can happen with
> >> other data structures such as semaphores or cond vars too):
> >> https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html
> 
> Don't know.

This is complex, but unless glibc is going to take the effort (and
likely, cost) to prevent this mostly-harmless race, this case for
FUTEX_WAKE must be considered as NF. For FUTEX_WAKE_OP, EFAULT is
BL/BP if the fault is on the addr OP is applied to.

> >> * EINVAL (futex alignment) is BL/BP.
> 
> Abort.

Yes.

> >> * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or*
> >> NF.  The latter is caused by the mutex destruction issue, only that a
> >> pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory
> >> location but one which has been reused for a PI futex.  Thus, we must
> >> not abort or assert in this case.
> 
> Don't know.

I don't know much of anything about PI implementation details, so same
here -- don't know.

> >> FUTEX_REQUEUE:
> >> * Like FUTEX_WAKE, except that it's not safe to use concurrently with
> >> possible destruction / reuse of the futex memory (because requeueing to
> >> a futex that's unrelated to the new futex located in reused memory is
> >> bad).
> 
> Are there any error codes for FUTEX_REQUEUE or are you saying it's
> identical to FUTEX_WAKE?

I think some of the NF cases with FUTEX_WAKE are BL/BP for FUTEX_REQUEUE.

> >> FUTEX_REQUEUE_CMP:
> >> * Like FUTEX_REQUEUE.  EAGAIN is NF.
> 
> Is EGAIN the only one to consider?
> 
> Yes, EAGAIN needs some kind of special handling, but I don't know what.

It's the only additional case beyond FUTEX_REQUEUE.

> >> FUTEX_WAKE_OP:
> >> * Haven't looked at this yet.  Only used in condvars, and might not be
> >> necessary for a condvar that's not based on a condvar-internal lock.
> >>
> >> FUTEX_WAIT_BITSET / FUTEX_WAKE_BITSET:
> >> * Like FUTEX_WAIT / FUTEX_WAKE.  The additional EINVAL is BL.
> 
> Abort.
> 
> >> FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI:
> >> * EFAULT is BL/BP.
> 
> Abort.

I think these are right.

> >> * ENOMEM is F.  We need to handle this.
> 
> Yes, pass it along.

Passing it along is probably not desirable; applications are not going
to be prepared to deal with ENOMEM from pthread_mutex_lock. But I'm
not sure what would be preferable.

> >> * EINVAL, EPERM, ESRCH are BL/BP.
> 
> Abort.

Are you sure ESRCH can't happen as a race with robust mutex owner
death?

> >> * EAGAIN and ETIMEDOUT are NF.
> 
> Don't know.

NF is NF.

> >> * EDEADLOCK is BP (or BL).
> 
> See WARNING above.

I don't see how it's related.

> >> * EOWNERDIED is F.
> 
> Don't know.

I would think it's NF..?

> >> FUTEX_UNLOCK_PI:
> >> (* I guess this can return EFAULT too, which is BL/BP.)
> >> * EINVAL and EPERM are BL/BP.  I don't think there's a mutex destruction
> >> issue with PI locks because the kernel takes care of both resetting the
> >> value of the futex var and waking up threads; it should do so in a way
> >> that won't access reused memory.  I guess we should check that though...
> 
> Abort.

I think this is correct but it should be checked.

> >> FUTEX_WAIT_REQUEUE_PI:
> >> * EFAULT and EINVAL are BL/BP.
> 
> Abort.
> 
> >> * EWOULDBLOCK and ETIMEDOUT are NF.
> 
> Don't know.

I think these look right.

> >> * EOWNERDIED is F.
> 
> Don't know. How did we get here?

Normal robust mutex usage.

> >> FUTEX_CMP_REQUEUE_PI is like FUTEX_CMP_REQUEUE except:
> >> * ENOMEM is F.
> 
> Pass it along.

How would this be passed along? Requeue is used internally in cond
vars. If requeue fails, it needs to be replaced with a simple wake
where the target thread will be responsible for attempting to wait on
the mutex (and possibly failing there, in which case it takes
responsibility for what to do).

> >> * EPERM and ESRCH are BL/BP.
> 
> Abort.

Not sure. Probably right.

> >> * EDEADLOCK is BP (or BL).
> 
> See WARNING above.

Again, unclear on this.

> >> I think the next steps to improve this should be:
> >> 1) Getting consensus on how we want to handle BL and BP in general.
> > 
> > Ping.  I think that's a fairly generic issue (in the sense of likely
> > being a question for other things than just mutexes too), so I'd like
> > some input on the direction.
> > 
> > Carlos, Joseph, Roland:  Do you have any comments?
> 
> Roland already responded, but I'll reiterate with wiki updates:
> 
> (1) Bugs in the GNU C library should fail early and visibly, balanced by
> runtime detection cost.
> 
> (2) Bugs in the user program should fail early and visibly, balanced by
> runtime detection cost in the GNU C library.
> 
> (3) Unknown error codes from the kernel should cause immediate failure.
> 
> https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling

I agree with these principles.

> >> 2) Applying the outcome of that to the list above and getting consensus
> >> on the result.
> >> 3) For each case of F, find the best way to report it to the caller
> >> (e.g., error code from the pthreads function, abort, ...).
> >> 4) Change each use of the futexes accordingly, one at a time.
> >>
> >> I've asked Michael Kerrisk for the state of the futex error docs, but
> >> haven't gotten a reply yet.  (Last time I checked, the new input from
> >> the email I referred to above wasn't part of the futex docs yet.)
> > 
> > Michael has replied, and it's on his list of things to do.
> 
> Hopefully that helps, maybe it doesn't, I answered the easy ones, and
> perhaps the first step is to fix the easy ones first :-)

:)

Rich


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]