This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [RFC] mutex destruction (#13690): problem description and workarounds

From: Rich Felker <dalias at libc dot org>
To: Carlos O'Donell <carlos at redhat dot com>
Cc: Torvald Riegel <triegel at redhat dot com>, GLIBC Devel <libc-alpha at sourceware dot org>
Date: Wed, 3 Dec 2014 09:28:37 -0500
Subject: Re: [RFC] mutex destruction (#13690): problem description and workarounds
Authentication-results: sourceware.org; auth=none
References: <1396621230 dot 10643 dot 7191 dot camel at triegel dot csb> <20141201153802 dot GV29621 at brightrain dot aerifal dot cx> <1417452125 dot 1771 dot 503 dot camel at triegel dot csb> <20141201170542 dot GY29621 at brightrain dot aerifal dot cx> <1417467150 dot 1771 dot 581 dot camel at triegel dot csb> <20141201212223 dot GZ29621 at brightrain dot aerifal dot cx> <1417553118 dot 3930 dot 14 dot camel at triegel dot csb> <20141202210316 dot GI29621 at brightrain dot aerifal dot cx> <547F17E3 dot 9060901 at redhat dot com>

On Wed, Dec 03, 2014 at 09:02:11AM -0500, Carlos O'Donell wrote:
> On 12/02/2014 04:03 PM, Rich Felker wrote:
> > On Tue, Dec 02, 2014 at 09:45:18PM +0100, Torvald Riegel wrote:
> >>> As far as I can tell that's just ***-covering by the man page with no
> >>> basis in what the kernel actually does. If there are actually current
> >>> situations under which it can produce EINTR without a signal, that's
> >>> very bad. For instance sem_wait must return EINTR when actually
> >>> interrupted by a non-SA_RESTART signal, but it's forbidden from
> >>> returning EINTR if that didn't happen.
> >>
> >> EINTR is a 'may fail'.  POSIX states that sem_wait is interruptible, but
> >> I read this as allowing interruption, not requiring it.
> > 
> > Indeed, I had missed this. It seems preferable that an implementation
> > not act on such interruptions, and at least this choice allows an
> > "out" if the kernel has the broken behavior of spurious EINTRs, but I
> > still think it's better for the kernel never to return EINTR except
> > for genuine interruption-by-signal.
> 
> I agree. The conflation of EINTR for non-signal use is IMO going to be
> a design decision we regret in the future.
>  
> >> The signal man pages list sem_wait as having to return EINTR if
> >> interrupted, but what's the point?
> > 
> > This applies to all uses of interrupting signal handlers, which is why
> > I personally think they should be deprecated. However, you can work
> > around the issue by repeating the signal with exponential backoff
> > until the thread sending the signal can determine that the target
> > thread has acted upon the interruption.
> 
> Or avoid relying on EINTR and cancel the thread?

Unfortunately, cancellation is inappropriate for some situations.
Consider a program that wants to cancel an operation being performed
by a thread, but keep the thread alive for a future operation. If the
thread terminates, there's no guarantee that a new one can be created,
and no way to recover from being unable to recreate it. This kind of
thread reuse is essential for applications that need to be
robust/fail-safe.

I have in mind an extension to the cancellation API that would address
this issue: a cancellation mode that causes the operation detecting
cancellation to return with ECANCELED rather than acting on
cancellation. I still have some details to work out on how it should
work in certain corner cases, but I'd like to implement it
experimentally for musl, and possibly propose it in the future for
glibc and for standardization in POSIX at some point, if there's
interest.

But in the meantime, the kind of ugly EINTR hack described above
(resending the signal with backoff) seems to be the only way to stop
blocked operations without terminating the thread.
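
For concreteness, here is a minimal sketch of that resend-with-backoff
workaround. It assumes a POSIX system where sem_wait fails with EINTR
when a handler installed without SA_RESTART runs. All names in it
(wake_handler, interrupt_until_acked, run_interrupt_demo, the flag
variables) are hypothetical and exist only for illustration; it is not
taken from glibc or musl.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

static sem_t work_sem;              /* the worker blocks on this */
static atomic_int cancel_requested; /* set by the canceller */
static atomic_int cancel_acked;     /* set by the worker once it noticed */
static atomic_int aborted_ops;      /* operations abandoned so far */
static atomic_int shutting_down;    /* demo only: lets the worker exit */

/* Does nothing; it exists only so that blocking calls fail with EINTR. */
static void wake_handler(int sig) { (void)sig; }

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        if (sem_wait(&work_sem) == 0) {
            if (atomic_load(&shutting_down))
                return NULL;
            continue; /* real work would go here */
        }
        if (errno == EINTR && atomic_load(&cancel_requested)) {
            /* Abandon the operation but keep the thread for reuse. */
            atomic_store(&cancel_requested, 0);
            atomic_fetch_add(&aborted_ops, 1);
            atomic_store(&cancel_acked, 1);
        }
        /* Any other EINTR (e.g. a late resend) just retries the wait. */
    }
}

/* Resend the signal with exponential backoff until the target has
   acknowledged. A single signal is not enough: it may be delivered
   before the target has entered the blocking call.
   Returns the number of signals sent, or -1 on error. */
static int interrupt_until_acked(pthread_t target)
{
    struct timespec delay = { 0, 1000000 }; /* start at 1 ms */
    int sent = 0;

    atomic_store(&cancel_acked, 0);
    atomic_store(&cancel_requested, 1);
    while (!atomic_load(&cancel_acked)) {
        if (pthread_kill(target, SIGUSR1) != 0)
            return -1;
        sent++;
        nanosleep(&delay, NULL);
        if (delay.tv_nsec < 256000000)
            delay.tv_nsec *= 2; /* back off, capped at 256 ms */
    }
    return sent;
}

/* Returns 0 if the blocked operation was aborted and the worker
   thread survived to be shut down normally afterwards. */
int run_interrupt_demo(void)
{
    struct sigaction sa;
    pthread_t t;
    int sent;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* deliberately no SA_RESTART */
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    if (sem_init(&work_sem, 0, 0) != 0) return -1;
    if (pthread_create(&t, NULL, worker, NULL) != 0) return -1;

    sent = interrupt_until_acked(t);
    assert(sent != 0); /* the loop always sends at least once */

    /* The worker is still alive here; tell it to exit cleanly. */
    atomic_store(&shutting_down, 1);
    sem_post(&work_sem);
    pthread_join(t, NULL);
    sem_destroy(&work_sem);
    return (sent > 0 && atomic_load(&aborted_ops) == 1) ? 0 : -1;
}
```

The worker clears cancel_requested before acknowledging, so a late
resend only causes a harmless retry of sem_wait. The handler is
process-wide, which is part of what makes this approach ugly: every
other blocking call in the program must also tolerate EINTR.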

Rich

