This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] mutex destruction (#13690): problem description and workarounds
- From: Rich Felker <dalias at libc dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Carlos O'Donell <carlos at redhat dot com>, GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Thu, 4 Dec 2014 13:57:26 -0500
- Subject: Re: [RFC] mutex destruction (#13690): problem description and workarounds
- References: <20141201170542 dot GY29621 at brightrain dot aerifal dot cx> <1417467150 dot 1771 dot 581 dot camel at triegel dot csb> <20141201212223 dot GZ29621 at brightrain dot aerifal dot cx> <1417553118 dot 3930 dot 14 dot camel at triegel dot csb> <20141202210316 dot GI29621 at brightrain dot aerifal dot cx> <547F17E3 dot 9060901 at redhat dot com> <1417703533 dot 22797 dot 16 dot camel at triegel dot csb> <5480807D dot 3040309 at redhat dot com> <20141204173402 dot GQ4574 at brightrain dot aerifal dot cx> <1417719246 dot 22797 dot 34 dot camel at triegel dot csb>
On Thu, Dec 04, 2014 at 07:54:06PM +0100, Torvald Riegel wrote:
> On Thu, 2014-12-04 at 12:34 -0500, Rich Felker wrote:
> > On Thu, Dec 04, 2014 at 10:40:45AM -0500, Carlos O'Donell wrote:
> > > On 12/04/2014 09:32 AM, Torvald Riegel wrote:
> > > >> I agree. The conflation of EINTR for non-signal use is IMO going to be
> > > >> a design decision we regret in the future.
> > > >
> > > > I'd rather see the fault in POSIX semantics, and it not making it clear
> > > > that signal handlers should do sem_post if they need to reliably
> > > > interrupt a sem_wait.
> > >
> > > If we are going to disallow a signal to interrupt sem_wait we should just
> > > change the semantics, version the interface, and document that glibc no
> > > longer ever returns EINTR for sem_wait, and that the right way to interrupt
> > > it is with a signal handler that does sem_post.
> > >
> > > This prevents users from complaining that what they observe with strace
> > > and gdb is a signal arriving after the sem_wait, but not interrupting it.
> > > We can claim the user is looking under the hood, but that's what they do,
> > > and if we can possibly avoid those arguments we win. We know we're right,
> > > we know we don't want to allow timing to imply ordering, but we need time
> > > to educate developers (and that looking under the hood leads to non-obvious
> > > observations).
> > >
> > > I really wish the kernel returned some other error code for woken up
> > > vs. signal. Is it not possible to get the kernel to distinguish these
> > > two? Am I forgetting something?
> >
> > It *DOES*. It returns 0 for woken-up, and EINTR for
> > interrupted-by-signal.
>
> No. See man 2 futex, return values of FUTEX_WAIT:
> "Signals (see signal(7)) or other spurious wakeups cause FUTEX_WAIT
> to fail with the error EINTR."
>
> The LKML message that expanded on other error codes states that existing
> wording for FUTEX_WAIT "seems ok": https://lkml.org/lkml/2014/5/15/356
>
> So, EINTR is currently documented as happening *either* due to a signal
> or spuriously.

This documentation is incorrect. There is currently no cause of EINTR
other than signals, nor has there been in the past. I'll ask Michael
to fix this.

Rich
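
For readers who want to observe the two return paths discussed above, here
is a minimal Linux-only sketch, not code from this thread. The helper names
(futex, wait_interrupted_by_signal, wait_woken_by_peer) are hypothetical,
and the 100 ms timer and 5 s safety timeout are arbitrary choices. It only
shows that a FUTEX_WAKE yields 0 and a handled signal yields EINTR; it
cannot show that no other cause of EINTR exists.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Thin wrapper (hypothetical name): glibc provides no futex() function,
   so the system call is made through syscall(2). */
static long futex(uint32_t *uaddr, int op, uint32_t val,
                  const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, NULL, 0);
}

static void on_alarm(int sig) { (void)sig; }   /* handler does nothing */

/* Block in FUTEX_WAIT until a SIGALRM handler runs.
   Returns 0 if the wait returned 0, otherwise the errno observed. */
int wait_interrupted_by_signal(void)
{
    uint32_t word = 0;
    struct sigaction sa = {0};
    sa.sa_handler = on_alarm;                  /* SA_RESTART deliberately unset */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    /* One-shot timer: deliver SIGALRM after 100 ms. */
    struct itimerval it = { .it_value = { .tv_sec = 0, .tv_usec = 100000 } };
    setitimer(ITIMER_REAL, &it, NULL);

    struct timespec limit = { .tv_sec = 5 };   /* safety net against hanging */
    long r = futex(&word, FUTEX_WAIT, 0, &limit);
    return r == 0 ? 0 : errno;
}

/* Block in FUTEX_WAIT until a forked child issues FUTEX_WAKE.
   Returns 0 if the wait returned 0, otherwise the errno observed. */
int wait_woken_by_peer(void)
{
    /* Shared anonymous mapping so parent and child see the same futex word. */
    uint32_t *word = mmap(NULL, sizeof *word, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(word != MAP_FAILED);
    *word = 0;

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        /* Child: retry while FUTEX_WAKE reports that it woke no waiter. */
        while (futex(word, FUTEX_WAKE, 1, NULL) == 0)
            usleep(1000);
        _exit(0);
    }

    struct timespec limit = { .tv_sec = 5 };   /* safety net against hanging */
    long r = futex(word, FUTEX_WAIT, 0, &limit);
    int err = (r == 0) ? 0 : errno;

    if (err != 0)
        kill(pid, SIGKILL);                    /* child may still be looping */
    waitpid(pid, NULL, 0);
    munmap(word, sizeof *word);
    return err;
}
```

Calling wait_interrupted_by_signal() is expected to report EINTR and
wait_woken_by_peer() is expected to report 0, which matches the behaviour
described in the reply above.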