This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- From: Rich Felker <dalias at libc dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Ondřej Bílka <neleai at seznam dot cz>, GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Tue, 9 Dec 2014 15:19:07 -0500
- Subject: Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- Authentication-results: sourceware.org; auth=none
- References: <1417804668 dot 22797 dot 108 dot camel at triegel dot csb> <1417805577 dot 25868 dot 4 dot camel at triegel dot csb> <20141206135040 dot GA16212 at domone> <1418038997 dot 25868 dot 34 dot camel at triegel dot csb> <20141208222857 dot GB13499 at domone> <1418120160 dot 25868 dot 132 dot camel at triegel dot csb> <20141209165033 dot GA20499 at domone> <1418149489 dot 25868 dot 230 dot camel at triegel dot csb> <20141209183647 dot GN4574 at brightrain dot aerifal dot cx> <1418151428 dot 25868 dot 238 dot camel at triegel dot csb>
On Tue, Dec 09, 2014 at 07:57:08PM +0100, Torvald Riegel wrote:
> On Tue, 2014-12-09 at 13:36 -0500, Rich Felker wrote:
> > On Tue, Dec 09, 2014 at 07:24:49PM +0100, Torvald Riegel wrote:
> > > > Which does not answer my objection. What extra bugs could this test catch,
> > > > compared to say tst-sem2? If there is no such bug you could just delete
> > > > that file.
> > >
> > > tst-sem2 tests that spurious wake-ups and such don't return anything but
> > > -1 and errno==EINTR, in particular that 0 isn't returned.
> > >
> > > After the patch, tst-sem6 tests that a signal handler that posts a token
> > > will make sem_wait return. It *also* allows for sem_wait to return -1
> > > and errno==EINTR in that case.
> > >
> > > Thus, one possible error that the patched tst-sem6 will catch is if the
> > > sem_wait itself just retries the futex_wait after the futex_wait
> > > returned EINTR, instead of looking for whether there is an available
> > > token.
> >
> > This would not be a bug. Simply retrying the futex_wait would result
> > in EAGAIN, since the futex value would no longer match.
>
> Right. So it would catch a bug that did a futex_wait after loading the
> new value.
I don't follow. If I understand what type of bug you're talking about,
there's no way such a bug would arise accidentally and only affect
EINTR. It would be a break in the whole usage pattern for futex waits
and would affect EAGAIN and non-spurious wakes too unless someone
intentionally special-cased EINTR to do the wrong thing.
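
As an illustration of the usage pattern referred to above, here is a minimal sketch in C. It assumes Linux and a toy semaphore whose futex word is simply the token count. The names toy_sem, toy_futex_wait, toy_futex_wake, toy_sem_trywait, toy_sem_wait and toy_sem_post are hypothetical, and this is not glibc's implementation. The point it tries to show is that the waiter re-checks the word after every return from the futex call, so a real wake, EAGAIN and EINTR all take the same path.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical toy semaphore: the futex word holds the token count. */
typedef struct { atomic_int tokens; } toy_sem;

/* Thin wrappers around the raw futex syscall (Linux only). */
static int toy_futex_wait(atomic_int *addr, int expected)
{
    return (int) syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
                         NULL, NULL, 0);
}

static int toy_futex_wake(atomic_int *addr, int nwaiters)
{
    return (int) syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, nwaiters,
                         NULL, NULL, 0);
}

/* Try to take one token without blocking. */
static int toy_sem_trywait(toy_sem *s)
{
    int v = atomic_load(&s->tokens);
    while (v > 0) {
        /* On failure v is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->tokens, &v, v - 1))
            return 0;
    }
    errno = EAGAIN;
    return -1;
}

/* Block until a token is available. */
static void toy_sem_wait(toy_sem *s)
{
    for (;;) {
        if (toy_sem_trywait(s) == 0)
            return;
        /* Sleep only if the word is still 0. The result is deliberately
           ignored: a wake, EAGAIN (value changed) and EINTR all lead
           back to re-checking the token count. */
        toy_futex_wait(&s->tokens, 0);
    }
}

/* Add a token and wake at most one waiter. */
static void toy_sem_post(toy_sem *s)
{
    atomic_fetch_add(&s->tokens, 1);
    toy_futex_wake(&s->tokens, 1);
}
```

In this sketch, calling toy_futex_wait(&s->tokens, 0) while a token is present fails immediately with EAGAIN, because the word no longer matches the expected value. Under this structure a waiter cannot sleep past a token that a signal handler has just posted.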
> > > Let me try to summarize the background behind this change again:
> > >
> > > 1) Linux documents futex_wait to return EINTR on signals *or* on
> > > spurious wake-ups.
> >
> > No, the man pages document this, and they're wrong. I have not seen
> > any other "Linux documentation" claiming it.
>
> But is there other documentation than the man pages? The sources don't
> really count because that's not a guarantee nor a specification, that's
> the current implementation.
>
> Also, at least one kernel person seems to have confirmed that the
> current manpage is correct: https://lkml.org/lkml/2014/5/15/356
The linked mailing list message does not contain the text EINTR at
all, so I don't see where your claim that it supports the current man
page text about EINTR comes from.
> > > 2) If we treat 1) as true -- which we should, unless we get
> > > confirmation otherwise -- sem_wait must not return EINTR to the caller
> > > anymore if futex_wait returned EINTR.
> > > 3) Because of 2), the behavior that is tested in tst-sem6 before my
> > > patch cannot be implemented anymore.
> >
> > These (2) and (3) are based on false assumptions.
>
> I don't have any evidence to rely on something else. Don't get me
> wrong, if we get confirmation from the kernel that 1) is not true, then
> I'm open to doing something else. But until then, what should we do?
>
> Also, the change is within what's allowed by POSIX IMO, so we're not
> inventing new behavior here.
It's allowed by POSIX, yes, and as I've said before, I agree it's
better behavior -- programming with interrupting signal handlers is a
backwards, bogus practice, and from a hardening standpoint it seems
preferable not to have sem_wait fail at all. I just don't think the
"spurious EINTR is documented" argument should be used to justify such
a change, because accepting spurious EINTR is going to come back to
bite us if there are ever other interfaces (I believe aio_suspend
already is one?) that need to be implemented with futex and need to
report EINTR.
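
For callers, one way to stay correct under either behavior discussed above is to treat EINTR from sem_wait as a reason to retry. The following is a minimal sketch in C using only the POSIX semaphore API. The helper name wait_for_token is hypothetical and is not taken from tst-sem6 or any other glibc test.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: wait for a token, retrying if sem_wait is
   interrupted. Returns 0 once a token has been taken, or -1 with errno
   set for any error other than EINTR. */
static int wait_for_token(sem_t *sem)
{
    int r;
    do {
        r = sem_wait(sem);
    } while (r == -1 && errno == EINTR);
    return r;
}
```

Code written this way does not depend on whether the implementation reports EINTR to the caller or handles the interruption internally.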
Rich