This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- From: Torvald Riegel <triegel at redhat dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: OndÅej BÃlka <neleai at seznam dot cz>, GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Wed, 10 Dec 2014 10:34:30 +0100
- Subject: Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- Authentication-results: sourceware.org; auth=none
- References: <1417804668 dot 22797 dot 108 dot camel at triegel dot csb> <1417805577 dot 25868 dot 4 dot camel at triegel dot csb> <20141206135040 dot GA16212 at domone> <1418038997 dot 25868 dot 34 dot camel at triegel dot csb> <20141208222857 dot GB13499 at domone> <1418120160 dot 25868 dot 132 dot camel at triegel dot csb> <20141209165033 dot GA20499 at domone> <1418149489 dot 25868 dot 230 dot camel at triegel dot csb> <20141209183647 dot GN4574 at brightrain dot aerifal dot cx> <1418151428 dot 25868 dot 238 dot camel at triegel dot csb> <20141209201907 dot GP4574 at brightrain dot aerifal dot cx>
On Tue, 2014-12-09 at 15:19 -0500, Rich Felker wrote:
> On Tue, Dec 09, 2014 at 07:57:08PM +0100, Torvald Riegel wrote:
> > On Tue, 2014-12-09 at 13:36 -0500, Rich Felker wrote:
> > > On Tue, Dec 09, 2014 at 07:24:49PM +0100, Torvald Riegel wrote:
> > > > > Which does not answer my objection. What extra bugs could this test catch,
> > > > > compared to say tst-sem2? If there is no such bug you could just delete
> > > > > that file.
> > > >
> > > > tst-sem2 tests that spurious wake-ups and such don't return anything but
> > > > -1 and errno==EINTR, in particular that 0 isn't returned.
> > > >
> > > > After the patch, tst-sem6 tests that a signal handler that posts a token
> > > > will make sem_wait return. It *also* allows for sem_wait to return -1
> > > > and errno==EINTR in that case.
> > > >
> > > > Thus, one possible error that the patched tst-sem6 will catch is if the
> > > > sem_wait itself just retries the futex_wait after the futex_wait
> > > > returned EINTR, instead of looking for whether there is an available
> > > > token.
> > >
> > > This would not be a bug. Simply retrying the futex_wait would result
> > > in EAGAIN, since the futex value would no longer match.
> > Right. So it would catch a bug that did a futex_wait after loading the
> > new value.
> I don't follow. If I understand what type of bug you're talking about,
> there's no way such a bug would arise accidentally and only affect
> EINTR. It would be a break in the whole usage pattern for futex waits
> and would affect EAGAIN and non-spurious wakes too unless someone
> intentionally special-cased EINTR to do the wrong thing.
> > > > Let me try to summarize the background behind this change again:
> > > >
> > > > 1) Linux documents futex_wait to return EINTR on signals *or* on
> > > > spurious wake-ups.
> > >
> > > No, the man pages document this, and they're wrong. I have not seen
> > > any other "Linux documentation" claiming it.
> > But is there other documentation than the man pages? The sources don't
> > really count because that's not a guarantee nor a specification, that's
> > the current implementation.
> > Also, at least one kernel person seems to have confirmed that the
> > current manpage is correct: https://lkml.org/lkml/2014/5/15/356
> The linked mailing list message does not contain the text EINTR at
> all, so I don't see where your claim that it supports the current man
> page text about EINTR comes from.
The mail states:
< Existing blurb seems ok >
... and then it goes on providing revised documentation for some of the
previously documented error codes and adds documentation for a few
additional ones. It doesn't mention 0 or EINTR otherwise, so I read
this as meaning that the docs for 0 and EINTR are correct.
> > > > 2) If we treat 1) as true -- which we should to unless getting
> > > > confirmation otherwise -- sem_wait must not return EINTR to the caller
> > > > anymore if futex_wait returned EINTR.
> > > > 3) Because of 2), the behavior that is tested in tst-sem6 before my
> > > > patch cannot be implemented anymore.
> > >
> > > These (2) and (3) are based on false assumptions.
> > I don't have any evidence to rely on something else. Don't get me
> > wrong, if we get confirmation from the kernel that 1) is not true, then
> > I'm open to doing something else. But until then, what should we do?
> > Also, the change is within what's allowed by POSIX IMO, so we're not
> > inventing new behavior here.
> It's allowed by POSIX, yes, and as I've said before, I agree it's
> better behavior -- programming with interrupting signal handlers is a
> backwards, bogus practice, and from a hardening standpoint it seems
> preferable not to have sem_wait fail at all.
> I just don't think the
> "spurious EINTR is documented" argument should be used to justify such
> a change, because accepting spurious EINTR is going to come back to
> bite us if there are ever other interfaces (I believe aio_suspend
> already is one?) that need to be implemented with futex and need to
> report EINTR.
I don't want to argue against that. But we need to separate the issues
here. This topic is something that the kernel has to decide, so we can
only discuss this with them. Whether or not glibc sem_wait follows
what's currently documented doesn't change that other decision at all.
Thus, let's keep those separate topics separate. We can try to clarify
with the kernel folks re EINTR before we agree on the new semaphore
implementation, but I wouldn't like to hold up the new semaphore just
because we can't agree with the kernel soon enough.