This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- From: Torvald Riegel <triegel at redhat dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: OndÅej BÃlka <neleai at seznam dot cz>, GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Tue, 09 Dec 2014 19:57:08 +0100
- Subject: Re: [PATCH 1/3] Use reliable sem_wait interruption in nptl/tst-sem6.
- Authentication-results: sourceware.org; auth=none
- References: <1417804668 dot 22797 dot 108 dot camel at triegel dot csb> <1417805577 dot 25868 dot 4 dot camel at triegel dot csb> <20141206135040 dot GA16212 at domone> <1418038997 dot 25868 dot 34 dot camel at triegel dot csb> <20141208222857 dot GB13499 at domone> <1418120160 dot 25868 dot 132 dot camel at triegel dot csb> <20141209165033 dot GA20499 at domone> <1418149489 dot 25868 dot 230 dot camel at triegel dot csb> <20141209183647 dot GN4574 at brightrain dot aerifal dot cx>
On Tue, 2014-12-09 at 13:36 -0500, Rich Felker wrote:
> On Tue, Dec 09, 2014 at 07:24:49PM +0100, Torvald Riegel wrote:
> > > Which does not answer my objection. What extra bugs could this test catch,
> > > compared to say tst-sem2? If there is no such bug you could just delete
> > > that file.
> > tst-sem2 tests that spurious wake-ups and such don't return anything but
> > -1 and errno==EINTR, in particular that 0 isn't returned.
> > After the patch, tst-sem6 tests that a signal handler that posts a token
> > will make sem_wait return. It *also* allows for sem_wait to return -1
> > and errno==EINTR in that case.
> > Thus, one possible error that the patched tst-sem6 will catch is if the
> > sem_wait itself just retries the futex_wait after the futex_wait
> > returned EINTR, instead of looking for whether there is an available
> > token.
> This would not be a bug. Simply retrying the futex_wait would result
> in EAGAIN, since the futex value would no longer match.
Right. So it would catch a bug that did a futex_wait after loading the
> > > > Also, please review the actual background for this change. See patch
> > > > 3/3 for that. The change to the test is not ideal, but please see the
> > > > trade-offs.
> > >
> > > The rarity of problem bugged me, as to trigger that behaviour one would
> > > need run also highly parallel program so it looked unlikely that anybody
> > > would report that. As it couldn't detect some bugs it previously could
> > > its hard to see what is better.
> > Let me try to summarize the background behind this change again:
> > 1) Linux documents futex_wait to return EINTR on signals *or* on
> > spurious wake-ups.
> No, the man pages document this, and they're wrong. I have not seen
> any other "Linux documentation" claiming it.
But is there other documentation than the man pages? The sources don't
really count because that's not a guarantee nor a specification, that's
the current implementation.
Also, at least one kernel person seems to have confirmed that the
current manpage is correct: https://lkml.org/lkml/2014/5/15/356
So in absence of any other documentation, I'll follow what we have and
for which we have at least some documentation.
> > 2) If we treat 1) as true -- which we should to unless getting
> > confirmation otherwise -- sem_wait must not return EINTR to the caller
> > anymore if futex_wait returned EINTR.
> > 3) Because of 2), the behavior that is tested in tst-sem6 before my
> > patch cannot be implemented anymore.
> These (2) and (3) are based on false assumptions.
I don't have any evidence to rely on something else. Don't get me
wrong, if we get confirmation from the kernel that 1) is not true, then
I'm open to doing something else. But until then, what should we do?
Also, the change is within what's allowed by POSIX IMO, so we're not
inventing new behavior here.
> I agree this is a
> positive change (EINTR is generally undesirable) and it may be
> necessary if you want to support old kernels where the futex syscall
> could fail with EINTR even when the signal handler was SA_RESTART
> type, but I don't think glibc supports those kernels.
> > Thus, we need to do *something*. I proposed this patch, and variations.
> > If you don't see other alternatives, then I guess we'll have to pick
> > from the options I gave.
> > If you disagree with 2), then please comment on Patch 3/3, because
> > that's where this belongs.
> The most important thing to do is get clarification from the kernel
> side that the man page is wrong and that there is no intent to
> overload EINTR or allow incorrect/spurious EINTR from futex.
Once we'll get this, I'll take care to adjust sem_wait accordingly, and
depending on the adjustment, might adapt tst-sem6 as well.
BTW, have you already asked on LKML about this?