This is the mail archive of the
mailing list for the glibc project.
Re: Make sem_timedwait use FUTEX_CLOCK_REALTIME (bug 18138)
- From: Torvald Riegel <triegel at redhat dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org, carlos at redhat dot com
- Date: Wed, 18 Mar 2015 23:34:03 +0100
- Subject: Re: Make sem_timedwait use FUTEX_CLOCK_REALTIME (bug 18138)
- References: <alpine dot DEB dot 2 dot 10 dot 1503180027220 dot 9536 at digraph dot polyomino dot org dot uk> <1426677277 dot 21475 dot 27 dot camel at triegel dot csb> <alpine dot DEB dot 2 dot 10 dot 1503181657260 dot 6097 at digraph dot polyomino dot org dot uk>
On Wed, 2015-03-18 at 17:03 +0000, Joseph Myers wrote:
> On Wed, 18 Mar 2015, Torvald Riegel wrote:
> > On Wed, 2015-03-18 at 00:28 +0000, Joseph Myers wrote:
> > > If a relative
> > > timeout is passed to the kernel, it is interpreted according to the
> > > CLOCK_MONOTONIC clock, and so fails to meet that POSIX requirement in
> > > the event of clock changes.
> > Thanks for realizing the clock change issue, which I had overlooked.
> > (I'd say that strictly speaking, this is "only" a QoI issue though,
> > because we give no hard guarantees around timing.)
> I think you could produce a test that demonstrates an unambiguous bug, not
> depending (for the assertion that the observed results are buggy) on any
> particular latency requirements not in POSIX. Say, one thread performs
> sem_timedwait, while the other sets the clock forward past the timeout
> specified in the sem_timedwait call (required to cause sem_timedwait to
> terminate with ETIMEDOUT), then posts the semaphore. If sem_timedwait
> terminates successfully, that means that the requirement "If the absolute
> time requested at the invocation of such a time service is before the new
> value of the clock, the time service shall expire immediately as if the
> clock had reached the requested time normally." was not met. With the
> existing code using relative timeouts, such a test would in practice have
> sem_timedwait succeed, while POSIX does not permit it to succeed.
Hmm. I don't quite agree, or I don't understand your example.
I agree that when we do a timed wait using a relative timeout for the
futex, we do not time out immediately when the clock is set forward. A
similar delay could also be observed in executions where we do time out
immediately, but sem_timedwait then needs a very long time to actually
return to the caller. (Which would be a QoI issue.)
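To make the relative-timeout case concrete, here is a minimal sketch of
converting an absolute CLOCK_REALTIME deadline into a relative timeout.
This is not the glibc code; the helper name abs_to_rel_timeout is
hypothetical and only illustrates the idea. Once the difference has been
computed and handed to a relative futex wait, a later clock_settime no
longer influences how long the wait lasts.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper (not the glibc implementation): compute
   *rel = *abstime - *now.  Returns false if the deadline is already in
   the past, in which case the caller would report ETIMEDOUT.  */
static bool
abs_to_rel_timeout (const struct timespec *abstime,
                    const struct timespec *now,
                    struct timespec *rel)
{
  /* Both inputs are assumed to be normalized timespec values.  */
  assert (abstime->tv_nsec >= 0 && abstime->tv_nsec < 1000000000L);
  assert (now->tv_nsec >= 0 && now->tv_nsec < 1000000000L);

  time_t sec = abstime->tv_sec - now->tv_sec;
  long nsec = abstime->tv_nsec - now->tv_nsec;
  if (nsec < 0)
    {
      /* Borrow one second so that tv_nsec stays in [0, 1e9).  */
      nsec += 1000000000L;
      --sec;
    }
  if (sec < 0)
    return false;

  rel->tv_sec = sec;
  rel->tv_nsec = nsec;
  return true;
}
```

The value of "now" is sampled exactly once, so any clock change that
happens after the sample is invisible to the resulting wait.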
In your example, posting the semaphore after setting the time would
allow a window in which the post could cause normal completion of
sem_timedwait (i.e., while it is still waiting erroneously). However,
first, one would technically still need a guarantee about when this
setting of the time happened, perhaps via calibration/comparison with
other clocks: it must not happen a very long time after it was intended,
because otherwise it could have taken place right before or after the end
of sem_timedwait's very long wait (in wall-clock time).
Nonetheless, such behavior is still possible with your patch: if a
spurious wake-up happens on the futex right before the other thread sets
the clock, then sem_timedwait will check whether a token is available,
and that check might happen right after the other thread posted to the
semaphore.
That could be avoided by looking at the current time after each futex
wait operation and returning a timeout error in such a case -- but I'm
not sure that the costs of this are worth it.
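A rough sketch of what checking the time after each futex wait could look
like, for illustration only. This is a toy counting semaphore, not the
glibc sem_timedwait; the names toy_sem_timedwait and deadline_passed are
hypothetical, and the code assumes a 64-bit Linux system where the raw
futex syscall accepts a struct timespec.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if CLOCK_REALTIME has reached *abstime.  */
static int
deadline_passed (const struct timespec *abstime)
{
  struct timespec now;
  clock_gettime (CLOCK_REALTIME, &now);
  return now.tv_sec > abstime->tv_sec
         || (now.tv_sec == abstime->tv_sec
             && now.tv_nsec >= abstime->tv_nsec);
}

/* Toy semaphore wait (not the glibc code).  Returns 0 after taking a
   token, or -1 with errno set to ETIMEDOUT.  */
static int
toy_sem_timedwait (_Atomic uint32_t *value, const struct timespec *abstime)
{
  assert (abstime != NULL);
  for (;;)
    {
      /* Try to take a token; on CAS failure v is reloaded.  */
      uint32_t v = atomic_load (value);
      while (v > 0)
        if (atomic_compare_exchange_weak (value, &v, v - 1))
          return 0;

      /* No token: block while *value == 0.  FUTEX_WAIT_BITSET takes an
         absolute timeout, here measured against CLOCK_REALTIME.  */
      long r = syscall (SYS_futex, value,
                        FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME
                        | FUTEX_PRIVATE_FLAG,
                        0, abstime, NULL, FUTEX_BITSET_MATCH_ANY);
      if (r == -1 && errno == ETIMEDOUT)
        return -1;

      /* Woken (possibly spuriously), value changed, or interrupted:
         look at the clock before looking for a token again.  */
      if (deadline_passed (abstime))
        {
          errno = ETIMEDOUT;
          return -1;
        }
    }
}
```

The extra clock_gettime call on every wake-up is the cost in question; it
only pays off in the narrow window where a wake-up, a clock change, and a
post all happen close together.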
(Nonetheless, I agree that we want to avoid very long waiting times that
would be surprising to users; we certainly don't want to give timing
guarantees for small time intervals (say, less than a second), but
waiting for one day or such would be bad.)