Problems with the new pthread clock implementations

Mike Crowe mac@mcrowe.com
Sat Nov 21 17:54:04 GMT 2020


Hi Michael,

On Saturday 21 November 2020 at 07:59:04 +0100, Michael Kerrisk (man-pages) wrote:
> I've been taking a closer look at the new pthread*clock*() APIs:
> pthread_clockjoin_np()
> pthread_cond_clockwait()
> pthread_mutex_clocklock()
> pthread_rwlock_clockrdlock()
> pthread_rwlock_clockwrlock()
> sem_clockwait()
> 
> I've noticed some oddities, and at least a couple of bugs.
> 
> First off, I just note that there's a surprisingly wide variation in
> the low-level futex calls being used by these APIs when implementing
> CLOCK_REALTIME support:
> 
> pthread_rwlock_clockrdlock()
> pthread_rwlock_clockwrlock()
> sem_clockwait()
> pthread_cond_clockwait()
>     futex(addr,
>         FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3,
>         {abstimespec}, FUTEX_BITSET_MATCH_ANY)
>     (This implementation seems to be okay)
> 
> pthread_clockjoin_np()
>     futex(addr, FUTEX_WAIT, 48711, {reltimespec})
>     (This is buggy; see below.)
> 
> pthread_mutex_clocklock()
>     futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec})
>     (There's bugs and strangeness here; see below.)

Yes, I found it very confusing when I started adding the new
pthread*clock*() functions, and it still takes me a while to find the right
functions when I look now. I believe that Adhemerval was talking about
simplifying some of this.

> === Bugs ===
> 
> pthread_clockjoin_np():
> As already recognized in another mail thread [1], this API accepts any
> kind of clockid, even though it doesn't support most of them.

Well, it sort of does support them, at least as well as many other
implementations of such functions do - it just calculates a relative
timeout using the supplied clock and then uses that. But, ...

> A further bug is that even if CLOCK_REALTIME is specified,
> pthread_clockjoin_np() sleeps against the CLOCK_MONOTONIC clock.
> (Currently it does this for *all* clockid values.) The problem here is
> that the FUTEX_WAIT operation sleeps against the CLOCK_MONOTONIC clock
> by default. At the least, the FUTEX_CLOCK_REALTIME is required for
> this case. Alternatively, an implementation using
> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME (like the first four
> functions listed above) might be appropriate.

...this is one downside of that. That bug was inherited from the
existing pthread_timedjoin_np implementation.

I was planning to write a patch that simply limits the supported clocks, but
I'll first have a go at fixing the bug you describe properly instead. That fix
will limit the implementation to CLOCK_REALTIME and CLOCK_MONOTONIC anyway.
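
As a rough illustration of the absolute-wait alternative, the sketch below
issues a FUTEX_WAIT_BITSET with an absolute deadline. It is not the eventual
glibc patch: futex_abstimed_wait is a hypothetical wrapper around the raw
system call, and it assumes a 64-bit Linux target where SYS_futex accepts a
struct timespec directly.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper (not a glibc internal): block while *addr == expected,
   until woken or until the absolute time `abstime` on `clock` is reached.
   Only CLOCK_REALTIME and CLOCK_MONOTONIC are accepted.
   Returns 0 when woken, otherwise an errno value such as ETIMEDOUT,
   EAGAIN (value mismatch) or EINVAL (unsupported clock). */
static int futex_abstimed_wait(uint32_t *addr, uint32_t expected,
                               clockid_t clock, const struct timespec *abstime)
{
    /* FUTEX_WAIT_BITSET takes an absolute timeout, measured against
       CLOCK_MONOTONIC unless FUTEX_CLOCK_REALTIME is also set. */
    int op = FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG;

    if (clock == CLOCK_REALTIME)
        op |= FUTEX_CLOCK_REALTIME;
    else if (clock != CLOCK_MONOTONIC)
        return EINVAL;

    if (syscall(SYS_futex, addr, op, expected, abstime, NULL,
                FUTEX_BITSET_MATCH_ANY) == -1)
        return errno;
    return 0;
}
```

Because the deadline stays absolute all the way into the kernel, a
CLOCK_REALTIME wait follows the clock if it is set forwards or backwards
while the caller is blocked.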

> ===
> 
> pthread_mutex_clocklock():
> First of all, there's a small oddity. Suppose we specify the clockid
> as CLOCK_REALTIME, and then while the call is blocked, we set the
> realtime clock backwards. Then, there will be further futex calls to
> handle the modification to the clock (and possibly multiple futex
> calls if the realtime clock is adjusted repeatedly):
> 
>         futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec1})
>         futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec2})
>         ...
> 
> Then there seems to be a bug. If we specify the clockid as
> CLOCK_REALTIME, and while the call is blocked we set the realtime
> clock forwards, then the blocking interval of the call is *not*
> adjusted (shortened), when of course it should be.

This is because __lll_clocklock_wait ends up doing a relative wait rather
than an absolute one so it suffers from the same problem as
pthread_clockjoin_np.
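
For callers, one way to stay clear of realtime clock warps is to pass
CLOCK_MONOTONIC, which cannot be set. A minimal usage sketch follows,
assuming glibc 2.30 or later (where pthread_mutex_clocklock is available);
lock_within_ms is a hypothetical helper, not part of any library.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: try to lock `m`, giving up `ms` milliseconds from
   now as measured on CLOCK_MONOTONIC. Returns 0 on success or an errno
   value such as ETIMEDOUT. */
static int lock_within_ms(pthread_mutex_t *m, long ms)
{
    struct timespec deadline;

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return errno;

    /* Build the absolute deadline and normalise tv_nsec. */
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    return pthread_mutex_clocklock(m, CLOCK_MONOTONIC, &deadline);
}
```

The deadline is still absolute from the caller's point of view; only the
clock it is measured against changes.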

> ===
> 
> I've attached a couple of small test programs at the end of this mail.

Thanks for looking at this in detail.

AFAIK, all of these bugs also affected the corresponding existing
pthread*timed*() functions. When I added the new pthread*clock*() functions
I was trying to keep my changes to the existing code as small as possible.
(I started out trying to "scratch the itch" of libstdc++
std::condition_variable::wait_for misbehaving[2] when the system clock was
warped in 2015 and all of this ballooned from that.) Now that the functions
are in, I think there's definitely scope for improving the implementation
and I will try to do so as time and confidence allow - the implementation
of __pthread_mutex_clocklock_common scares me greatly!

Thanks.

Mike.

[1] https://lore.kernel.org/linux-man/20201119120034.GA20599@mcrowe.com/
[2] https://randombitsofuselessinformation.blogspot.com/2018/06/its-about-time-monotonic-time.html

