pthread_cond_timedwait with timeout in the past slower than it used to be

Adhemerval Zanella adhemerval.zanella@linaro.org
Fri Nov 20 13:24:10 GMT 2020



On 18/11/2020 13:10, Mike Crowe via Libc-alpha wrote:
> Jonathan Wakely pointed out[1] on the libstdc++ mailing list that some of
> my patches that switch from using relative to absolute timeouts for futex
> waits caused a performance regression when passing timeouts in the past.
> The same performance regression also affects glibc.
> 
> Prior to glibc:99d01ffcc386d1bfb681fb0684fcf6a6a996beb3 calling
> pthread_cond_timedwait on a condition variable configured to use
> CLOCK_MONOTONIC would call clock_gettime to determine the current time so
> that it could convert the passed absolute time into a relative time to wait
> for (ultimately with FUTEX_WAIT). However, if the absolute timeout was in
> the past pthread_cond_timedwait returned indicating a timeout immediately
> without calling futex.
> 
> glibc:99d01ffcc386d1bfb681fb0684fcf6a6a996beb3 changed the code to use the
> absolute timeout directly (ultimately with FUTEX_WAIT_BITSET), so futex is
> now always called. This change was inspired by much older equivalent
> changes for CLOCK_REALTIME in
> glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7,
> glibc:1bd57044e963abb886cb912beadea714815a3d5c and probably others.
> 
> Before the change, a call to pthread_cond_timedwait using
> CLOCK_MONOTONIC with a timeout in the past completed quickly since
> clock_gettime is implemented in the vdso and relatively cheap. It now
> always calls futex which is not implemented in the vdso and therefore more
> expensive.
> 
> This difference can be seen by running the program below against different
> glibc versions. Running the program on a machine with glibc v2.29 (Debian
> Buster) yields:
> 
>  cond_realtime mean duration 5568ns
>  cond_monotonic mean duration 398ns
> 
> whereas running it on a slightly-slower machine with glibc v2.31 (Debian
> Bullseye) yields:
> 
>  cond_realtime mean duration 6347ns
>  cond_monotonic mean duration 6346ns
> 
> cond_realtime calls futex with the absolute time in both glibc versions so
> the duration of each pthread_cond_timedwait call is similar as would be
> expected. cond_monotonic ends up only calling clock_gettime in glibc v2.29
> so it's considerably faster.
> 
> The performance change could be avoided by making pthread_cond_timedwait
> always call clock_gettime and check for expired timeouts before
> calling futex. However, this would mean that the cases that do need to wait
> end up consuming more CPU time before they get as far as blocking in the
> kernel.
> 
> (All this applies to pthread_cond_clockwait too, except it didn't exist in
> glibc:99d01ffcc386d1bfb681fb0684fcf6a6a996beb3.)
> 
> I suspect that similar issues have affected sem_timedwait,
> pthread_mutex_timedlock, pthread_rwlock_timedwrlock,
> pthread_rwlock_timedrdlock and their "clock" equivalents too.
> 
> Having explained all that, my question now is "does this matter?"

Thanks for the explanation. Although I do not consider this an issue,
there is no reason why we can't bring back the optimization that
pthread_cond_wait used to provide.  In fact, I was working on some
patches to consolidate the futex wait logic we already have in
pthread_join_common.c (clockwait_tid), which is similar to the logic we
removed from pthread_cond_wait when multiple clockid support was added.

> 
> To help answer, I think the following things are important:
> 
> 1. The equivalent change for CLOCK_REALTIME happened many years ago (but
> not long enough ago that it predated the invention of the vdso), and
> presumably no-one complained loudly enough for it to be reverted.
> 
> 2. Fixing it makes callers who pass timeouts in the future do unnecessary
> work.
> 
> 3. It's not really clear how often production code ends up waiting on
> timeouts in the past. I can come up with loops that calculate their timeout
> at the top before doing some work that might take some time, but even then
> it doesn't seem likely that the wait taking a little bit longer is that
> bad.
> 
> 4. Any caller that really cares about this can check the timeout against
> the current time before calling pthread_cond_timedwait themselves.

I will send the patches I have so far, which should fix the issues you
brought up.  Even though I agree with Jonathan Wakely that it should not be
an urgent issue for glibc, I think changing this in glibc is an
optimization without downsides, and it also allows some code
consolidation and simplification.

