Sources Bugzilla – Bug 11588
pthread condvars are not priority inheritance aware
Last modified: 2013-04-06 09:15:38 UTC
When using a PTHREAD_PRIO_INHERIT mutex with a condvar, the pthread_cond* calls
can still cause an unbounded priority inversion via the internal condvar lock.
The POSIX specification doesn't provide a mechanism to specify the protocol of
the condvar. We would like to do this at runtime, but unfortunately it is legal
to call pthread_cond_signal() or pthread_cond_broadcast() without first waiting
on the lock, so the mutex type may not be known the first time the condvar is
used. A new API, pthread_condattr_setprotocol_np() and
pthread_condattr_getprotocol_np(), would allow the user to create a
PTHREAD_PRIO_INHERIT condvar. This would use a PTHREAD_PRIO_INHERIT mutex for
the internal condvar lock, eliminating the potential for hitting an unbounded
priority inversion on that lock.
This topic was initially discussed here:
And again here:
Test cases and patches for evaluation will be attached.
Created attachment 4779 [details]
Unbounded priority inversion testcase pthread_cond_hang.c
Using vanilla cond vars results in a pthread_cond_wait priority inversion
deadlock. With PI cond vars it works fine. This test uses dlsym to use the new
API if it exists.
Created attachment 4780 [details]
pi condvar: pthread_condattr_setprotocol_np 2.12 port
Created attachment 4781 [details]
pi condvar: add tst-condpi1.c basic API test
Created attachment 4782 [details]
pi condvar patch impact on normal path testcase
This test is designed to test the efficiency of the pthread_cond_wait() and
pthread_cond_signal() code paths within glibc. Ulrich has expressed concern
about these patches negatively impacting the non-pi condvar paths. I felt the
changes to the common case (when PI is not involved) were pretty minimal.
Certainly any significant regression there would be unacceptable.
I built three versions of glibc 2.11.1 from git:
1) git: unmodified git sources
2) c_only: pthread_cond*.S files deleted
3) pi_condvar: same as c_only with the pi_condvar patches applied
Comparing #3 against #2 allows us to eliminate any gains #1 would have solely
from the hand written asm. 3 will eventually contain hand written asm, but
until the non-posix API is agreed upon, it doesn't make sense to expend the
effort of writing the asm code in my opinion.
I then ran 10 runs of 10M iterations each at SCHED_FIFO 1 priority on each of
the three glibcs, the results (following) suggest no significant change in the
non PI condvar performance, sitting right at ~270k (avg) cycles/sec for each
Min: 202379.703125 us
Max: 285067.28125 us
Avg: 268823.0375 us
Min: 190936.03125 us
Max: 290366.21875 us
Avg: 269647.525 us
Min: 262115.796875 us
Max: 284592.6875 us
Avg: 271861.535938 us
This is only the cond_signal case, and doesn't account for cond_timedwait or
cond_broadcast, but I wouldn't expect those to experience any additional impact
from this patch. Are there scenarios you can think of that are likely to suffer
greater impact that are not covered by this rather simple test case?
While working up tst-condpi2.c (make check version of pthread_cond_hang.c), I
found something went wrong with my port attached as "0001-pi-condvar-
pthread_condattr_setprotocol_np-2.12-port.patch" and the test case still hangs
when affined to CPU 0. I'll look into why and get an updated patch posted as soon
Turns out the trouble in comment 5 was due to my forgetting to remove the
nptl/pthread_cond*.S files before building the library this time around. The
original 0001 patch did indeed function correctly. I have reworked the patches,
improved the patch headers and naming as well as added a new tst-condpi2.c
testcase which tests for priority inversion avoidance. A follow-on patch detects
an unbounded inversion and exits gracefully. Without the last patch, the glibc
test framework will kill the test after the 2 second timeout - which may be the
preferred approach. I'll attach the updated patches now.
Created attachment 4785 [details]
Unbounded priority inversion test case
The first version (attachment 4779 [details]) was missing the "_np" in the dl lookup.
This fixes that.
Created attachment 4786 [details]
[PATCH 1/4 V2] pi-condvars: add protocol support to pthread_condattr_t
Updated patch header. Functionally equivalent to attachment 4780 [details].
Created attachment 4787 [details]
[PATCH 2/4 V2] pi-condvars: add tst-condpi1.c, basic API test
Updated patch header.
Created attachment 4788 [details]
[PATCH 3/4 V2] pi-condvars: add tst-condpi2.c, priority inversion avoidance test
New "make check" test, tst-condpi2.c, to test priority inversion avoidance.
Created attachment 4789 [details]
[PATCH 4/4 V2] pi-condvars: tst-condpi2.c, detect priority inversion
Optional patch. This goes beyond relying on the glibc 2 second timeout to abort
the test. Instead, it detects the unbounded priority inversion and exits
gracefully. Depending on the preferred approach, this patch can be ignored
without a problem.
It was brought to my attention that the Signed-off-by's in the patches attached
and sent to libc-alpha are not appropriate for glibc development. This was just
an artifact of my git usage, and I will correct in subsequent versions. May we
discuss these as is, as we our foremost concern is agreeing on the proposed new
condattr protocol APIs?
I have posted V3 of the patches to libc-alpha. Rather than bog bugzilla down
additional attachments, I'll wait until an agreement has been made on
These patches have proper ChangeLogs and have removed the requeue-pi C
implementation which the previous versions were carrying along.
Created attachment 4821 [details]
pi condvar patch impact on normal path testcase
Updated condvar_perf testcase which runs the previous test multiple times and
calculates various statistics to aid in determining the impact of the
Ulrich, before I spend more time on the test cases and possibly trying to
optimize the current implementation, can you comment on if you have any objection
to the proposed API changes themselves?
(In reply to comment #15)
> Ulrich, before I spend more time on the test cases and possibly trying to
> optimize the current implementation, can you comment on if you have any objection
> to the proposed API changes themselves?
Is there any chance that this issue is going to make any progress ? It affects
real users and we have a working solution for it.
Are there any issues with the approach itself ? If not, what needs to be done to
get this resolved ?
(In reply to comment #16)
> Is there any chance that this issue is going to make any progress ? It affects
> real users and we have a working solution for it.
I can only confirm this. We just saw that unfortunate deadlock again in real
application (something to be shipped in products). We ran into it before, but
this time we finally realized that it's not due to an outdated glibc but rather
a still unfixed bug.
Our applications consist of dozens of real-time threads at different prio
levels and practically can't be reworked to avoid cond variables in RT threads.
So we really need a solution, and that preferably without having to carry our
own patched glibc versions.
Ping on this unfixed bug. Is there a chance to reconsider the extension Darren
proposed? We are stuck with unhandy and inefficient PRIO_PROTECT here due to
this issue. And I'm sure we aren't alone...
Note that the condvar implementation will have to change due to a recent POSIX
change (see bug 13165). That means that how to solve this bug might also
depend on how we fix bug 13165. It would be good to fix both in one go.