Bug 23229 - Realtime rwlock deadlock when writer has lower priority than readers
Summary: Realtime rwlock deadlock when writer has lower priority than readers
Status: RESOLVED DUPLICATE of bug 13701
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl
Version: 2.29
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-24 13:40 UTC by Tobias Ringstrom
Modified: 2018-06-20 13:08 UTC (History)
CC List: 3 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Test program to reproduce the deadlock (742 bytes, text/x-csrc)
2018-05-24 13:40 UTC, Tobias Ringstrom

Description Tobias Ringstrom 2018-05-24 13:40:44 UTC
Created attachment 11044
Test program to reproduce the deadlock

The new rwlock implementation runs into a deadlock when using real-time scheduling and the writer has lower priority than the readers.

Here's how the rwlock looks when deadlocked:

    $7 = {__data = {__readers = 35, __writers = 0, __wrphase_futex = 0,
          __writers_futex = 1, __pad3 = 0, __pad4 = 0, __flags = 0 '\000',
          __shared = 0 '\000', __rwelision = 0 '\000', __pad2 = 0 '\000',
          __cur_writer = 0}, __size = "#", '\000' <repeats 11 times>,
          "\001", '\000' <repeats 18 times>, __align = 35}

I believe this is state #8 as documented in pthread_rwlock_common.c. The deadlock is caused by the reader spinning while it waits for the writer to hand over: since the reader has higher priority than the writer, the writer never gets to run. See the for(;;) on line 435 in pthread_rwlock_common.c.

To reproduce, compile and run the attached program rwlock-deadlock.c like this:

    > sudo taskset --cpu-list 1 ./rwlock-deadlock 
    .no progress
    Aborted

(It's not required to use a single core, but it makes it deadlock faster.)
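
For readers without access to the attachment, the scheduling setup that a reproducer of this kind needs can be sketched as follows. This is not the attached rwlock-deadlock.c. The helper name init_fifo_attr is hypothetical, and the sketch assumes the threads run under SCHED_FIFO, which "real-time scheduling" suggests but the report does not state.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper (not taken from the attachment): prepare *attr so
   that a thread created with it runs under SCHED_FIFO at the given
   priority instead of inheriting the creating thread's scheduling.
   Returns 0 on success, otherwise an errno-style error number. */
int init_fifo_attr(pthread_attr_t *attr, int priority)
{
    assert(attr != NULL);

    struct sched_param param = { .sched_priority = priority };

    int err = pthread_attr_init(attr);
    if (err != 0)
        return err;

    /* Without PTHREAD_EXPLICIT_SCHED the policy and priority set below
       are ignored when the thread is created. */
    err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    /* Set the policy before the priority: the valid priority range
       depends on the policy. */
    if (err == 0)
        err = pthread_attr_setschedparam(attr, &param);

    if (err != 0)
        pthread_attr_destroy(attr);
    return err;
}
```

A reproducer along these lines would presumably prepare one attribute object for the readers with a higher priority than the one for the writer, and pass them to pthread_create. Creating SCHED_FIFO threads normally requires elevated privileges, which is consistent with the sudo in the command above.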

I've confirmed this issue with glibc GIT as of today.
Comment 1 Florian Weimer 2018-06-20 12:34:36 UTC
Duplicate.

*** This bug has been marked as a duplicate of bug 13701 ***
Comment 2 Torvald Riegel 2018-06-20 13:07:05 UTC
It's a duplicate because it seems to rely on the TPS-specific requirements for rwlock, which we do not intend to implement.

Furthermore, rwlocks aren't guaranteed to be safe to use in real-time environments, in contrast to mutex kinds that provide PI.
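
As an illustration of the mutex kinds mentioned above (not part of the original comment), a priority-inheritance mutex can be requested through the mutex attribute protocol. This is a minimal sketch; the helper names init_pi_mutexattr and init_pi_mutex are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *attr and select the
   priority-inheritance protocol. Returns 0 or an error number. */
int init_pi_mutexattr(pthread_mutexattr_t *attr)
{
    assert(attr != NULL);

    int err = pthread_mutexattr_init(attr);
    if (err != 0)
        return err;

    err = pthread_mutexattr_setprotocol(attr, PTHREAD_PRIO_INHERIT);
    if (err != 0)
        pthread_mutexattr_destroy(attr);
    return err;
}

/* Hypothetical helper: initialise *mutex as a PI mutex. While a
   higher-priority thread is blocked on such a mutex, the owner is
   boosted to that thread's priority so it can run and release the lock.
   May fail, for example with ENOTSUP, where the system lacks PI support. */
int init_pi_mutex(pthread_mutex_t *mutex)
{
    assert(mutex != NULL);

    pthread_mutexattr_t attr;
    int err = init_pi_mutexattr(&attr);
    if (err != 0)
        return err;

    err = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);  /* the mutex keeps its own copy */
    return err;
}
```

Note that replacing an rwlock with such a mutex gives up concurrent readers; whether that trade-off is acceptable depends on the application.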
Comment 3 Torvald Riegel 2018-06-20 13:08:18 UTC

*** This bug has been marked as a duplicate of bug 13701 ***