pthread_rwlock_rdlock(), when the threads involved use the scheduling policy SCHED_FIFO or SCHED_RR, should prefer writers. This doesn't seem to work: the corresponding LTP test fails, and I couldn't find anything wrong with the test.
The corresponding test source:
The corresponding POSIX definition:
The original SUSE bug:
The discussion has now moved to the Austin Group tracker.
According to the Austin Group's discussion, it seems that write locks should take precedence over read locks.
Is this considered a glibc bug, and should it be fixed? Thanks.
Precisely, if Thread Execution Scheduling is supported (and glibc claims it is), and if the threads use SCHED_FIFO / SCHED_RR / SCHED_SPORADIC, pthread_rwlock_rdlock() must prefer blocked writers if and only if their priority is higher or equal to that of the reader; otherwise, the reader is preferred:
The current implementation does not guarantee that; instead, it prefers readers if other readers have already acquired the lock, or neither readers nor writers have acquired the lock. The former seems to be, intuitively, the right behavior given that recursive rdlock acquisitions are allowed also.
Implementing the special requirements for Thread Execution Scheduling seems difficult, especially efficiently, given the futex facilities we have today and the role userspace has when using them (e.g., the rwlock implementation would need to track the maximum priority of all blocked writers, yet we'd still need to support process-shared rwlocks). Therefore, unless there is strong demand for this feature, I don't think it's worthwhile to spend significant time on this.
There is also a second test case from the Linux Test Project. This one considers readers and writers that have the same assigned priority:
$ wget https://raw.githubusercontent.com/linux-test-project/ltp/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_rwlock_rdlock/2-2.c
$ wget https://raw.githubusercontent.com/linux-test-project/ltp/master/testcases/open_posix_testsuite/include/posixtest.h
$ gcc -O -Wall 2-2.c -lpthread
main: attempt read lock
main: acquired read lock
main: create wr_thread, with priority: 2
wr_thread: attempt write lock
main: create rd_thread, with priority: 2
rd_thread: attempt read lock
rd_thread: acquired read lock
rd_thread: unlock read lock
Test FAILED: rd_thread did not block on read lock, when a reader owns the lock, and an equal priority writer is waiting for the lock
Re comment 3:
> if Thread Execution Scheduling is supported (and glibc claims it is)
Yes. The symbol _POSIX_THREAD_PRIORITY_SCHEDULING is defined by glibc. It is actually defined in <bits/posix_opt.h> with a comment:
/* We provide priority scheduling for threads. */
#define _POSIX_THREAD_PRIORITY_SCHEDULING 200809L
The value is a positive constant.
Per http://pubs.opengroup.org/onlinepubs/007904875/basedefs/xbd_chap02.html and http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html this means that glibc supports the Thread Priority Scheduling option of POSIX at compile time and at run time.
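The rule cited above for interpreting the option symbol can be sketched as follows. This is only an illustration; the helper name tps_supported is hypothetical and not part of glibc or POSIX. A positive value means the option is always supported, -1 means it is not, and 0 or an undefined symbol means the answer has to come from sysconf() at run time.

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if the Thread Execution Scheduling
   option is supported, 0 otherwise. */
int tps_supported(void)
{
#if defined(_POSIX_THREAD_PRIORITY_SCHEDULING) \
    && _POSIX_THREAD_PRIORITY_SCHEDULING > 0
    /* Positive constant: supported at compile time and at run time. */
    return 1;
#elif defined(_POSIX_THREAD_PRIORITY_SCHEDULING) \
    && _POSIX_THREAD_PRIORITY_SCHEDULING == -1
    /* Explicitly not supported. */
    return 0;
#else
    /* Defined as 0 or not defined at all: ask at run time. */
    return sysconf(_SC_THREAD_PRIORITY_SCHEDULING) > 0;
#endif
}
```

With the glibc definition quoted above (200809L), the first branch is the one that gets compiled.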
The test activates the SCHED_FIFO on all of its threads. Therefore the premises of the sentence "If the Thread Execution Scheduling option is supported, and the threads involved in the lock are executing with the scheduling policies SCHED_FIFO or SCHED_RR, the calling thread shall not acquire the lock if a writer holds the lock or if writers of higher or equal priority are blocked on the lock; otherwise, the calling thread shall acquire the lock." from http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_rwlock_rdlock.html are fulfilled. And the writer and readers in this test case have equal priority.
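For reference, one way to request SCHED_FIFO for a new thread is through thread attributes. This is a minimal sketch and not necessarily how the LTP test sets things up; the helper name init_fifo_attr is hypothetical.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: initialize *attr so that a thread created with it
   runs under SCHED_FIFO at the given priority. Returns 0 or an errno value. */
int init_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param param = { .sched_priority = priority };
    int err;

    if ((err = pthread_attr_init(attr)) != 0)
        return err;

    /* Without PTHREAD_EXPLICIT_SCHED the new thread would inherit the
       creator's policy and the settings below would be ignored. */
    if ((err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0
        || (err = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0
        || (err = pthread_attr_setschedparam(attr, &param)) != 0) {
        pthread_attr_destroy(attr);
        return err;
    }
    return 0;
}
```

On Linux, pthread_create() with such attributes typically fails with EPERM unless the process is allowed to use real-time priorities.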
> Implementing the special requirements for Thread Execution Scheduling seems difficult, especially in an efficient way
Implementing the handling of readers and writers of different priority (test case 2-1.c) is some work, indeed. But the handling of readers and writers of equal priority is nearly complete in glibc: The non-portable flag PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP does it. Only the functions pthread_rwlock_init and pthread_rwlockattr_init would have to be changed.
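Until the defaults change, an application can opt in to this behavior itself. A minimal sketch, assuming glibc (the _NP names are non-portable) and assuming the application never acquires the read lock recursively; the helper name init_writer_preferring_rwlock is hypothetical.

```c
#define _GNU_SOURCE   /* glibc-specific: make the _NP declarations visible */
#include <pthread.h>

/* Hypothetical helper: initialize *lock so that blocked writers are
   preferred over new readers. Returns 0 or an errno value. */
int init_writer_preferring_rwlock(pthread_rwlock_t *lock)
{
    pthread_rwlockattr_t attr;
    int err;

    if ((err = pthread_rwlockattr_init(&attr)) != 0)
        return err;

    err = pthread_rwlockattr_setkind_np(
        &attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
    if (err == 0)
        err = pthread_rwlock_init(lock, &attr);

    /* The attribute object is no longer needed once the lock exists. */
    pthread_rwlockattr_destroy(&attr);
    return err;
}
```

With this kind, a thread that calls pthread_rwlock_rdlock() again while already holding the read lock can deadlock if a writer is waiting, which is why it is labeled non-recursive.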
rwlocks without the "writers get the rwlock in preference to readers [of the same priority]" guarantee are quite unreliable (there is no way to guarantee that writers will not starve, except if there's only 1 reader thread, but then one can just use a normal mutex as well), as I argue in http://lists.gnu.org/archive/html/bug-gnulib/2017-01/msg00037.html
By the way, there's also the bug that PTHREAD_RWLOCK_PREFER_WRITER_NP does not behave like one would intuitively expect: it still prefers readers. See the BUGS section of http://man7.org/linux/man-pages/man3/pthread_rwlockattr_setkind_np.3.html. (I checked the glibc code: the man page is right.)
Raised this as a POSIX bug:
(In reply to Bruno Haible from comment #5)
> By the way, there's also the bug that PTHREAD_RWLOCK_PREFER_WRITER_NP does
> not behave like one would intuitively expect: it still prefers readers. See
> the BUGS section of
> (I checked the glibc code: the man page is right.)
I have built a new, more scalable rwlock last year that should be committed soon and that supports this mode (though there's not a lot the implementation can do because of recursive rdlocks being allowed -- and we're certainly not going to track which reader exactly has acquired a particular rdlock).
(In reply to Torvald Riegel from comment #7)
> we're certainly not going to track which reader exactly has acquired a
> particular rdlock
It could be done by allocating a variable in thread-local storage (pthread_key_create). But I agree that such an implementation would be costly in pthread_rwlock_init (since pthread_key_create does a linear search).
(In reply to Bruno Haible from comment #8)
> (In reply to Torvald Riegel from comment #7)
> > we're certainly not going to track which reader exactly has acquired a
> > particular rdlock
> It could be done by allocating a variable in thread-local storage
> (pthread_key_create). But I agree that such an implementation would be
> costly in pthread_rwlock_init (since pthread_key_create does a linear
> search).
It's not just that. If a thread acquires more than one rwlock as a reader, it has to track a set of acquired rwlocks, and query inclusion in this set on every rdlock call.
(In reply to Torvald Riegel from comment #6)
> Raised this as a POSIX bug:
This POSIX bug has been resolved. The essential sentence in the pthread_rwlock_rdlock specification now reads:
[TPS] If the Thread Execution Scheduling option is supported, and the threads involved in the lock are executing with the scheduling policies SCHED_FIFO or SCHED_RR, the calling thread shall not acquire the lock if a writer holds the lock or if the calling thread does not already hold a read lock and writers of higher or equal priority are blocked on the lock; otherwise, the calling thread shall acquire the lock.
Therefore, the "current implementation" that works as described in comment #3 is still not POSIX compliant.
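The revised sentence can be read as a simple predicate. The sketch below only restates the specification text; struct rdlock_state and rdlock_must_block are hypothetical names, and nothing like this snapshot exists in the glibc implementation.

```c
#include <stdbool.h>

/* Hypothetical snapshot of an rwlock as seen by a thread calling rdlock. */
struct rdlock_state {
    bool writer_holds;            /* a writer currently owns the lock */
    bool caller_holds_read;       /* the caller already holds a read lock */
    bool writer_blocked;          /* at least one writer is waiting */
    int  max_blocked_writer_prio; /* highest priority among waiting writers */
};

/* Returns true if, per the revised wording, the caller shall not
   acquire the lock. */
bool rdlock_must_block(const struct rdlock_state *s, int caller_prio)
{
    if (s->writer_holds)
        return true;
    if (s->caller_holds_read)
        return false;   /* recursive read acquisition is let through */
    return s->writer_blocked && s->max_blocked_writer_prio >= caller_prio;
}
```

For the 2-2.c scenario (a reader holds the lock, a priority-2 writer is blocked, and a second priority-2 reader arrives), the predicate returns true, which is the outcome the test expects.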
The Austin Group has fixed the obvious inconsistency related to recursive acquisition, but it hasn't addressed the design / performance issue (see my initial description in the POSIX bug report). It even clarifies that recursive rdlocks are allowed, which is one of the things that cause the performance problem.
We should not penalize the performance in the common case for the TPS niche case. I'm not aware of a way to avoid this performance penalty, and unless someone proposes a solution that avoids it, we should simply deviate from the POSIX requirements in this case.
We can set this to WONTFIX once we have documented that glibc deviates from POSIX in this case, and why.
*** Bug 23229 has been marked as a duplicate of this bug. ***
I agree with Torvald in comment #11. We should document our deviation from POSIX here, both for rwlocks and for other interfaces such as scheduling, where TPS requires us to define SCHED_SPORADIC but we don't, because Linux's scheduling model doesn't match the POSIX one and never will.
The current manual does not document the rwlock functions, so there is no work to be done here.