This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] nptl: Fix pthread_rwlock_try*lock stalls [BZ #23844]
- From: Carlos O'Donell <carlos at redhat dot com>
- To: Torvald Riegel <triegel at redhat dot com>, Rik Prohaska <prohaska7 at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>, Siddhesh Poyarekar <siddhesh at gotplt dot org>
- Date: Mon, 21 Jan 2019 23:11:52 -0500
- Subject: [PATCH] nptl: Fix pthread_rwlock_try*lock stalls [BZ #23844]
Torvald, Rik,
Here is the full fix for bug 23844 with two regression test, one
for each of the trylock cases.
May I please use Signed-off-by for both of you? We follow the
same meaning as Linux kernel. The final fix incorporates parts
of both your patches, and mine.
Rik, I don't see an FSF copyright on file for you. In the future
if you want to contribute more to the project this is going to be
a requirement for us to be able to review your patches. As it is
your contributions are below the threshold of ~15 lines, and so
are OK for now. Thank you very much for your help and insight
into the fix.
I would really like to get this in to glibc 2.29, and so would
appreciate a prompt response. I've spent a lot of time reviewing
the fixes you and Torvald proposed and I believe they are correct.
I could not find any other defects in this area of the implementation
and I wrote my entire analysis in comment #14 of bug 23844.
I would appreciate any review of the following commit message,
fixes, and final tests.
~~~~ Commit message ~~~
For a full analysis of both the pthread_rwlock_tryrdlock() stall
and the pthread_rwlock_trywrlock() stall see:
https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
In the pthread_rwlock_trydlock() function we fail to inspect for
PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
readers.
In the pthread_rwlock_trywrlock() function we write 1 to
__wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
bit, again failing to wake waiting readers during unlock.
The fix in the case of pthread_rwlock_trydlock() is to check for
PTHREAD_RWLOCK_FUTEX_USED and wake the readers.
The fix in the case of pthread_rwlock_trywrlock() is to only write
1 to __wrphase_futex if we installed the write phase, since all other
readers would be spinning waiting for this step.
We add two new tests, one exercises the stall for
pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
the stall for pthread_rwlock_trydlock() which is harder to exercise.
The pthread_rwlock_trywrlock() test fails consistently without the fix,
and passes after. The pthread_rwlock_tryrdlock() test fails roughly
5-10% of the time without the fix, and passes all the time after.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
---
ChangeLog | 17 +++++
nptl/Makefile | 3 +-
nptl/pthread_rwlock_tryrdlock.c | 25 ++++---
nptl/pthread_rwlock_trywrlock.c | 9 ++-
nptl/tst-rwlock-tryrdlock-stall.c | 112 ++++++++++++++++++++++++++++++
nptl/tst-rwlock-trywrlock-stall.c | 108 ++++++++++++++++++++++++++++
support/Makefile | 1 +
support/xpthread_rwlock_destroy.c | 26 +++++++
support/xthread.h | 1 +
9 files changed, 291 insertions(+), 11 deletions(-)
create mode 100644 nptl/tst-rwlock-tryrdlock-stall.c
create mode 100644 nptl/tst-rwlock-trywrlock-stall.c
create mode 100644 support/xpthread_rwlock_destroy.c
diff --git a/ChangeLog b/ChangeLog
index 834fca6fdc..6ecb020569 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2019-01-21 Carlos O'Donell <carlos@redhat.com>
+ Torvald Riegel <triegel@redhat.com>
+ Rik Prohaska <prohaska7@gmail.com>
+
+ [BZ# 23844]
+ * nptl/Makefile (tests): Add tst-rwlock-tryrdlock-stall, and
+ tst-rwlock-trywrlock-stall.
+ * nptl/pthread_rwlock_tryrdlock.c (__pthread_rwlock_tryrdlock):
+ Wake waiters if PTHREAD_RWLOCK_FUTEX_USED is set.
+ * nptl/pthread_rwlock_trywrlock.c (__pthread_rwlock_trywrlock):
+ Set __wrphase_fute to 1 only if we started the write phase.
+ * nptl/tst-rwlock-tryrdlock-stall.c: New file.
+ * nptl/tst-rwlock-trywrlock-stall.c: New file.
+ * support/Makefile (libsupport-routines): Add xpthread_rwlock_destroy.
+ * support/xpthread_rwlock_destroy.c: New file.
+ * support/xthread.h: Declare xpthread_rwlock_destroy.
+
2019-01-21 H.J. Lu <hongjiu.lu@intel.com>
[BZ# 24097]
diff --git a/nptl/Makefile b/nptl/Makefile
index 340282c6cb..0e316edfac 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -319,7 +319,8 @@ tests = tst-attr1 tst-attr2 tst-attr3 tst-default-attr \
tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-cnd-timedwait tst-thrd-detach tst-mtx-basic tst-thrd-sleep \
tst-mtx-recursive tst-tss-basic tst-call-once tst-mtx-timedlock \
- tst-rwlock-pwn
+ tst-rwlock-pwn \
+ tst-rwlock-tryrdlock-stall tst-rwlock-trywrlock-stall
tests-internal := tst-rwlock19 tst-rwlock20 \
tst-sem11 tst-sem12 tst-sem13 \
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index 368862ff07..2f94f17f36 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -94,15 +94,22 @@ __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
/* Same as in __pthread_rwlock_rdlock_full:
We started the read phase, so we are also responsible for
updating the write-phase futex. Relaxed MO is sufficient.
- Note that there can be no other reader that we have to wake
- because all other readers will see the read phase started by us
- (or they will try to start it themselves); if a writer started
- the read phase, we cannot have started it. Furthermore, we
- cannot discard a PTHREAD_RWLOCK_FUTEX_USED flag because we will
- overwrite the value set by the most recent writer (or the readers
- before it in case of explicit hand-over) and we know that there
- are no waiting readers. */
- atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 0);
+ We have to do the same steps as a writer would when handing over the
+ read phase to use because other readers cannot distinguish between
+ us and the writer.
+ Note that __pthread_rwlock_tryrdlock callers will not have to be
+ woken up because they will either see the read phase started by us
+ or they will try to start it themselves; however, callers of
+ __pthread_rwlock_rdlock_full just increase the reader count and then
+ check what state the lock is in, so they cannot distinguish between
+ us and a writer that acquired and released the lock in the
+ meantime. */
+ if ((atomic_exchange_relaxed (&rwlock->__data.__wrphase_futex, 0)
+ & PTHREAD_RWLOCK_FUTEX_USED) != 0)
+ {
+ int private = __pthread_rwlock_get_private (rwlock);
+ futex_wake (&rwlock->__data.__wrphase_futex, INT_MAX, private);
+ }
}
return 0;
diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c
index fd37a71ce4..fae475cc70 100644
--- a/nptl/pthread_rwlock_trywrlock.c
+++ b/nptl/pthread_rwlock_trywrlock.c
@@ -46,8 +46,15 @@ __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock)
&rwlock->__data.__readers, &r,
r | PTHREAD_RWLOCK_WRPHASE | PTHREAD_RWLOCK_WRLOCKED))
{
+ /* We have become the primary writer and we cannot have shared
+ the PTHREAD_RWLOCK_FUTEX_USED flag with someone else, so we
+ can simply enable blocking (see full wrlock code). */
atomic_store_relaxed (&rwlock->__data.__writers_futex, 1);
- atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
+ /* If we started a write phase, we need to enable readers to
+ wait. If we did not, we must not change it because other threads
+ may have set the PTHREAD_RWLOCK_FUTEX_USED in the meantime. */
+ if ((r & PTHREAD_RWLOCK_WRPHASE) == 0)
+ atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
atomic_store_relaxed (&rwlock->__data.__cur_writer,
THREAD_GETMEM (THREAD_SELF, tid));
return 0;
diff --git a/nptl/tst-rwlock-tryrdlock-stall.c b/nptl/tst-rwlock-tryrdlock-stall.c
new file mode 100644
index 0000000000..c176350ac1
--- /dev/null
+++ b/nptl/tst-rwlock-tryrdlock-stall.c
@@ -0,0 +1,112 @@
+/* Bug 23844: Test for pthread_rwlock_tryrdlock stalls.
+ Copyright (C) 2019 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* For a full analysis see comment:
+ https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
+
+ The stall is caused by pthread_rwlock_tryrdlock failing to check
+ that PTHREAD_RWLOCK_FUTEX_USED is set in the __wrphase_futex futex
+ and then waking the futex.
+
+ The fix for bug 23844 ensures that waiters on __wrphase_futex are
+ correctly woken. Before the fix the test stalls as readers can
+ wait forever on __wrphase_futex. */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <support/xthread.h>
+#include <errno.h>
+
+/* We need only one lock to reproduce the issue. We will need multiple
+ threads to get the exact case where we have a read, try, and unlock
+ all interleaving to produce the case where the readers are waiting
+ and the try fails to wake them. */
+pthread_rwlock_t onelock;
+
+/* The number of threads is arbitrary but empirically chosen to have
+ enough threads that we see the condition where waiting readers are
+ not woken by a successful tryrdlock. */
+#define NTHREADS 32
+
+_Atomic int do_exit;
+
+void *
+run_loop (void *arg)
+{
+ int i = 0, ret, rw;
+ while (!do_exit)
+ {
+ /* Arbitrarily choose if we are the writer or reader. Choose a
+ high enough ratio of readers to writers to make it likely
+ that readers block (and eventually are susceptable to
+ stalling). */
+ rw = i % 8;
+ /* If we are a writer, take the write lock, and then unlock.
+ If we are a reader, try the lock, then lock, then unlock. */
+ if (rw)
+ xpthread_rwlock_wrlock (&onelock);
+ else
+ {
+ if ((ret = pthread_rwlock_tryrdlock (&onelock)) != 0)
+ {
+ if (ret == EBUSY)
+ xpthread_rwlock_rdlock (&onelock);
+ else
+ exit (EXIT_FAILURE);
+ }
+ }
+ /* Thread does some work and then unlocks. */
+ xpthread_rwlock_unlock (&onelock);
+ i++;
+ }
+ return NULL;
+}
+
+int
+do_test (void)
+{
+ int i;
+ pthread_t tids[NTHREADS];
+ xpthread_rwlock_init (&onelock, NULL);
+ for (i = 0; i < NTHREADS; i++)
+ tids[i] = xpthread_create (NULL, run_loop, NULL);
+ /* Run for some amount of time. Empirically speaking exercising
+ the stall via pthread_rwlock_tryrdlock is much harder, and on
+ a 3.5GHz 4 core x86_64 VM system it takes somewhere around
+ 20-200s to stall, approaching 100% stall past 200s. We can't
+ wait that long for a regression test so we just test for 20s,
+ and expect the stall to happen with a 5-10% chance (enough for
+ developers to see). */
+ sleep (20);
+ /* Then exit. */
+ printf ("INFO: Exiting...");
+ fflush (stdout);
+ do_exit = 1;
+ /* If any readers stalled then we will timeout waiting for them. */
+ for (i = 0; i < NTHREADS; i++)
+ xpthread_join (tids[i]);
+ printf ("done.\n");
+ xpthread_rwlock_destroy (&onelock);
+ printf ("PASS: No pthread_rwlock_tryrdlock stalls detected.\n");
+ return 0;
+}
+
+#define TIMEOUT (30)
+#include <support/test-driver.c>
diff --git a/nptl/tst-rwlock-trywrlock-stall.c b/nptl/tst-rwlock-trywrlock-stall.c
new file mode 100644
index 0000000000..f2b5ddb6b4
--- /dev/null
+++ b/nptl/tst-rwlock-trywrlock-stall.c
@@ -0,0 +1,108 @@
+/* Bug 23844: Test for pthread_rwlock_trywrlock stalls.
+ Copyright (C) 2019 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* For a full analysis see comment:
+ https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
+
+ The stall is caused by pthread_rwlock_trywrlock setting
+ __wrphase_futex futex to 1 and loosing the
+ PTHREAD_RWLOCK_FUTEX_USED bit.
+
+ The fix for bug 23844 ensures that waiters on __wrphase_futex are
+ correctly woken. Before the fix the test stalls as readers can
+ wait forever on __wrphase_futex. */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <support/xthread.h>
+#include <errno.h>
+
+/* We need only one lock to reproduce the issue. We will need multiple
+ threads to get the exact case where we have a read, try, and unlock
+ all interleaving to produce the case where the readers are waiting
+ and the try clears the PTHREAD_RWLOCK_FUTEX_USED bit and a
+ subsequent unlock fails to wake them. */
+pthread_rwlock_t onelock;
+
+/* The number of threads is arbitrary but empirically chosen to have
+ enough threads that we see the condition where waiting readers are
+ not woken by a successful unlock. */
+#define NTHREADS 32
+
+_Atomic int do_exit;
+
+void *
+run_loop (void *arg)
+{
+ int i = 0, ret, rw;
+ while (!do_exit)
+ {
+ /* Arbitrarily choose if we are the writer or reader. Choose a
+ high enough ratio of readers to writers to make it likely
+ that readers block (and eventually are susceptable to
+ stalling). */
+ rw = i % 8;
+ /* If we are a writer, take the write lock, and then unlock.
+ If we are a reader, try the lock, then lock, then unlock. */
+ if (rw)
+ {
+ if ((ret = pthread_rwlock_trywrlock (&onelock)) != 0)
+ {
+ if (ret == EBUSY)
+ xpthread_rwlock_wrlock (&onelock);
+ else
+ exit (EXIT_FAILURE);
+ }
+ }
+ else
+ xpthread_rwlock_rdlock (&onelock);
+ /* Thread does some work and then unlocks. */
+ xpthread_rwlock_unlock (&onelock);
+ i++;
+ }
+ return NULL;
+}
+
+int
+do_test (void)
+{
+ int i;
+ pthread_t tids[NTHREADS];
+ xpthread_rwlock_init (&onelock, NULL);
+ for (i = 0; i < NTHREADS; i++)
+ tids[i] = xpthread_create (NULL, run_loop, NULL);
+ /* Run for some amount of time. The pthread_rwlock_tryrwlock stall
+ is very easy to trigger and happens in seconds under the test
+ conditions. */
+ sleep (10);
+ /* Then exit. */
+ printf ("INFO: Exiting...");
+ fflush (stdout);
+ do_exit = 1;
+ /* If any readers stalled then we will timeout waiting for them. */
+ for (i = 0; i < NTHREADS; i++)
+ xpthread_join (tids[i]);
+ printf ("done.\n");
+ xpthread_rwlock_destroy (&onelock);
+ printf ("PASS: No pthread_rwlock_tryrwlock stalls detected.\n");
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/support/Makefile b/support/Makefile
index 432cf2fe6c..c15b93647c 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -129,6 +129,7 @@ libsupport-routines = \
xpthread_mutexattr_settype \
xpthread_once \
xpthread_rwlock_init \
+ xpthread_rwlock_destroy \
xpthread_rwlock_rdlock \
xpthread_rwlock_unlock \
xpthread_rwlock_wrlock \
diff --git a/support/xpthread_rwlock_destroy.c b/support/xpthread_rwlock_destroy.c
new file mode 100644
index 0000000000..6d6e953569
--- /dev/null
+++ b/support/xpthread_rwlock_destroy.c
@@ -0,0 +1,26 @@
+/* pthread_rwlock_destroy with error checking.
+ Copyright (C) 2019 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <support/xthread.h>
+
+void
+xpthread_rwlock_destroy (pthread_rwlock_t *rwlock)
+{
+ xpthread_check_return ("pthread_rwlock_destroy",
+ pthread_rwlock_destroy (rwlock));
+}
diff --git a/support/xthread.h b/support/xthread.h
index 47c23235f3..9fe1f68b3b 100644
--- a/support/xthread.h
+++ b/support/xthread.h
@@ -84,6 +84,7 @@ void xpthread_rwlockattr_setkind_np (pthread_rwlockattr_t *attr, int pref);
void xpthread_rwlock_wrlock (pthread_rwlock_t *rwlock);
void xpthread_rwlock_rdlock (pthread_rwlock_t *rwlock);
void xpthread_rwlock_unlock (pthread_rwlock_t *rwlock);
+void xpthread_rwlock_destroy (pthread_rwlock_t *rwlock);
__END_DECLS
--
2.20.1
--
Cheers,
Carlos.