Bug 23844 - pthread_rwlock_trywrlock results in hang
Summary: pthread_rwlock_trywrlock results in hang
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl (show other bugs)
Version: 2.26
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-30 13:40 UTC by Michael Marxmeier
Modified: 2018-12-12 21:43 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2018-11-01 00:00:00
fweimer: security-


Attachments
Test case (697 bytes, text/x-csrc)
2018-10-30 13:40 UTC, Michael Marxmeier
Details
simpler trywrlock test case (703 bytes, text/x-csrc)
2018-12-11 14:02 UTC, Rik Prohaska
Details
trywrlock patch (268 bytes, patch)
2018-12-11 14:23 UTC, Rik Prohaska
Details | Diff
test case for tryrdlock hangs (703 bytes, text/x-csrc)
2018-12-11 15:57 UTC, Rik Prohaska
Details
patch for tryrdlock function (418 bytes, patch)
2018-12-12 15:49 UTC, Rik Prohaska
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Marxmeier 2018-10-30 13:40:29 UTC
Created attachment 11373 [details]
Test case

After upgrading from glibc 2.23 to 2.26, we've been seeing what looks like
a hang inside pthread_rwlock calls in our application.

A pthread_rwlock_trywrlock as a "quick path" if a lock is available.  
Otherwise a pthread_rwlock_wrlock is used along with additional 
measurement information.

   if((ecode = pthread_rwlock_trywrlock(&rw_lock[idx]))) {
      if(ecode == EBUSY)
         ecode = pthread_rwlock_wrlock(&rw_lock[idx]);
      if(ecode) {
         perror("pthread_rwlock_wrlock");
         exit(1);
      }
   }

This should behave identically to pthread_rwlock_wrlock().
It does result in a hang with the lock not taken but all
threads blocked in pthread_rwlock calls.

The attached test case makes it easily reproducible.

The lock state looks like this:

 {__data = {__lock = 10, __nr_readers = 0, __readers_wakeup = 2, 
  __writer_wakeup = 3, __nr_readers_queued = 0, __nr_writers_queued = 0, 
  __writer = 0, __shared = 0, __rwelision = 0 '\000', 
  __pad1 = "\000\000\000\000\000\000", __pad2 = 0, __flags = 0}, 
  __size = "\n\000\000\000\000\000\000\000\002\000\000\000\003", '\000' <repeats 42 times>, __align = 10}

Threads are suspended in pthread_rwlock_wrlock or pthread_rwlock_rdlock

  2    Thread 0x7fb876c92700 (LWP 28072) "rwl_g" 0x00007fb87705e585 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  3    Thread 0x7fb876491700 (LWP 28073) "rwl_g" 0x00007fb87705e12a in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  4    Thread 0x7fb875c90700 (LWP 28074) "rwl_g" 0x00007fb87705e585 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
* 5    Thread 0x7fb87548f700 (LWP 28075) "rwl_g" 0x00007fb87705e63f in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
Comment 1 Carlos O'Donell 2018-11-01 01:54:52 UTC
Thank you for the bug report and reduced test case.

At first I thought this might be "rwlock: Fix explicit hand-over (bug 21298)", but we fixed that *in* 2.26, but I would like you to double check that you have that fix in your sources e.g. commit faf8c066df0d6bccb54bd74dd696eeb65e1b3bbc.

I looked at your test case and it seems entirely reasonable.

It's interesting that the effect of using trylock + EBUSY checking basically ensures that the locking and sleeping on the futex happen in a very narrow band of code.

Interestingly enough with strace it works, ptrace seems to perturb the test enough that it works. Likewise it works sometimes with gdb (also using ptrace), but maybe 1 out of 4 times it fails. So it certainly looks like a race.

With readers doing the same thing, trylock + EBUSY->lock, it succeeds. So perhaps it's only the write side that has a defect.

In the failure mode I see 3 threads in the write lock, and 1 thread in the read lock. I haven't seen anything different. In all 4 threads all the threads have entered the kernel via futex_wait and are stuck there waiting for some thread to wake them up, and that will never happen.

I don't immediately see what's wrong, but I'd have to audit __pthread_rwlock_wrlock_full and __pthread_rwlock_trywrlock before concluding what's wrong.

I've asked Torvald Riegel to have look, he's the primary author of the new rwlock implementation.
Comment 2 Michael Marxmeier 2018-11-02 13:23:59 UTC
Some additional notes

Unfortunately, the problem is not limited to this particular use of 
trywrlock. I can also reproduce the same issue for readers, it takes 
somewhat longer to reproduce but you end up in the same condition.
Also, a more concentional use of trywrlock (check availability of multiple
resources) results in the same hang.

As far as i can see this affects all glibc versions in recent distributions, 
eg. Fedora 28/29, SUSE 15 or Ubuntu 1810. glibc versions before 2.26 seem
not affected, eg. CentOS 7.5.
Comment 3 Rik Prohaska 2018-12-11 14:00:23 UTC
A simpler test case based on the original test case which uses only one lock (and only two threads also reproduces the thread hangs when using the trywrlock function. Since there are only two threads manipulating the same lock, it will be easier to identify the cause of the thread hangs.
Comment 4 Rik Prohaska 2018-12-11 14:02:14 UTC
Created attachment 11448 [details]
simpler trywrlock test case
Comment 5 Rik Prohaska 2018-12-11 14:15:52 UTC
The futex waits on the wrphase futex occur because the 'USED' flag is being discarded by the 'trywrlock' function.  Since the 'USED' flag is discarded, the 'wrunlock' function will not wake the futex when the rwlock is released.  This causes other threads waiting for the rwlock to hang.

The 'wrlock' function does the following on an uncontended lock that is already in the write phase.  (1) the WL bit is atomically set, (2) if the WL was not previously set then this thread holds the exclusive write lock (3) the writers futex is set to 1, and finally (4) the cur_writer is set.  Note that the wrphase futext is not changed in this 'wrlock' code path.

Unfortunately, the 'trywrlock' function ALSO sets the wrphase futex = 1.  This is racy with the contended lock code paths in the 'rdlock' and 'wrlock' functions, and can cause the 'USED' bit to be lost.

The patch just removes the code in 'trywrlock' that sets the wrphase futex to match the 'wrlock' fast path logic.  This patch passes the included test cases.

diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c
index 5a73eba756..b423648ad6 100644
--- a/nptl/pthread_rwlock_trywrlock.c
+++ b/nptl/pthread_rwlock_trywrlock.c
@@ -47,7 +47,6 @@ __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock)
 	  r | PTHREAD_RWLOCK_WRPHASE | PTHREAD_RWLOCK_WRLOCKED))
 	{
 	  atomic_store_relaxed (&rwlock->__data.__writers_futex, 1);
-	  atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
 	  atomic_store_relaxed (&rwlock->__data.__cur_writer,
 	      THREAD_GETMEM (THREAD_SELF, tid));
 	  return 0;

I have also seen thread hangs when only using 'tryrdlock' (not 'trywrlock') in these test cases.  This indicates that there is also a problem in the 'tryrdlock' function.
Comment 6 Rik Prohaska 2018-12-11 14:23:38 UTC
Created attachment 11449 [details]
trywrlock patch
Comment 7 Torvald Riegel 2018-12-11 15:28:29 UTC
(In reply to richard prohaska from comment #5)
> The futex waits on the wrphase futex occur because the 'USED' flag is being
> discarded by the 'trywrlock' function.  Since the 'USED' flag is discarded,
> the 'wrunlock' function will not wake the futex when the rwlock is released.
> This causes other threads waiting for the rwlock to hang.

I agree with your analysis of what the root cause is, but the fix you propose isn't correct.

trywrlock needs to install a write-phase, but only in some cases.  Essentially, this is:
-	  atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
+	  /* If we started a write phase, we need to enable readers to
+	     wait.  If we did not, we must not change it because other threads
+	     may have set the PTHREAD_RWLOCK_FUTEX_USED in the meantime.  */
+	  if ((r & PTHREAD_RWLOCK_WRPHASE) == 0)
+	    atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);

I haven't looked at your tryrdlock case yet.
Comment 8 Rik Prohaska 2018-12-11 15:57:11 UTC
Created attachment 11450 [details]
test case for tryrdlock hangs
Comment 9 Rik Prohaska 2018-12-11 16:01:40 UTC
On my machine, bug23488.rd.c hangs after 1490 invocations.  Both threads waiting on wrphase futex.

Thread 3 (Thread 0x7f28d40ff700 (LWP 11018)):
#0  futex_abstimed_wait (private=0, abstime=0x0, expected=2, futex_word=0x55b0dfcdc068 <rw_lock+8>) at ../sysdeps/unix/sysv/linux/futex-internal.h:172
#1  __pthread_rwlock_wrlock_full (abstime=0x0, rwlock=0x55b0dfcdc060 <rw_lock>) at pthread_rwlock_common.c:803
#2  __GI___pthread_rwlock_wrlock (rwlock=0x55b0dfcdc060 <rw_lock>) at pthread_rwlock_wrlock.c:27
#3  0x000055b0dfcd9282 in runner (arg=0x0) at bug23844.rd.c:42
#4  0x00007f28d4b0df50 in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007f28d4a1edef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7f28d4900700 (LWP 11017)):
#0  futex_abstimed_wait (private=0, abstime=0x0, expected=3, futex_word=0x55b0dfcdc068 <rw_lock+8>) at ../sysdeps/unix/sysv/linux/futex-internal.h:172
#1  __pthread_rwlock_rdlock_full (abstime=0x0, rwlock=0x55b0dfcdc060 <rw_lock>) at pthread_rwlock_common.c:446
#2  __GI___pthread_rwlock_rdlock (rwlock=0x55b0dfcdc060 <rw_lock>) at pthread_rwlock_rdlock.c:27
#3  0x000055b0dfcd9300 in runner (arg=0x0) at bug23844.rd.c:52
#4  0x00007f28d4b0df50 in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007f28d4a1edef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f28d4901740 (LWP 11016)):
#0  0x00007f28d4b0f37d in __GI___pthread_timedjoin_ex (threadid=139813341628160, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
#1  0x000055b0dfcd952b in main (argc=1, argv=0x7ffe734366e8) at bug23844.rd.c:112

p *rwlock
$3 = {__data = {__readers = 10, __writers = 0, __wrphase_futex = 2, __writers_futex = 1, __pad3 = 0, __pad4 = 0, __cur_writer = 0
Comment 10 Rik Prohaska 2018-12-11 18:26:15 UTC
(In reply to Torvald Riegel from comment #7)
> (In reply to richard prohaska from comment #5)
> > The futex waits on the wrphase futex occur because the 'USED' flag is being
> > discarded by the 'trywrlock' function.  Since the 'USED' flag is discarded,
> > the 'wrunlock' function will not wake the futex when the rwlock is released.
> > This causes other threads waiting for the rwlock to hang.
> 
> I agree with your analysis of what the root cause is, but the fix you
> propose isn't correct.
> 
> trywrlock needs to install a write-phase, but only in some cases. 
> Essentially, this is:
> -	  atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
> +	  /* If we started a write phase, we need to enable readers to
> +	     wait.  If we did not, we must not change it because other threads
> +	     may have set the PTHREAD_RWLOCK_FUTEX_USED in the meantime.  */
> +	  if ((r & PTHREAD_RWLOCK_WRPHASE) == 0)
> +	    atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
> 
> I haven't looked at your tryrdlock case yet.

I agree with this patch since it matches the wrlock logic that changes the read phase to write phase.
Comment 11 Rik Prohaska 2018-12-11 21:44:37 UTC
Lets compare the logic in tryrdlock and rdlock WRT the transition from write phase to read phase.  In tryrdlock, the wrphase futex is just set to 0.  In rtdlock, the wrphase futext is exchanged with 0, and if the USED bit was set then the futex is awakened.  I wonder if this difference is responsible for the subsequent hangs.
Comment 12 Rik Prohaska 2018-12-12 15:49:09 UTC
Created attachment 11457 [details]
patch for tryrdlock function

added logic to tryrdlock that exists in rdlock to wake the wrphase futex when the  rwlock phase transistions from write phase to read phase.  this logic wakes up other threads stalled in the rdlock function. 

this patch can successfully run the tryrdlock test cases.
Comment 13 Torvald Riegel 2018-12-12 21:43:37 UTC
(In reply to richard prohaska from comment #12)
> Created attachment 11457 [details]
> patch for tryrdlock function
> 
> added logic to tryrdlock that exists in rdlock to wake the wrphase futex
> when the  rwlock phase transistions from write phase to read phase.  this
> logic wakes up other threads stalled in the rdlock function. 
> 
> this patch can successfully run the tryrdlock test cases.

This makes sense.  Thank you.  The trywrlock fix and this one (with changed comments) will get posted upstream.