This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] nptl: Fix pthread_rwlock_try*lock stalls [BZ #23844]


* Carlos O'Donell:

> +/* For a full analysis see comment:
> +   https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14

In the past, we had issues with bug trackers going away and such
references becoming stale.

> +
> +   The stall is caused by pthread_rwlock_tryrdlock failing to check
> +   that PTHREAD_RWLOCK_FUTEX_USED is set in the __wrphase_futex futex
> +   and then waking the futex.
> +
> +   The fix for bug 23844 ensures that waiters on __wrphase_futex are
> +   correctly woken.  Before the fix the test stalls as readers can
> +   wait forever on __wrphase_futex.  */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <pthread.h>
> +#include <support/xthread.h>
> +#include <errno.h>
> +
> +/* We need only one lock to reproduce the issue. We will need multiple
> +   threads to get the exact case where we have a read, try, and unlock
> +   all interleaving to produce the case where the readers are waiting
> +   and the try fails to wake them.  */
> +pthread_rwlock_t onelock;
> +
> +/* The number of threads is arbitrary but empirically chosen to have
> +   enough threads that we see the condition where waiting readers are
> +   not woken by a successful tryrdlock.  */
> +#define NTHREADS 32
> +
> +_Atomic int do_exit;
> +
> +void *
> +run_loop (void *arg)
> +{
> +  int i = 0, ret, rw;
> +  while (!do_exit)
> +    {
> +      /* Arbitrarily choose if we are the writer or reader.  Choose a
> +	 high enough ratio of readers to writers to make it likely
> +	 that readers block (and eventually are susceptable to
> +	 stalling).  */
> +      rw = i % 8;
> +      /* If we are a writer, take the write lock, and then unlock.
> +	 If we are a reader, try the lock, then lock, then unlock.  */
> +      if (rw)

rw != 0 (or (i % 8) != 0).  Note sure if the variable actually increases
clarity here.

> +  /* Run for some amount of time.  Empirically speaking exercising
> +     the stall via pthread_rwlock_tryrdlock is much harder, and on
> +     a 3.5GHz 4 core x86_64 VM system it takes somewhere around
> +     20-200s to stall, approaching 100% stall past 200s.  We can't
> +     wait that long for a regression test so we just test for 20s,
> +     and expect the stall to happen with a 5-10% chance (enough for
> +     developers to see).  */
> +  sleep (20);

Use nanosleep or usleep to avoid signals.

> +  /* Then exit.  */
> +  printf ("INFO: Exiting...");
> +  fflush (stdout);

Why no trailing \n?  It will just result in garbled output if there is a
problem.

> +  do_exit = 1;
> +  /* If any readers stalled then we will timeout waiting for them.  */
> +  for (i = 0; i < NTHREADS; i++)
> +    xpthread_join (tids[i]);
> +  printf ("done.\n");
> +  xpthread_rwlock_destroy (&onelock);
> +  printf ("PASS: No pthread_rwlock_tryrdlock stalls detected.\n");
> +  return 0;
> +}
> +
> +#define TIMEOUT (30)

Parentheses are unnecessary.

Most of these comments also apply to ther other test.

I have not reviewed the code.

Thanks,
Florian


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]