This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [COMMITTED PATCH] NaCl: Make thread exit wake pthread_join.
- From: Torvald Riegel <triegel at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: "GNU C. Library" <libc-alpha at sourceware dot org>
- Date: Fri, 29 May 2015 11:19:23 +0200
- Subject: Re: [COMMITTED PATCH] NaCl: Make thread exit wake pthread_join.
- Authentication-results: sourceware.org; auth=none
- References: <20150528224149 dot 504A72C3B00 at topped-with-meat dot com>
On Thu, 2015-05-28 at 15:41 -0700, Roland McGrath wrote:
> * sysdeps/nacl/exit-thread.h (__exit_thread): If not detached,
> set THREAD_SELF->tid to a magic value and futex-wake it.
> Pass its address to the thread_exit system call.
> * sysdeps/nacl/pthread-pids.h (__nacl_get_tid): Assert that TID's low
> bit is clear.
> * sysdeps/nacl/lowlevellock.h: New file.
> * sysdeps/nacl/lll_timedwait_tid.c: New file.
>
> diff --git a/sysdeps/nacl/exit-thread.h b/sysdeps/nacl/exit-thread.h
> index a08a5b1..c809405 100644
> --- a/sysdeps/nacl/exit-thread.h
> +++ b/sysdeps/nacl/exit-thread.h
> @@ -16,8 +16,11 @@
> License along with the GNU C Library; if not, see
> <http://www.gnu.org/licenses/>. */
>
> -#include <stddef.h>
> +#include <assert.h>
> +#include <atomic.h>
> +#include <lowlevellock.h>
> #include <nacl-interfaces.h>
> +#include <nptl/pthreadP.h>
>
> /* This causes the current thread to exit, without affecting other
> threads in the process if there are any. If there are no other
> @@ -26,7 +29,49 @@
> static inline void __attribute__ ((noreturn, always_inline, unused))
> __exit_thread (void)
> {
> - __nacl_irt_thread.thread_exit (NULL);
> + struct pthread *pd = THREAD_SELF;
> +
> + /* The generic logic for pthread_join and stack/descriptor reuse is
> + based on the Linux kernel feature that will clear and futex-wake
> + a designated address as a final part of thread teardown. Correct
> + synchronization relies on the fact that these happen only after
> + there is no possibility of user code touching or examining the
> + late thread's stack.
> +
> + The NaCl system interface implements half of this: it clears a
> + word after the thread's user stack is safely dead, but it does
> + not futex-wake the location. So, some shenanigans are required.
> + We change and futex-wake the location here, so as to wake up any
> + blocked pthread_join (i.e. lll_wait_tid) or pthread_timedjoin_np
> + (i.e. lll_timedwait_tid). However, that's before we have safely
> + vacated the stack. So instead of clearing the location, we set
> + it to a special magic value, NACL_EXITING_TID. This counts as a
> + "live thread" value for all the generic logic, but is recognized
> + specially in lll_wait_tid and lll_timedwait_tid (lowlevellock.h).
> + Once it has this value, lll_wait_tid will busy-wait for the
> + location to be cleared to zero by the NaCl system code. Only then
> + is the stack actually safe to reuse. */
> +
> + if (!IS_DETACHED (pd))
> + {
> + /* The magic value must not be one that could ever be a valid
> + TID value. See pthread-pids.h about the low bit. */
> + assert (NACL_EXITING_TID & 1);
> +
> + /* The magic value must not be one that has the "free" flag
> + (i.e. sign bit) set. If that bit is set, then the
> + descriptor could be reused for a new thread. */
> + assert (NACL_EXITING_TID > 0);
> +
> + atomic_store_relaxed (&pd->tid, NACL_EXITING_TID);
Maybe add a comment why the relaxed store is sufficient here (it's just
a flag for busy-waiting, it doesn't imply a happens-before for any other
state change). This may be different from the store of the final value,
for which I'm wondering whether it needs to be a release MO store. See
below for comments on the (acquire?) load side of this.
> + lll_futex_wake (&pd->tid, 1, LLL_PRIVATE);
> + }
> +
> + /* This clears PD->tid some time after the thread stack can never
> + be touched again. Unfortunately, it does not also do a
> + futex-wake at that time (as Linux does via CLONE_CHILD_CLEARTID
> + and set_tid_address). So lll_wait_tid does some busy-waiting. */
> + __nacl_irt_thread.thread_exit (&pd->tid);
>
> /* That never returns unless something is severely and unrecoverably wrong.
> If it ever does, try to make sure we crash. */
> diff --git a/sysdeps/nacl/lll_timedwait_tid.c b/sysdeps/nacl/lll_timedwait_tid.c
> new file mode 100644
> index 0000000..ecaf0b1
> --- /dev/null
> +++ b/sysdeps/nacl/lll_timedwait_tid.c
> @@ -0,0 +1,61 @@
> +/* Timed waiting for thread death. NaCl version.
> + Copyright (C) 2015 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <assert.h>
> +#include <atomic.h>
> +#include <errno.h>
> +#include <lowlevellock.h>
> +#include <sys/time.h>
> +
> +int
> +__lll_timedwait_tid (int *tidp, const struct timespec *abstime)
> +{
> + /* Reject invalid timeouts. */
> + if (__glibc_unlikely (abstime->tv_nsec < 0)
> + || __glibc_unlikely (abstime->tv_nsec >= 1000000000))
> + return EINVAL;
> +
> + /* Repeat until thread terminated. */
> + int tid;
> + while ((tid = atomic_load_relaxed (tidp)) != 0)
See below.
> + {
> + /* See exit-thread.h for details. */
> + if (tid == NACL_EXITING_TID)
> + /* The thread should now be in the process of exiting, so it will
> + finish quick enough that the timeout doesn't matter. If any
> + thread ever stays in this state for long, there is something
> + catastrophically wrong. */
> + BUSY_WAIT_NOP;
> + else
> + {
> + assert (tid > 0);
> +
> + /* If *FUTEX == TID, wait until woken or timeout. */
> + int err = __nacl_irt_futex.futex_wait_abs ((volatile int *) tidp,
> + tid, abstime);
> + if (err != 0)
> + {
> + if (__glibc_likely (err == ETIMEDOUT))
> + return err;
> + assert (err == EAGAIN);
> + }
> + }
> + }
> +
> + return 0;
> +}
> diff --git a/sysdeps/nacl/lowlevellock.h b/sysdeps/nacl/lowlevellock.h
> new file mode 100644
> index 0000000..0b85d8d
> --- /dev/null
> +++ b/sysdeps/nacl/lowlevellock.h
> @@ -0,0 +1,45 @@
> +/* Low-level lock implementation. NaCl version.
> + Copyright (C) 2015 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library. If not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef _LOWLEVELLOCK_H
> +
> +/* Everything except the exit handling is the same as the generic code. */
> +# include <sysdeps/nptl/lowlevellock.h>
> +
> +# ifndef BUSY_WAIT_NOP
> +# define BUSY_WAIT_NOP __sync_synchronize ()
> +# endif
Are you sure that adding a barrier is the right thing to do here
(especially one of the __sync variety that we want to get rid off
eventually)? Technically, you don't need the barrier. Other archs use
"pause" code sequences.
> +/* See exit-thread.h for details. */
> +# define NACL_EXITING_TID 1
> +
> +# undef lll_wait_tid
> +# define lll_wait_tid(tid) \
> + do { \
> + __typeof (tid) __tid; \
> + volatile __typeof (tid) *__tidp = &(tid); \
> + while ((__tid = atomic_load_relaxed (__tidp)) != 0) \
I'm aware we use a relaxed load in the generic version as well but I'm
wondering why that is actually sufficient and we don't need an acquire
load. pthread_join does seem to access values provided by the finished
thread (eg, load pd->result).
I vaguely remember thinking about that issue when we changed the
documentation of lll_wait_tid -- but I can't find any notes or emails
about it.
Do you have a reasoning why it's sufficient, or is this something we
need to dig up / come up with?
> + { \
> + if (__tid == NACL_EXITING_TID) \
> + BUSY_WAIT_NOP; \
> + else \
> + lll_futex_wait (__tidp, __tid, LLL_PRIVATE); \
> + } \
> + } while (0)
> +
> +#endif /* lowlevellock.h */