This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Memory fencing problem in pthread cancellation
- From: Steven Munroe <munroesj at linux dot vnet dot ibm dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: libc-alpha <libc-alpha at sourceware dot org>
- Date: Mon, 14 Jan 2013 16:38:46 -0600
- Subject: Re: [PATCH] Memory fencing problem in pthread cancellation
- References: <50F46969.1000305@redhat.com>
- Reply-to: munroesj at us dot ibm dot com
On Mon, 2013-01-14 at 13:24 -0700, Jeff Law wrote:
> Let's consider these two hunks of code in unwind-forcedunwind.c
>
> void
> __attribute_noinline__
> pthread_cancel_init (void)
> {
> void *resume;
> void *personality;
> void *forcedunwind;
> void *getcfa;
> void *handle;
>
> if (__builtin_expect (libgcc_s_handle != NULL, 1))
> {
> /* Force gcc to reload all values. */
> asm volatile ("" ::: "memory");
> return;
> }
> [ ... ]
>
> PTR_MANGLE (resume);
> libgcc_s_resume = resume;
> PTR_MANGLE (personality);
> libgcc_s_personality = personality;
> PTR_MANGLE (forcedunwind);
> libgcc_s_forcedunwind = forcedunwind;
> PTR_MANGLE (getcfa);
> libgcc_s_getcfa = getcfa;
> /* Make sure libgcc_s_handle is written last. Otherwise,
> pthread_cancel_init might return early even when the pointer the
> caller is interested in is not initialized yet. */
> atomic_write_barrier ();
> libgcc_s_handle = handle;
> }
>
> void
> _Unwind_Resume (struct _Unwind_Exception *exc)
> {
> if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> else
> atomic_read_barrier ();
>
> void (*resume) (struct _Unwind_Exception *exc) = libgcc_s_resume;
> PTR_DEMANGLE (resume);
> resume (exc);
> }
>
>
> If we're in Unwind_Resume and libgcc_s_handle is null, we call
> pthread_cancel_init. pthread_cancel_init first checks if
> libgcc_s_handle is non-null. If so, then pthread_cancel_init exits
> early. Upon returning to Unwind_Resume we'll proceed to load & demangle
> libgcc_s_resume and call it.
>
> Note carefully in this case we never executed an atomic_read_barrier
> between the load of libgcc_s_handle and libgcc_s_resume loads. On a
> weakly ordered target such as power7, the processor might speculate the
> libgcc_s_resume load to a points prior to loading libgcc_s_handle.
>
> So if another thread happens to update libgcc_s_handle between the loads
> & checks of libgcc_s_handle in Unwind_Resume and pthread_cancel_init,
> then we can end up using stale data for libgcc_s_resume and blow up.
>
> This has been observed on a 32 processor power7 machine running 16
> instances of the attached testcase in parallel after a period of several
> hours.
>
> There's a couple ways to fix this. One could be to eliminate the early
> return from pthread_cancel_init. Unfortunately there's a call to
> pthread_cancel_init from pthread_cancel. So every one of those calls
> would suffer a performance penalty unless libgcc_s_handle as exported
> from unwind-forcedunwind.c
>
> Another is to add a call to atomic_read_barrier just before the early
> return from pthread_cancel_init. That's precisely what this patch does.
>
>
> While investigating, Carlos identified that the IA64 and ARM ports have
> their own unwind-forcedunwind.c implementations and that they were buggy
> in regards to fencing as well. Both fail to test libgcc_s_handle and do
> not have the necessary calls to atomic_read_barrier. This patch fixes
> those issues as well.
>
> While I have extensively tested the generic unwind-forcedunwind.c change
> backported to a glibc-2.12 base, I have not checked the ARM or IA64 bits
> in any way.
>
>
> plain text document attachment (patch)
> diff --git a/nptl/ChangeLog b/nptl/ChangeLog
> index 4aacc17..90d9408 100644
> --- a/nptl/ChangeLog
> +++ b/nptl/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-01-14 Jeff Law <law@redhat.com>
> +
> + * sysdeps/pthrad/unwind-forcedunwind.c (pthread_cancel_init): Add
> + missing call to atomic_read_barrier in early return path.
> +
> 2013-01-11 Carlos O'Donell <codonell@redhat.com>
>
> * allocatestack.c (allocate_stack): Add comment. Remove assert
> diff --git a/nptl/sysdeps/pthread/unwind-forcedunwind.c b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> index 9718606..6b72e3e 100644
> --- a/nptl/sysdeps/pthread/unwind-forcedunwind.c
> +++ b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> @@ -46,6 +46,10 @@ pthread_cancel_init (void)
> {
> /* Force gcc to reload all values. */
> asm volatile ("" ::: "memory");
> + /* Order reads so as to prevent speculation of loads
> + of libgcc_s_{resume,personality,forcedunwind,getcfa}
> + to points prior to loading libgcc_s_handle. */
> + atomic_read_barrier ();
> return;
> }
>
With the added atomic_read_barrier the "memory" fence is redundant as
the all barrier's should (and do) include the "memory" ref.