This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PING][PATCH v3] PowerPC: libc single-thread lock optimization
- From: "Tulio Magno Quites Machado Filho" <tuliom at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Cc: adhemerval dot zanella at linaro dot org, munroesj at linux dot vnet dot ibm dot com
- Date: Mon, 28 Mar 2016 14:36:14 -0300
- Subject: Re: [PING][PATCH v3] PowerPC: libc single-thread lock optimization
- Authentication-results: sourceware.org; auth=none
- References: <540080DF dot 6030205 at linux dot vnet dot ibm dot com> <1457721337-30897-1-git-send-email-tuliom at linux dot vnet dot ibm dot com>
Ping!
Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> writes:
> I continued the work started by Adhemerval. The discussion around version 2
> of this patch is available at http://patchwork.sourceware.org/patch/2516/
>
> Nowadays, we already require GCC 4.7, so we can safely rely on compiler
> built-ins for most of our atomic primitives.
>
> Changes since v2:
> - Updated ChangeLog and commit message.
> - Replaced the following atomic primitives by compiler built-ins:
> exchange*, and* and or*.
>
> ---8<---
>
> Add relaxed atomics as a lock optimization. Addressing the concerns
> raised in previous discussions, the primitives are still signal-safe
> (although not thread-safe), so if future implementations relying on
> this code (e.g. malloc) are changed to be async-safe, the powerpc
> atomics won't need to be adjusted.
>
> For catomic_and and catomic_or I followed the definition in 'include/atomic.h'
> (which powerpc is currently using) and implemented the atomics with acquire
> semantics. The new implementation is based on compiler built-ins.
>
> On synthetic benchmarks it shows an improvement of 5-10% for malloc
> calls and a performance increase of 7-8% in 483.xalancbmk from
> speccpu2006 (numbers from a POWER8 machine).
>
> Checked on powerpc32, powerpc64 and powerpc64le.
>
> 2016-03-11 Adhemerval Zanella Netto <azanella@linux.vnet.ibm.com>
> Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
>
> * malloc/malloc.c (malloc_consolidate): Replace 0 by NULL in
> order to match the type of p when calling atomic_exchange_acq().
> * sysdeps/powerpc/atomic-machine.h
> (__arch_atomic_exchange_32_acq): Removed.
> (__arch_atomic_exchange_32_rel): Likewise.
> (__arch_compare_and_exchange_val_32_relaxed): New macro: atomic compare
> and exchange with relaxed semantics.
> (atomic_compare_and_exchange_val_relaxed): Likewise.
> (__atomic_is_single_thread): New macro: check if the program is
> single-threaded.
> (atomic_compare_and_exchange_val_acq): Add relaxed operation for
> single-thread.
> (atomic_compare_and_exchange_val_rel): Likewise.
> (atomic_exchange_acq): Likewise.
> (atomic_exchange_rel): Likewise.
> (catomic_and): Add relaxed operation and use compiler built-ins.
> (catomic_or): Likewise.
> (atomic_exchange_acq): Modify to use compiler built-ins.
> (atomic_exchange_rel): Likewise.
> * sysdeps/powerpc/powerpc32/atomic-machine.h
> (__arch_compare_and_exchange_val_64_relaxed): New macro: add empty
> implementation.
> (__arch_atomic_exchange_64_relaxed): Likewise.
> * sysdeps/powerpc/powerpc64/atomic-machine.h
> (__arch_compare_and_exchange_val_64_relaxed): New macro: atomic compare
> and exchange with relaxed semantics.
> (__arch_atomic_exchange_64_acq): Removed.
> (__arch_atomic_exchange_64_rel): Removed.
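
For readers following the thread, the idea in the quoted commit message can be
sketched roughly as below. This is only an illustration built on the GCC
__atomic built-ins, not the code from the patch: the `single_thread` global and
the function names `cas_val_acq`, `exchange_acq` and `or_acq` are hypothetical
stand-ins for the macros named in the ChangeLog. Each operation stays a real
atomic read-modify-write (so a signal handler cannot observe a torn update),
but the acquire barrier is skipped when no other thread can exist.

```c
#include <stdbool.h>

/* Hypothetical flag standing in for the single-thread check; a plain
   global is used here only to keep the sketch self-contained.  */
static bool single_thread = true;

/* Compare-and-exchange that returns the value previously stored in *MEM.
   Relaxed ordering is used when single-threaded, acquire otherwise.  */
static int
cas_val_acq (int *mem, int newval, int oldval)
{
  int expected = oldval;
  if (single_thread)
    __atomic_compare_exchange_n (mem, &expected, newval, false,
                                 __ATOMIC_RELAXED, __ATOMIC_RELAXED);
  else
    __atomic_compare_exchange_n (mem, &expected, newval, false,
                                 __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  /* On failure the built-in writes the current value into EXPECTED;
     on success EXPECTED still holds OLDVAL, which was the old value.  */
  return expected;
}

/* Exchange that returns the previous value, with the same fast path.  */
static int
exchange_acq (int *mem, int value)
{
  if (single_thread)
    return __atomic_exchange_n (mem, value, __ATOMIC_RELAXED);
  return __atomic_exchange_n (mem, value, __ATOMIC_ACQUIRE);
}

/* Atomic OR in the spirit of catomic_or: acquire semantics in the
   multi-threaded case, relaxed in the single-threaded case.  */
static void
or_acq (int *mem, int mask)
{
  if (single_thread)
    __atomic_fetch_or (mem, mask, __ATOMIC_RELAXED);
  else
    __atomic_fetch_or (mem, mask, __ATOMIC_ACQUIRE);
}
```

The branches pass constant memory orders to the built-ins instead of a
run-time value, so the compiler can pick the cheaper instruction sequence on
the relaxed path.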
--
Tulio Magno