This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Power7 optimization for strncpy and stpncpy.
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Mon, 07 Apr 2014 09:11:05 -0300
- Subject: Re: [PATCH] Power7 optimization for strncpy and stpncpy.
- Authentication-results: sourceware.org; auth=none
- References: <1395989036-19189-1-git-send-email-vidya at linux dot vnet dot ibm dot com>
Hi Vidya,
Patch looks good in general, just some comments below:
On 28-03-2014 03:43, vidya@linux.vnet.ibm.com wrote:
> From: Vidya Ranganathan <vidya@linux.vnet.ibm.com>
>
> The optimization is achieved by following techniques:
> > data alignment [gain from aligned memory access on read/write]
> > prefetch data [gain from cache misses by anticipating load]
> > POWER7 gains performance with loop unrolling/unwinding
> [gain by reduction of branch penalty].
>
> ChangeLog:
> 2014-03-27 Vidya Ranganathan <vidya@linux.vnet.ibm.com>
>
> * sysdeps/powerpc/powerpc64/power7/strncpy.S: New file: Optimization.
> * sysdeps/powerpc/powerpc64/multiarch/strncpy.c: New file:
> multiarch strncpy for PPC64.
> * sysdeps/powerpc/powerpc64/multiarch/strncpy-ppc64.c: New file
> * sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: New file
> * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c:
> (__libc_ifunc_impl_list): Likewise.
I think a better wording would be move it below 'sysdeps/powerpc/powerpc64/multiarch/Makefile'.
> diff --git a/sysdeps/powerpc/powerpc64/power7/strncpy.S b/sysdeps/powerpc/powerpc64/power7/strncpy.S
> new file mode 100644
> index 0000000..729401a
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/power7/strncpy.S
> @@ -0,0 +1,483 @@
> +/* Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <sysdep.h>
> +
> +/* Implements the functions
> +
> + char * [r3] strncpy (char *dst [r3], const char *src [r4], size_t n [r5])
> +
> + AND
> +
> + char * [r3] stpncpy (char *dst [r3], const char *src [r4], size_t n [r5])
> +
> + The algorithm is as follows:
> + > if src and dest are 8 byte aligned, perform double word copy
> + else
> + > if src and dest are 4 byte aligned, perform word copy
> + else
> + > copy byte by byte on unaligned addresses.
> +
> + The aligned comparison are made using cmpb instructions. */
> +
> +/* The focus on optimization for performance improvements are as follows:
> + 1. data alignment [gain from aligned memory access on read/write]
> + 2. prefetch data [gain from cache misses by anticipating load]
> + 3. POWER7 gains performance with loop unrolling/unwinding
> + [gain by reduction of branch penalty]. */
> +
> +#ifdef USE_AS_STPNCPY
> + #ifndef STPNCPY
> + # ifdef weak_alias
> + # define STPNCPY __stpncpy
> + weak_alias (__stpncpy, stpncpy)
> + # else
> + # define STPNCPY stpncpy
> + # endif
> + #endif
> + # define FUNC_NAME __stpncpy
> +#else
> + #undef strncpy
> +
> + #ifndef STRNCPY
> + #define STRNCPY strncpy
> + #endif
> + # define FUNC_NAME strncpy
> +#endif
The ifdefs are not aligned to GLIBC standard and it could be simplified by just:
#ifdef USE_AS_STPNCPY
# define FUNC_NAME __stpncpy
#else
# define define FUNC_NAME strncpy
#endif
And the remove the lines:
#else
libc_hidden_def (__stpncpy)
>
> +END(FUNC_NAME)
> +#ifndef USE_AS_STPNCPY
> +libc_hidden_builtin_def (strncpy)
> +#else
> +libc_hidden_def (__stpncpy)
> +#endif
See comment above