This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH v3 00/18] Improve generic string routines
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Wed, 10 Jan 2018 10:47:44 -0200
This is an update of Richard's previous patchset [1]. It provides generic
string implementations for newer ports, so that each port only has to
focus on a few specific routines to get a better overall improvement.
This is done by:
1. Parametrizing the internal routines (for instance, finding a zero
in a word) so each architecture can reimplement them without
having to reimplement the whole routine.
2. Vectorizing more string implementations (for instance strcpy
and strcmp).
3. Changing some implementations (for instance strnlen) to use other
routines that may already be optimized. This lets new ports focus
on providing optimized implementations of only a handful of symbols
(for instance memchr), whose improvements are then used by
a larger set of routines.
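To illustrate item 1, here is a minimal sketch of the "find a zero in a
word" idea. It is not the code from the patches: word_t, word_has_zero
and first_zero_in_buffer are hypothetical names chosen for this example,
a 64-bit word is assumed, and the caller is assumed to pass a buffer with
at least len readable bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word type for this sketch (assumed to be 64 bits).  */
typedef uint64_t word_t;

#define ONES  ((word_t) 0x0101010101010101ULL)
#define HIGHS ((word_t) 0x8080808080808080ULL)

/* Return nonzero iff at least one byte of X is zero.  An architecture
   with a dedicated instruction could replace only this helper.  */
static inline int
word_has_zero (word_t x)
{
  return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Return the index of the first zero byte in BUF, or LEN if there is
   none.  Assumes BUF has at least LEN readable bytes.  */
static size_t
first_zero_in_buffer (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  size_t i = 0;

  /* Whole words: memcpy sidesteps alignment and aliasing issues.  */
  while (len - i >= sizeof (word_t))
    {
      word_t w;
      memcpy (&w, buf + i, sizeof w);
      if (word_has_zero (w))
        break;
      i += sizeof w;
    }

  /* Tail bytes, or the word known to contain a zero: scan bytewise,
     which keeps the sketch independent of endianness.  */
  while (i < len && buf[i] != 0)
    i++;
  return i;
}
```

Only word_has_zero depends on how a zero byte is detected, so the outer
loop can stay generic while that helper is overridden per architecture.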
As for the rest of BZ #5806, I think we can handle it later. If the
performance of the generic implementations is close enough, I think it is
better to just remove the old assembly implementations.
I checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu by removing the arch-specific assembly
implementations and disabling multiarch (this covers both LE and BE
for 64 and 32 bits). I also checked the string routines on alpha, hppa,
and sh.
Changes since v2:
* Moved string-fz{a,b,i} to their own patch.
* Added an inline implementation of __builtin_c{l,t}z to avoid using
compiler-provided symbols.
* Added a new header, string-maskoff.h, to handle unaligned accesses
in some implementations.
* Fixed strcmp on LE machines.
* Added an unaligned strcpy variant for architectures that define
_STRING_ARCH_unaligned.
* Added SH string-fzb.h (which uses the cmp/str instruction to find
a zero in a word).
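As a rough illustration of the inline __builtin_c{l,t}z replacement
mentioned above, a bit count can be done with a binary search over the
word and no compiler builtins. This is a sketch only: sketch_ctz64 and
sketch_clz64 are hypothetical names, a 64-bit word is assumed, and the
code in the patches may be organized differently.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits of X by binary search.  X must be nonzero.  */
static inline unsigned int
sketch_ctz64 (uint64_t x)
{
  assert (x != 0);
  unsigned int n = 0;
  if ((x & 0xffffffffULL) == 0) { n += 32; x >>= 32; }
  if ((x & 0xffffULL) == 0)     { n += 16; x >>= 16; }
  if ((x & 0xffULL) == 0)       { n += 8;  x >>= 8; }
  if ((x & 0xfULL) == 0)        { n += 4;  x >>= 4; }
  if ((x & 0x3ULL) == 0)        { n += 2;  x >>= 2; }
  if ((x & 0x1ULL) == 0)        { n += 1; }
  return n;
}

/* Count leading zero bits of X by binary search.  X must be nonzero.  */
static inline unsigned int
sketch_clz64 (uint64_t x)
{
  assert (x != 0);
  unsigned int n = 0;
  if ((x >> 32) == 0) { n += 32; x <<= 32; }
  if ((x >> 48) == 0) { n += 16; x <<= 16; }
  if ((x >> 56) == 0) { n += 8;  x <<= 8; }
  if ((x >> 60) == 0) { n += 4;  x <<= 4; }
  if ((x >> 62) == 0) { n += 2;  x <<= 2; }
  if ((x >> 63) == 0) { n += 1; }
  return n;
}
```

On a little-endian machine, dividing the trailing-zero count of a
zero-byte mask by 8 gives the index of the first zero byte; a big-endian
machine would use the leading-zero count instead, provided the mask marks
zero bytes exactly.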
Changes since v1:
* Marked ChangeLog entries with [BZ #5806], as appropriate.
* Reorganized the headers, so that armv6t2 and power6 need override
as little as possible to use their (integer) zero detection insns.
* Hopefully fixed all of the coding style issues.
* Adjusted the memrchr algorithm as discussed.
* Replaced the #ifdef STRRCHR etc that are used by the multiarch files.
* Tested on i386, i686, x86_64 (verified this is unused), ppc64,
ppc64le --with-cpu=power8 (to use power6 in multiarch), armv7,
aarch64, alpha (qemu) and hppa (qemu).
PS: This patchset is aimed at 2.28.
[1] https://sourceware.org/ml/libc-alpha/2016-12/msg00830.html
Adhemerval Zanella (5):
Add string-maskoff.h generic header
Add string vectorized find and detection functions
string: Improve generic strnlen
string: Improve generic strcpy
sh: Add string-fzb.h
Richard Henderson (13):
Parameterize op_t from memcopy.h
Parameterize OP_T_THRES from memcopy.h
string: Improve generic strlen
string: Improve generic memchr
string: Improve generic memrchr
string: Improve generic strchr
string: Improve generic strchrnul
string: Improve generic strcmp
hppa: Add memcopy.h
hppa: Add string-fzb.h and string-fzi.h
alpha: Add string-fzb.h and string-fzi.h
arm: Add string-fza.h
powerpc: Add string-fza.h
config.h.in | 8 +
configure | 54 ++++++
configure.ac | 34 ++++
string/memchr.c | 157 ++++-----------
string/memcmp.c | 4 -
string/memrchr.c | 193 ++++--------------
string/strchr.c | 166 +++-------------
string/strchrnul.c | 146 +++-----------
string/strcmp.c | 97 +++++++++-
string/strcpy.c | 109 ++++++++++-
string/strlen.c | 83 ++------
string/strnlen.c | 139 +------------
string/test-strcpy.c | 24 ++-
sysdeps/alpha/string-fzb.h | 51 +++++
sysdeps/alpha/string-fzi.h | 113 +++++++++++
sysdeps/arm/armv6t2/string-fza.h | 69 +++++++
sysdeps/generic/memcopy.h | 11 +-
sysdeps/generic/string-extbyte.h | 35 ++++
sysdeps/generic/string-fza.h | 117 +++++++++++
sysdeps/generic/string-fzb.h | 49 +++++
sysdeps/generic/string-fzi.h | 215 +++++++++++++++++++++
sysdeps/generic/string-maskoff.h | 64 ++++++
sysdeps/generic/string-opthr.h | 25 +++
sysdeps/generic/string-optype.h | 31 +++
sysdeps/hppa/memcopy.h | 44 +++++
sysdeps/hppa/string-fzb.h | 69 +++++++
sysdeps/hppa/string-fzi.h | 135 +++++++++++++
sysdeps/i386/i686/multiarch/memrchr-c.c | 2 +
sysdeps/i386/i686/multiarch/strnlen-c.c | 19 +-
sysdeps/i386/memcopy.h | 3 -
sysdeps/i386/string-opthr.h | 25 +++
sysdeps/m68k/memcopy.h | 3 -
sysdeps/powerpc/power6/string-fza.h | 65 +++++++
sysdeps/powerpc/powerpc32/power4/memcopy.h | 5 -
.../powerpc32/power4/multiarch/strnlen-ppc32.c | 19 +-
sysdeps/powerpc/powerpc32/power6/string-fza.h | 1 +
sysdeps/powerpc/powerpc64/power6/string-fza.h | 1 +
sysdeps/s390/multiarch/memrchr-c.c | 2 +
sysdeps/s390/multiarch/strchr-c.c | 1 +
sysdeps/s390/multiarch/strnlen-c.c | 18 +-
sysdeps/sh/string-fzb.h | 53 +++++
sysdeps/tile/memcmp.c | 1 -
sysdeps/tile/memcopy.h | 7 -
sysdeps/tile/tilegx32/gmp-mparam.h | 30 +++
44 files changed, 1707 insertions(+), 790 deletions(-)
create mode 100644 sysdeps/alpha/string-fzb.h
create mode 100644 sysdeps/alpha/string-fzi.h
create mode 100644 sysdeps/arm/armv6t2/string-fza.h
create mode 100644 sysdeps/generic/string-extbyte.h
create mode 100644 sysdeps/generic/string-fza.h
create mode 100644 sysdeps/generic/string-fzb.h
create mode 100644 sysdeps/generic/string-fzi.h
create mode 100644 sysdeps/generic/string-maskoff.h
create mode 100644 sysdeps/generic/string-opthr.h
create mode 100644 sysdeps/generic/string-optype.h
create mode 100644 sysdeps/hppa/memcopy.h
create mode 100644 sysdeps/hppa/string-fzb.h
create mode 100644 sysdeps/hppa/string-fzi.h
create mode 100644 sysdeps/i386/string-opthr.h
create mode 100644 sysdeps/powerpc/power6/string-fza.h
create mode 100644 sysdeps/powerpc/powerpc32/power6/string-fza.h
create mode 100644 sysdeps/powerpc/powerpc64/power6/string-fza.h
create mode 100644 sysdeps/sh/string-fzb.h
create mode 100644 sysdeps/tile/tilegx32/gmp-mparam.h
--
2.7.4