This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH v3 00/18] Improve generic string routines
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 10 Jan 2018 23:30:20 +0100
- Subject: Re: [PATCH v3 00/18] Improve generic string routines
- References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
On Wed, Jan 10, 2018 at 10:47:44AM -0200, Adhemerval Zanella wrote:
> It is an update of previous Richard's patchset [1] to provide generic
> string implementation for newer ports and make them only focus on
> just specific routines to get a better overall improvement.
> It is done by:
>
This is basically a poorly reinvented version of my patch for generic
string routines. By dusting that patch off, one could get a lot better
performance than this.
This contains a lot of glaring issues; the main ones are:
strnlen - quite inefficient, as testing against zero is faster than
testing against a generic character c. It could be trivially improved by
inlining memchr.
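To illustrate why the zero test is cheaper, here is a minimal sketch of
the usual word-at-a-time byte tests, assuming 64-bit words. The helper
names has_zero and has_byte are hypothetical and are not taken from the
patch set; the point is only that the generic-character test needs an
extra broadcast and XOR on top of the zero test.

```c
#include <stdint.h>

#define ONES  ((uint64_t) 0x0101010101010101ULL)
#define HIGHS ((uint64_t) 0x8080808080808080ULL)

/* Nonzero if any byte of x is zero (hypothetical helper).  */
static inline int
has_zero (uint64_t x)
{
  return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Nonzero if any byte of x equals c (hypothetical helper).
   Broadcasting c and XORing turns matching bytes into zero bytes,
   so this is the zero test plus extra work.  */
static inline int
has_byte (uint64_t x, unsigned char c)
{
  return has_zero (x ^ (ONES * c));
}
```

When c is known to be zero at compile time, as it would be with memchr
inlined into strnlen, the broadcast and XOR can be dropped entirely.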
strcmp - this
+ /* Handle the unaligned bytes of p1 first. */
+ n = -(uintptr_t)p1 % sizeof(op_t);
+ for (i = 0; i < n; ++i)
is the worst way to write memcmp, as you create a loop whose trip count
depends on alignment, which makes its branch unpredictable. Also, in
strcmp, trying to be clever usually causes a performance regression
because it is likely that the strings differ in their first bytes.
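As one possible reading of this point, here is a minimal sketch that
checks a fixed-size prefix byte by byte before doing anything else, so
the common early mismatch is handled by a loop whose trip count does not
depend on pointer alignment. The name sketch_strcmp and the prefix
length of 4 are assumptions for illustration, and the plain byte loop
for the remainder only stands in for whatever word-at-a-time code a real
implementation would use.

```c
#include <stddef.h>

/* Hypothetical sketch, not the code from the patch set.  */
int
sketch_strcmp (const char *s1, const char *s2)
{
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;

  /* Fixed trip count, independent of alignment.  Stops at the first
     difference or at the terminating NUL, so it never reads past the
     end of either string.  */
  for (size_t i = 0; i < 4; i++)
    {
      if (p1[i] != p2[i] || p1[i] == '\0')
        return p1[i] - p2[i];
    }
  p1 += 4;
  p2 += 4;

  /* Remainder: a plain byte loop as a placeholder for the
     word-at-a-time part.  */
  while (*p1 == *p2 && *p1 != '\0')
    {
      p1++;
      p2++;
    }
  return *p1 - *p2;
}
```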
strcpy - this was previously changed to strlen+memcpy because, with
platform-specific strlen/memcpy, that is faster. One should at least
check whether some platforms are affected. I wouldn't be surprised if one
could get a faster function just by adding some unrolling to memcpy.
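For reference, the strlen+memcpy form mentioned above can be sketched as
follows. The name sketch_strcpy is hypothetical; the idea is simply that
both calls can resolve to the platform-specific optimized routines.

```c
#include <string.h>

/* Hypothetical sketch of strcpy built from strlen and memcpy.
   The +1 copies the terminating NUL as well.  */
char *
sketch_strcpy (char *dest, const char *src)
{
  return memcpy (dest, src, strlen (src) + 1);
}
```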