Bug 5806 - wrong comment in strlen() and other functions
Summary: wrong comment in strlen() and other functions
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: string
Version: unspecified
Importance: P3 minor
Target Milestone: 2.38
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 5807
Blocks:
Reported: 2008-02-29 00:37 UTC by Egmont Koblinger
Modified: 2023-07-30 12:56 UTC
CC List: 5 users

See Also:
Host:
Target:
Build:
Last reconfirmed: 2012-05-06 00:00:00
fweimer: security-


Description Egmont Koblinger 2008-02-29 00:37:02 UTC
strlen() and similar functions use some cool magic to determine whether any
byte of an integer is zero. This magic is explained in a long comment in all
these source files. Part of this comment is:

         1) Is this safe?  Will it catch all the zero bytes?
         Suppose there is a byte with all zeros.  Any carry bits
         propagating from its left will fall into the hole at its
         least significant bit and stop. [...]

"propagating from its left" is wrong, it should be "propagating from its right".

In glibc-2.7 there are 14 files that contain this typo. Luckily the wording and
the formatting of the paragraph are exactly the same everywhere. I am not sending
a patch because it could easily become outdated or miss newly added files.
Instead, please use a combination of grep and sed, or similar tools, to fix these.
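
The carry behaviour discussed above can be illustrated with a minimal sketch of an add-a-magic-constant test of the kind the comment describes. It assumes a 32-bit word and the constant 0x7efefeff, whose zero bits ("holes") sit at bits 8, 16, 24 and 31. The function name is hypothetical, and this is an illustration rather than a copy of the glibc sources.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if W may contain a zero byte.
   Assumes a 32-bit word; MAGIC has holes at bits 8, 16, 24 and 31.  */
static uint32_t
maybe_has_zero_add (uint32_t w)
{
  const uint32_t magic = 0x7efefeffu;

  /* Adding MAGIC makes every nonzero byte (plus any carry coming in from
     the less significant byte on its right) overflow into the hole just
     above it.  A zero byte absorbs the incoming carry and produces none,
     so the hole on its left stays unchanged.  */
  uint32_t sum = w + magic;

  /* Keep only the hole bits that the addition left unchanged.  */
  return (sum ^ ~w) & ~magic;
}
```

With this sketch, 0x01010101 (no zero byte) gives 0, while 0x01010001 (zero in the second-lowest byte) gives 0x81010000: the carry arrives from the right, lands in the hole at bit 8, and stops there. A word such as 0x80010101 also gives a nonzero result even though it has no zero byte, which is the kind of false positive mentioned later in this thread.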
Comment 1 Carlos O'Donell 2008-03-02 03:33:30 UTC
The comment doesn't match what the code is doing. The comments should all be
removed when BZ #5807 is resolved.
Comment 2 Egmont Koblinger 2008-03-02 08:50:59 UTC
Well, the comment does match what the code is doing now: not in strlen.c, but in
all the other similar files where this comment appears (strchr.c, to mention
just one).

There is no need to fix the comments, of course, provided that you change the
implementation in all fourteen files, not just in strlen.c, and throw out
the old version rather than just putting it inside #if 0. If you keep the old
version around, I recommend fixing the comments, because it can be done very
easily and might help anyone trying to understand the code.

The algorithm you linked from bug #5807 is definitely nicer than the current
one: it is easier to understand and matches exactly, with no false positives.
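
For readers without access to bug #5807, one well-known exact zero-byte test is sketched below. It is not necessarily the algorithm linked there; the function name and the 32-bit word size are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if and only if some byte of W is zero.
   Subtracting 1 from each byte sets its high bit when the byte was zero
   (or when it was >= 0x81); masking with ~W discards the latter case.  */
static int
has_zero_exact (uint32_t w)
{
  const uint32_t lo = 0x01010101u;
  const uint32_t hi = 0x80808080u;
  return ((w - lo) & ~w & hi) != 0;
}
```

As a yes/no answer this has no false positives: words such as 0x80010101 or 0xffffffff are reported as having no zero byte, while 0x01010001 is reported as having one.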
Comment 3 Carlos O'Donell 2008-03-02 15:11:37 UTC
Agreed, all fourteen files should be changed to use a sensible algorithm, and
the old comments should be removed. This issue still depends on resolving #5807
first.
Comment 4 Andreas Jaeger 2012-05-06 18:52:44 UTC
This is still in current git.
Comment 5 Roland McGrath 2013-06-11 18:41:11 UTC
Code comments are not an issue with the manual.
Comment 6 Sourceware Commits 2018-01-03 15:35:26 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, azanella/generic-strings has been created
        at  72aa7602bb7fc7e54aaf3f1f49a18122676e138b (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=72aa7602bb7fc7e54aaf3f1f49a18122676e138b

commit 72aa7602bb7fc7e54aaf3f1f49a18122676e138b
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 21 17:14:16 2017 -0300

    sh: Add string-fzb.h and string-fzi.h
    
    Use the SH cmp/str on has_{zero,eq,zero_eq} and avoid using the builtin
    count leading/trailing zeros, which on SH calls a libgcc function
    (expanding it to direct byte testing is better than a function call).
    
    	* sysdeps/sh/string-fzb.h: New file.
    	* sysdeps/sh/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0ce10e49871b2d759f6115bb1355883c31bd5959

commit 0ce10e49871b2d759f6115bb1355883c31bd5959
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:26:18 2017 -0200

    powerpc: Add string-fza.h
    
    While ppc has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the Power 6 CMPB insn for testing of zeros.
    
    	* sysdeps/powerpc/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc32/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc64/power6/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=55ddf7c70da6a8ae23507c41a34d00a127bc1308

commit 55ddf7c70da6a8ae23507c41a34d00a127bc1308
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:24:23 2017 -0200

    arm: Add string-fza.h
    
    While arm has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the UQSUB8 insn for testing of zeros.
    
    	* sysdeps/arm/armv6t2/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bef385336624e51985182dc9401c702fdfc73817

commit bef385336624e51985182dc9401c702fdfc73817
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:23:27 2017 -0200

    alpha: Add string-fzb.h and string-fzi.h
    
    While alpha has the more important string functions in assembly,
    there are still a few for which the generic routines are used.
    
    Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a
    simplified expansion of __builtin_ctz when the insn isn't available.
    
    	* sysdeps/alpha/string-fza.h: New file.
    	* sysdeps/alpha/string-fzb.h: New file.
    	* sysdeps/alpha/string-fzi.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1da9832f154b026451369da54a7860266e691c95

commit 1da9832f154b026451369da54a7860266e691c95
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:39 2017 -0200

    hppa: Add string-fzb.h and string-fzi.h
    
    Use UXOR,SBZ to test for a zero byte within a word.  While we can
    get semi-decent code out of asm-goto, we would do slightly better
    with a compiler builtin.
    
    For index_zero et al, sequential testing of bytes is less expensive than
    any tricks that involve a count-leading-zeros insn that we don't have.
    
    	* sysdeps/hppa/string-fza.h: New file.
    	* sysdeps/hppa/string-fzb.h: New file.
    	* sysdeps/hppa/string-fzi.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6396dd6f4ead1ae41aa2e07103f0a68001f3e208

commit 6396dd6f4ead1ae41aa2e07103f0a68001f3e208
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:02 2017 -0200

    hppa: Add memcopy.h
    
    GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
    double-word shift unless (1) the subtract is in the same basic block
    and (2) the result of the subtract is used exactly once.  Neither
    condition is true for any use of MERGE.
    
    By forcing the use of a double-word shift, we not only reduce
    contention on SAR, but also allow the setting of SAR to be hoisted
    outside of a loop.
    
    	* sysdeps/hppa/memcopy.h: New file.
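
The expression (x >> c | y << (32 - c)) quoted in the commit message above can be sketched in plain C as follows. The function name is hypothetical, the little-endian byte order is an assumption, and the shift count must satisfy 0 < c < 32, because shifting a 32-bit value by 32 is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper: build one unaligned 32-bit word from two adjacent
   aligned words X (lower address) and Y (higher address), assuming
   little-endian byte order.  C is the misalignment in bits and must be
   strictly between 0 and 32.  */
static uint32_t
merge_le (uint32_t x, uint32_t y, unsigned c)
{
  return (x >> c) | (y << (32 - c));
}
```

For example, with x = 0x44332211, y = 0x88776655 and c = 8, the result is 0x55443322: the three upper bytes of x followed by the lowest byte of y.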

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=903270e20262487e0dbf1a12d36db172787ef2da

commit 903270e20262487e0dbf1a12d36db172787ef2da
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Wed Mar 8 16:56:17 2017 +0100

    Improve generic strcpy
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned access plus a merge operation
    to get a correct word-by-word comparison).
    
    	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
            string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d8c455b1a050a8d9806b568b1d064fa46e12f634

commit d8c455b1a050a8d9806b568b1d064fa46e12f634
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:21:26 2017 -0200

    Improve generic strcmp
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned access plus a merge operation
    to get a correct word-by-word comparison).
    
    	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
    	string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=2dba7f25afce8fe2f38ce333bfd6506105b89633

commit 2dba7f25afce8fe2f38ce333bfd6506105b89633
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 16 16:21:03 2017 -0200

    Improve generic strnlen
    
    With an optimized memchr, the new strnlen implementation basically calls
    memchr and adjusts the result pointer value.
    
    	[BZ #5806]
    	* string/strnlen.c: Rewrite in terms of memchr.
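
The idea of "call memchr and adjust the result pointer" can be sketched as below. This is a minimal illustration with a hypothetical function name, not the code from string/strnlen.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper illustrating strnlen in terms of memchr.  */
static size_t
strnlen_via_memchr (const char *s, size_t maxlen)
{
  /* memchr returns a pointer to the first NUL within the first MAXLEN
     bytes, or NULL if there is none.  */
  const char *end = memchr (s, '\0', maxlen);

  /* Convert the pointer into a length; without a NUL the answer is
     MAXLEN itself.  */
  return end != NULL ? (size_t) (end - s) : maxlen;
}
```

Any speed-up in memchr then carries over to strnlen without a second word-at-a-time loop to maintain.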

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=445306b7ebb5b544509313edde654315fc34c8a3

commit 445306b7ebb5b544509313edde654315fc34c8a3
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:20:35 2017 -0200

    Improve generic memrchr
    
    The new algorithm has the following key differences:
    
      - Use string-fz{b,i} functions.
    
    	[BZ #5806]
    	* string/memrchr.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=231a8739e99c60c6ba0804b22e18093c402d0b03

commit 231a8739e99c60c6ba0804b22e18093c402d0b03
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:40 2017 -0200

    Improve generic strchrnul
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} functions.
    
    	[BZ #5806]
    	* string/strchrnul.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7fab7d55da5cbcccc034188a4bec35b8a0522402

commit 7fab7d55da5cbcccc034188a4bec35b8a0522402
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:12 2017 -0200

    Improve generic memchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} and string-opthr functions.
    
    	[BZ #5806]
    	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c1ed1c8d1b25ec9cc0d788070682e5db2827c147

commit c1ed1c8d1b25ec9cc0d788070682e5db2827c147
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:48 2017 -0200

    Improve generic strchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64 and powerpc.
    
      - Use string-fz{b,i} and string-extbyte function.
    
    	[BZ #5806]
    	* string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7f989a408cd9197d2b69fbf81a1408200d7efc40

commit 7f989a408cd9197d2b69fbf81a1408200d7efc40
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:24 2017 -0200

    Improve generic strlen
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for powerpc, sparc, and SH.
    
      - Extract has_zero and index_first_zero tests into headers that
        can be tailored for the architecture.
    
    	[BZ #5806]
        	* sysdeps/generic/string-fza.h: New file.
        	* sysdeps/generic/string-fzb.h: New file.
        	* sysdeps/generic/string-fzi.h: New file.
        	* sysdeps/generic/string-extbyte.h: New file.
        	* string/strlen.c: Use them.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d16d6a7e1fe19a5f252f721a3229c0daf8979a31

commit d16d6a7e1fe19a5f252f721a3229c0daf8979a31
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 23 18:45:54 2017 -0300

    Add string-maskoff.h generic header
    
    Macros to operate on unaligned access for string operations, such as
    creating a bit mask to remove unwanted bytes from an unaligned
    read, and repeating a byte within a word.
    
    	* sysdeps/generic/string-maskoff.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ad1d81b8ba31514579912a7ff52c5405c21b9726

commit ad1d81b8ba31514579912a7ff52c5405c21b9726
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:15:27 2017 -0200

    Parameterize OP_T_THRES from memcopy.h
    
    Basically it moves OP_T_THRES out of memcopy.h to its own header
    and adjusts each architecture that redefines it.
    
    	* sysdeps/generic/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/generic/string-opthr.h: ... here; new file.
    	* sysdeps/i386/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/i386/string-opthr.h: ... here; new file.
    	* sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
    	* sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Remove.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6b9ecce2f78a2bebc2c1c21c0b21e32ccdad8862

commit 6b9ecce2f78a2bebc2c1c21c0b21e32ccdad8862
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:14:09 2017 -0200

    Parameterize op_t from memcopy.h
    
    Basically moves the op_t definition out to a specific header.
    
    	* sysdeps/generic/string-optype.h: New file.
    	* sysdeps/generic/memcopy.h: Include it.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=85893852d17ce5742f128308e55ead1f10e2beb0

commit 85893852d17ce5742f128308e55ead1f10e2beb0
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Thu Mar 9 15:00:27 2017 +0100

    string: Remove __memchr definition
    
    Since memchr is a C90 function (so there are no external-linkage
    namespace issues), and not used in any macros defined in installed headers
    (so no block-scope namespace issues) it is safe to just remove its
    internal definition and just set all the arch specific implementation
    to just define memchr instead.
    
    Checked on x86_64-linux-gnu and with build-many-glibc.py.
    
    	* string/memchr.c (__memchr): Redefine to memchr.
    	* sysdeps/aarch64/memchr.S (__memchr): Likewise.
    	* sysdeps/aarch64/rawmemchr.S (__memchr): Likewise.
    	* sysdeps/i386/i686/multiarch/memchr.S (__memchr): Likewise.
    	* sysdeps/i386/memchr.S (__memchr): Likewise.
    	* sysdeps/ia64/memchr.S (__memchr): Likewise.
    	* sysdeps/m68k/memchr.S (__memchr): Likewise.
    	* sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
    	 (__memchr): Likewise.
    	* sysdeps/powerpc/powerpc32/power7/memchr.S (__memchr): Likewise.
    	* sysdeps/powerpc/powerpc64/power7/memchr.S (__memchr): Likewise.
    	* sysdeps/sparc/sparc32/memchr.S (__memchr): Likewise.
    	* sysdeps/sparc/sparc64/memchr.S (__memchr): Likewise.
    	* sysdeps/tile/tilegx/memchr.c (__memchr): Likewise.
    	* sysdeps/tile/tilepro/memchr.c (__memchr): Likewise.
    	* sysdeps/x86_64/memchr.S (__memchr): Likewise.

-----------------------------------------------------------------------
Comment 7 Sourceware Commits 2018-01-09 20:44:33 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, azanella/generic-strings has been created
        at  e24962bc9b04c0d43f02f036be079552e26ddc6a (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e24962bc9b04c0d43f02f036be079552e26ddc6a

commit e24962bc9b04c0d43f02f036be079552e26ddc6a
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 21 17:14:16 2017 -0300

    sh: Add string-fzb.h and string-fzi.h
    
    Use the SH cmp/str on has_{zero,eq,zero_eq}.
    
    Checked on sh4-linux-gnu.
    
    	Adhemerval Zanella <adhemerval.zanella@linaro.org>
    
    	* sysdeps/sh/string-fzb.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=127ee46de04935699d5be9b2f8bc2f01ebf13a63

commit 127ee46de04935699d5be9b2f8bc2f01ebf13a63
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:26:18 2017 -0200

    powerpc: Add string-fza.h
    
    While ppc has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the Power 6 CMPB insn for testing of zeros.
    
    Checked on powerpc64le-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/powerpc/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc32/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc64/power6/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=317e73df9e1224fe2ba829c3d0e6ab36858752eb

commit 317e73df9e1224fe2ba829c3d0e6ab36858752eb
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:24:23 2017 -0200

    arm: Add string-fza.h
    
    While arm has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the UQSUB8 insn for testing of zeros.
    
    Checked on armv7-linux-gnueabihf
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/arm/armv6t2/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c5d9130b36ceb00fcf22e79cdaa58c3ddade294a

commit c5d9130b36ceb00fcf22e79cdaa58c3ddade294a
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:23:27 2017 -0200

    alpha: Add string-fzb.h and string-fzi.h
    
    While alpha has the more important string functions in assembly,
    there are still a few for which the generic routines are used.
    
    Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a
    simplified expansion of __builtin_ctz when the insn isn't available.
    
    Checked on alpha-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/alpha/string-fza.h: New file.
    	* sysdeps/alpha/string-fzb.h: New file.
    	* sysdeps/alpha/string-fzi.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8fec1f9ca69bb6e6d4952cee029c1a0bcee3b57d

commit 8fec1f9ca69bb6e6d4952cee029c1a0bcee3b57d
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:39 2017 -0200

    hppa: Add string-fzb.h and string-fzi.h
    
    Use UXOR,SBZ to test for a zero byte within a word.  While we can
    get semi-decent code out of asm-goto, we would do slightly better
    with a compiler builtin.
    
    For index_zero et al, sequential testing of bytes is less expensive than
    any tricks that involve a count-leading-zeros insn that we don't have.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/string-fza.h: New file.
    	* sysdeps/hppa/string-fzb.h: New file.
    	* sysdeps/hppa/string-fzi.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=109c793ed2bce55328c897853b91ae628aa42c6d

commit 109c793ed2bce55328c897853b91ae628aa42c6d
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:02 2017 -0200

    hppa: Add memcopy.h
    
    GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
    double-word shift unless (1) the subtract is in the same basic block
    and (2) the result of the subtract is used exactly once.  Neither
    condition is true for any use of MERGE.
    
    By forcing the use of a double-word shift, we not only reduce
    contention on SAR, but also allow the setting of SAR to be hoisted
    outside of a loop.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/memcopy.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e644d7431f23aab32f88a9c37ec58704cb4cc8e5

commit e644d7431f23aab32f88a9c37ec58704cb4cc8e5
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Wed Mar 8 16:56:17 2017 +0100

    Improve generic strcpy
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned access plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
            string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e0b21c7f76a0d88e519891fe71d2f83b0a4a0d27

commit e0b21c7f76a0d88e519891fe71d2f83b0a4a0d27
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:21:26 2017 -0200

    Improve generic strcmp
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned access plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
    	string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7428f9e9c5d643ee84bf061fe641f9e2ecde372c

commit 7428f9e9c5d643ee84bf061fe641f9e2ecde372c
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:40 2017 -0200

    Improve generic strchrnul
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/strchrnul.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=2a441052e3b8ef83dd3e520bc1f1037a472b90fd

commit 2a441052e3b8ef83dd3e520bc1f1037a472b90fd
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:48 2017 -0200

    Improve generic strchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64 and powerpc.
    
      - Use string-fz{b,i} and string-extbyte function.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.
    	* sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=96d1ccd369e59c2c741adc81b0a339d90990ae8a

commit 96d1ccd369e59c2c741adc81b0a339d90990ae8a
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 16 16:21:03 2017 -0200

    Improve generic strnlen
    
    With an optimized memchr, the new strnlen implementation basically calls
    memchr and adjusts the result pointer value.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias and libc_hidden_def.
    
    	[BZ #5806]
    	* string/strnlen.c: Rewrite in terms of memchr.
    	* sysdeps/i386/i686/multiarch/strnlen-c.c: Redefine weak_alias
    	and libc_hidden_def.
    	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c:
    	Likewise.
    	* sysdeps/s390/multiarch/strnlen-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9e7ec585ae87b5f4ac4e98e3eb8821784cd2b87a

commit 9e7ec585ae87b5f4ac4e98e3eb8821784cd2b87a
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:20:35 2017 -0200

    Improve generic memrchr
    
    The new algorithm has the following key differences:
    
      - Use string-fz{b,i} functions.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/memrchr.c: Use string-fzb.h, string-fzi.h.
    	* sysdeps/i386/i686/multiarch/memrchr-c.c: Redefined weak_alias.
    	* sysdeps/s390/multiarch/memrchr-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3d0770c276c6db572cb550abf6124da52dfa34a9

commit 3d0770c276c6db572cb550abf6124da52dfa34a9
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:12 2017 -0200

    Improve generic memchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} and string-opthr functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a13293ecf9828d582c77b97255835d075311a1a3

commit a13293ecf9828d582c77b97255835d075311a1a3
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:24 2017 -0200

    string: Improve generic strlen
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff functions
        to remove unwanted data.  This strategy follows the assembly-optimized
        ones for powerpc, sparc, and SH.
    
      - Use of has_zero and index_first_zero parametrized functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
        	* string/strlen.c: Use them.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a7a46e4ac7a290e16700649fbd596998cea47c19

commit a7a46e4ac7a290e16700649fbd596998cea47c19
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Jan 8 16:41:43 2018 -0200

    Add string vectorized find and detection headers
    
    This patch adds generic string find and detection implementations meant
    to be used in generic vectorized string implementations.  The idea is to
    decompose the basic string operations so each architecture can reimplement
    them if it provides any specialized hardware instruction.
    
    The 'string-fza.h' provides zero byte detection functions (find_zero_low,
    find_zero_all, find_eq_low, find_eq_all, find_zero_eq_low, find_zero_eq_all,
    find_zero_ne_low, and find_zero_ne_all).  They are used by the functions
    provided by both 'string-fzb.h' and 'string-fzi.h'.
    
    The 'string-fzb.h' provides boolean zero byte detection with the
    functions:
    
      - has_zero: determine if any byte within a word is zero.
      - has_eq: determine byte equality between two words.
      - has_zero_eq: determine if any byte within a word is zero along with
        byte equality between two words.
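
The relationship between the boolean tests listed above can be sketched as follows: a byte-equality test reduces to a zero-byte test on the XOR of the two words. The names, the 32-bit word size, and the particular zero-byte formula are assumptions made for illustration; they are not taken from string-fzb.h.

```c
#include <stdint.h>

/* Hypothetical helper: replicate byte C into every byte of a word.  */
static uint32_t
repeat_byte32 (unsigned char c)
{
  return 0x01010101u * c;
}

/* Hypothetical helper: 1 if any byte of X is zero, else 0.  */
static int
has_zero32 (uint32_t x)
{
  return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Hypothetical helper: 1 if X1 and X2 agree in at least one byte
   position.  Equal bytes XOR to zero, so this is a zero-byte test.  */
static int
has_eq32 (uint32_t x1, uint32_t x2)
{
  return has_zero32 (x1 ^ x2);
}
```

Searching a word for a character, as strchr and memchr need to, then becomes has_eq32 (word, repeat_byte32 (c)).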
    
    The 'string-fzi.h' provides zero byte detection along with its positions:
    
      - index_first_zero: return index of first zero byte within a word.
      - index_first_eq: return index of first byte equal between two words.
      - index_first_zero_eq: return index of first zero byte within a word or
        first byte equal between two words.
      - index_first_zero_ne: return index of first zero byte within a word or
        first byte different between two words.
      - index_last_zero: return index of last zero byte within a word.
      - index_last_eq: return index of last byte equal between two words.
    
    Also, to avoid libcalls in the '__builtin_c{t,l}z{l}' calls (which may
    degrade performance), inline implementations based on De Bruijn
    sequences are added (enabled by a configure check).
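
A De Bruijn based count-trailing-zeros of the kind mentioned above can be sketched as below, together with one way to turn it into a first-zero-byte index. The constant and table are the widely published 32-bit ones; the function names, the 32-bit word size and the little-endian byte numbering are assumptions, and this is not the code added by the commit.

```c
#include <stdint.h>

/* Hypothetical helper: count trailing zeros of a nonzero 32-bit value
   without a compiler builtin or a libgcc call.  */
static unsigned
ctz32_debruijn (uint32_t v)
{
  static const unsigned char table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };
  /* Isolate the lowest set bit, then multiply by the De Bruijn constant:
     the top five bits of the product are unique for each bit position.  */
  uint32_t lowest = v & (0u - v);
  return table[(uint32_t) (lowest * 0x077CB531u) >> 27];
}

/* Hypothetical helper: index of the first zero byte of W, counted from
   the least significant byte.  W must contain a zero byte.  */
static unsigned
index_first_zero_le (uint32_t w)
{
  /* The lowest set bit of this mask marks the lowest zero byte; bits
     above it may be set spuriously by borrows and are ignored.  */
  uint32_t mask = (w - 0x01010101u) & ~w & 0x80808080u;
  return ctz32_debruijn (mask) / 8;
}
```

The table lookup replaces a loop or a library call with one multiply, one shift and one load, which is the trade-off the commit message alludes to.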
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* config.h.in (HAVE_BUILTIN_CTZ, HAVE_BUILTIN_CLZ): New defines.
    	* configure.ac: Check for __builtin_ctz{l} with no external
    	dependencies
    	* sysdeps/generic/string-extbyte.h: New file.
    	* sysdeps/generic/string-fza.h: Likewise.
    	* sysdeps/generic/string-fzb.h: Likewise.
    	* sysdeps/generic/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=040bb0502a08765e01ba31708719e657a8c113b1

commit 040bb0502a08765e01ba31708719e657a8c113b1
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 23 18:45:54 2017 -0300

    Add string-maskoff.h generic header
    
    Macros to operate on unaligned access for string operations:
    
      - create_mask: create a mask based on pointer alignment that sets up
        non-zero bytes before the beginning of the word so a following
        operation (such as find zero) might ignore these bytes.
    
      - highbit_mask: create a mask with high bit of each byte being 1,
        and the low 7 bits being all the opposite of the input.
    
    These macros are meant to be used on optimized vectorized string
    implementations.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-maskoff.h: New file.
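
The masking idea described for create_mask can be sketched as follows. This is an illustration under stated assumptions (32-bit words, little-endian byte order, a hypothetical function name), not the contents of string-maskoff.h.

```c
#include <stdint.h>

/* Hypothetical helper: given the address where a string starts, return a
   mask covering the bytes of the enclosing aligned 32-bit word that lie
   before that address.  Assumes little-endian byte order, where earlier
   bytes occupy the less significant positions.  */
static uint32_t
mask_bytes_before (uintptr_t addr)
{
  unsigned shift = (unsigned) (addr % sizeof (uint32_t)) * 8;  /* 0..24 */
  return ((uint32_t) 1 << shift) - 1;
}
```

ORing this mask into the first aligned word forces the bytes that precede the string to 0xff, so a following zero-byte test cannot match on them. For a string starting two bytes into the word 0x61620000, the masked word is 0x6162ffff, which contains no zero byte.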

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=69030c746991d2d12d8b2c37ff84d350d6c412ee

commit 69030c746991d2d12d8b2c37ff84d350d6c412ee
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:15:27 2017 -0200

    Parameterize OP_T_THRES from memcopy.h
    
    Basically it moves OP_T_THRES out of memcopy.h to its own header
    and adjusts each architecture that redefines it.
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/generic/string-opthr.h: ... here; new file.
    	* sysdeps/i386/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/i386/string-opthr.h: ... here; new file.
    	* sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
    	* string/memcmp.c (OP_T_THRES): Remove definition.
    	* sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=aa7c9fa3a21562f9abdacac629b6fcef64551d6a

commit aa7c9fa3a21562f9abdacac629b6fcef64551d6a
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:14:09 2017 -0200

    Parameterize op_t from memcopy.h
    
    This moves the op_t definition out to a specific header, adds
    the attribute 'may_alias', and cleans up its duplicated definitions.
    It leads to the inclusion of a tilegx32 gmp-mparam.h, similar to x32,
    so op_t can be defined as a long long (from _LONG_LONG_LIMB).
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-optype.h: New file.
    	* sysdeps/generic/memcopy.h: Include it.
    	* string/memcmp.c (op_t): Remove define.
    	* sysdeps/tile/memcmp.c (op_t): Likewise.
    	* sysdeps/tile/memcopy.h (op_t): Likewise.
    	* sysdeps/tile/tilegx32/gmp-mparam.h: New file.
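
The role of the alias attribute on op_t can be illustrated with a minimal sketch. The names op_t_sketch and load_word_sketch are hypothetical and are not taken from the patch; the sketch assumes GCC or Clang, which accept __attribute__ ((__may_alias__)).

```c
/* Hypothetical stand-in for op_t: a word type that the compiler must
   assume can alias objects of any other type (GCC/Clang extension).  */
typedef unsigned long int __attribute__ ((__may_alias__)) op_t_sketch;

/* Read one word through an op_t_sketch pointer.  P must be suitably
   aligned for unsigned long int and must point to at least
   sizeof (op_t_sketch) readable bytes.  */
static inline op_t_sketch
load_word_sketch (const char *p)
{
  return *(const op_t_sketch *) p;
}
```

Without the attribute, reading a char buffer through an unsigned long int pointer would fall foul of the strict aliasing rules, which is presumably why the word-wise string routines want it on their word type.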

-----------------------------------------------------------------------
Comment 8 Sourceware Commits 2018-01-10 12:36:05 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, azanella/generic-strings has been created
        at  52a9cf06655e8b65b51de679c7550c2c1c2d837b (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=52a9cf06655e8b65b51de679c7550c2c1c2d837b

commit 52a9cf06655e8b65b51de679c7550c2c1c2d837b
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 21 17:14:16 2017 -0300

    sh: Add string-fzb.h
    
    Use the SH cmp/str instruction for has_{zero,eq,zero_eq}.
    
    Checked on sh4-linux-gnu.
    
    	Adhemerval Zanella <adhemerval.zanella@linaro.org>
    
    	* sysdeps/sh/string-fzb.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=24d01f48967662e8f356d8367658cbb192617667

commit 24d01f48967662e8f356d8367658cbb192617667
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:26:18 2017 -0200

    powerpc: Add string-fza.h
    
    While ppc has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the Power 6 CMPB insn for testing of zeros.
    
    Checked on powerpc64le-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/powerpc/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc32/power6/string-fza.h: New file.
    	* sysdeps/powerpc/powerpc64/power6/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6b67749f9df8400b8d0a768f78333302d7c5d9f6

commit 6b67749f9df8400b8d0a768f78333302d7c5d9f6
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:24:23 2017 -0200

    arm: Add string-fza.h
    
    While arm has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the UQSUB8 insn for testing of zeros.
    
    Checked on armv7-linux-gnueabihf
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/arm/armv6t2/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=099a39660c4d50187f30a87a52dc92cfca22df34

commit 099a39660c4d50187f30a87a52dc92cfca22df34
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:23:27 2017 -0200

    alpha: Add string-fzb.h and string-fzi.h
    
    While alpha has the more important string functions in assembly,
    there are still a few for which the generic routines are used.
    
    Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a
    simplified expansion of __builtin_ctz when the insn isn't available.
    
    Checked on alpha-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/alpha/string-fza.h: New file.
    	* sysdeps/alpha/string-fzb.h: New file.
    	* sysdeps/alpha/string-fzi.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b6d266c6fee641c4c9724086dcb9e07c02c5a5bf

commit b6d266c6fee641c4c9724086dcb9e07c02c5a5bf
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:39 2017 -0200

    hppa: Add string-fzb.h and string-fzi.h
    
    Use UXOR,SBZ to test for a zero byte within a word.  While we can
    get semi-decent code out of asm-goto, we would do slightly better
    with a compiler builtin.
    
    For index_zero et al, sequential testing of bytes is less expensive than
    any tricks that involve a count-leading-zeros insn that we don't have.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/string-fza.h: New file.
    	* sysdeps/hppa/string-fzb.h: New file.
    	* sysdeps/hppa/string-fzi.h: New file.
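
The "sequential testing of bytes" idea can be sketched portably as below. This is not the hppa code: the helper name is hypothetical, and the sketch goes through a byte array so that it gives the memory-order index on either endianness.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index, in memory order, of the first
   zero byte in WORD, or sizeof (WORD) if there is none.  No
   count-leading-zeros or count-trailing-zeros operation is needed.  */
static inline size_t
index_first_zero_seq_sketch (unsigned long int word)
{
  unsigned char bytes[sizeof word];
  memcpy (bytes, &word, sizeof word);
  for (size_t i = 0; i < sizeof word; i++)
    if (bytes[i] == 0)
      return i;
  return sizeof word;
}
```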

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ebb0353dba2ed38ff88bcf8ea03eb80bb41b3e4f

commit ebb0353dba2ed38ff88bcf8ea03eb80bb41b3e4f
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:02 2017 -0200

    hppa: Add memcopy.h
    
    GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
    double-word shift unless (1) the subtract is in the same basic block
    and (2) the result of the subtract is used exactly once.  Neither
    condition is true for any use of MERGE.
    
    By forcing the use of a double-word shift, we not only reduce
    contention on SAR, but also allow the setting of SAR to be hoisted
    outside of a loop.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/memcopy.h: New file.
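
The equivalence that the double-word shift relies on can be checked with a small portable sketch. The function names are hypothetical, and this is plain C rather than the hppa instruction sequence; it only shows that the two-shift expression equals one shift of the 64-bit pair y:x.

```c
#include <stdint.h>

/* The two-shift form from the commit message.  C must be in 1..31,
   because shifting a 32-bit operand by 32 is undefined.  */
static inline uint32_t
merge_two_shifts_sketch (uint32_t x, uint32_t y, unsigned int c)
{
  return (x >> c) | (y << (32 - c));
}

/* The same value as a single right shift of the 64-bit pair whose high
   half is Y and whose low half is X.  */
static inline uint32_t
merge_double_word_sketch (uint32_t x, uint32_t y, unsigned int c)
{
  uint64_t pair = ((uint64_t) y << 32) | x;
  return (uint32_t) (pair >> c);
}
```

In the second form the shift count is used directly, so no subtraction from 32 has to be computed or kept live.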

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5c29485b5e15798ef3ede33272b36df521e4cbec

commit 5c29485b5e15798ef3ede33272b36df521e4cbec
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Wed Mar 8 16:56:17 2017 +0100

    string: Improve generic strcpy
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned accesses plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
            string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=84bfd24e57385ce8e5d37e148443f22661e87490

commit 84bfd24e57385ce8e5d37e148443f22661e87490
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:21:26 2017 -0200

    string: Improve generic strcmp
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned accesses plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
    	string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=572c3a397d79ab9efa89d131f72ef2d6b438eb69

commit 572c3a397d79ab9efa89d131f72ef2d6b438eb69
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:40 2017 -0200

    string: Improve generic strchrnul
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/strchrnul.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=500a7f1a861e801292bfd70d9e6c4539550f0f56

commit 500a7f1a861e801292bfd70d9e6c4539550f0f56
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:48 2017 -0200

    string: Improve generic strchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64 and powerpc.
    
      - Uses the string-fz{b,i} and string-extbyte functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.
    	* sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=2368dc1dcac592501eb07bc6c9d0fa6f9c43251b

commit 2368dc1dcac592501eb07bc6c9d0fa6f9c43251b
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 16 16:21:03 2017 -0200

    string: Improve generic strnlen
    
    With an optimized memchr, the new strnlen implementation basically calls
    memchr and adjusts the result pointer value.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias and libc_hidden_def.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strnlen.c: Rewrite in terms of memchr.
    	* sysdeps/i386/i686/multiarch/strnlen-c.c: Redefine weak_alias
    	and libc_hidden_def.
    	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c:
    	Likewise.
    	* sysdeps/s390/multiarch/strnlen-c.c: Likewise.
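
The "call memchr and adjust the result pointer" approach can be sketched as follows. The function name is hypothetical and this is only an illustration of the idea, not the patched string/strnlen.c.

```c
#include <stddef.h>
#include <string.h>

/* Length of S, but never more than MAXLEN.  memchr either finds the
   terminating NUL within the first MAXLEN bytes or returns NULL.  */
static size_t
strnlen_sketch (const char *s, size_t maxlen)
{
  const char *end = memchr (s, '\0', maxlen);
  return end != NULL ? (size_t) (end - s) : maxlen;
}
```

Any speedup in memchr then carries over to strnlen without a second word-at-a-time loop to maintain.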

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8b5ae84584e867a592a6134c5f9165600b8ae7c1

commit 8b5ae84584e867a592a6134c5f9165600b8ae7c1
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:20:35 2017 -0200

    string: Improve generic memrchr
    
    The new algorithm has the following key differences:
    
      - Uses the string-fz{b,i} functions.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/memrchr.c: Use string-fzb.h, string-fzi.h.
    	* sysdeps/i386/i686/multiarch/memrchr-c.c: Redefined weak_alias.
    	* sysdeps/s390/multiarch/memrchr-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=15b7238287bebef55d6ef2191766990713d9b977

commit 15b7238287bebef55d6ef2191766990713d9b977
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:12 2017 -0200

    string: Improve generic memchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} and string-opthr functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=096c5ae2a75e1b8b641c97015f9deaee5a17b2cc

commit 096c5ae2a75e1b8b641c97015f9deaee5a17b2cc
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:24 2017 -0200

    string: Improve generic strlen
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff functions
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for powerpc, sparc, and SH.
    
      - Use of has_zero and index_first_zero parametrized functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
        	* string/strlen.c: Use them.
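
The overall shape of such a word-at-a-time strlen can be sketched as below. All names are hypothetical, the word type is fixed at 64 bits, and the first-zero lookup is done by scanning bytes so the sketch works on either endianness. It assumes the string lives in a buffer that is readable over every aligned word it touches, which the C library can rely on but ordinary ISO C code cannot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of W is zero.  */
static inline int
has_zero_sketch (uint64_t w)
{
  return ((w - UINT64_C (0x0101010101010101)) & ~w
          & UINT64_C (0x8080808080808080)) != 0;
}

/* Index, in memory order, of the first zero byte of W.  W must contain
   a zero byte.  */
static inline size_t
first_zero_sketch (uint64_t w)
{
  unsigned char b[sizeof w];
  size_t i = 0;
  memcpy (b, &w, sizeof w);
  while (b[i] != 0)
    i++;
  return i;
}

static size_t
strlen_sketch (const char *s)
{
  size_t skip = (uintptr_t) s % sizeof (uint64_t);
  const char *p = s - skip;          /* Round down to a word boundary.  */
  unsigned char m[sizeof (uint64_t)] = { 0 };
  uint64_t w, mask;

  /* Force the bytes that precede S in the first word to be non-zero,
     so they cannot be mistaken for the terminator.  */
  memset (m, 0xff, skip);
  memcpy (&mask, m, sizeof mask);
  memcpy (&w, p, sizeof w);
  w |= mask;

  while (!has_zero_sketch (w))
    {
      p += sizeof w;
      memcpy (&w, p, sizeof w);
    }
  return (size_t) ((p + first_zero_sketch (w)) - s);
}
```

Only the first word needs the mask; every later word lies entirely inside the string or contains its terminator.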

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c6766b5d52369eae2b64b4c0d23482afa76e0cd6

commit c6766b5d52369eae2b64b4c0d23482afa76e0cd6
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Jan 8 16:41:43 2018 -0200

    Add string vectorized find and detection functions
    
    This patch adds a generic string find and detection implementation meant
    to be used in generic vectorized string implementations.  The idea is to
    decompose the basic string operations so each architecture can reimplement
    them if it provides any specialized hardware instruction.
    
    The 'string-fza.h' provides zero byte detection functions (find_zero_low,
    find_zero_all, find_eq_low, find_eq_all, find_zero_eq_low, find_zero_eq_all,
    find_zero_ne_low, and find_zero_ne_all).  They are used by the functions
    provided by both 'string-fzb.h' and 'string-fzi.h'.
    
    The 'string-fzb.h' provides boolean zero byte detection with the
    functions:
    
      - has_zero: determine if any byte within a word is zero.
      - has_eq: determine if any byte is equal between two words.
      - has_zero_eq: determine if any byte within a word is zero or is
        equal to the corresponding byte of a second word.
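
A minimal sketch of these boolean tests, using the classic subtract-and-mask trick on a 64-bit word, might look like this. The _sketch names and the fixed word width are assumptions, and has_zero_eq is read here as "some byte is zero, or some byte matches the other word".

```c
#include <stdint.h>

/* Replicate byte C into every byte of a 64-bit word.  */
static inline uint64_t
repeat_byte_sketch (unsigned char c)
{
  return UINT64_C (0x0101010101010101) * c;
}

/* Nonzero if any byte of X is zero.  */
static inline int
has_zero_sketch (uint64_t x)
{
  uint64_t ones = repeat_byte_sketch (0x01);
  uint64_t highs = repeat_byte_sketch (0x80);
  return ((x - ones) & ~x & highs) != 0;
}

/* Nonzero if any byte of X1 equals the byte of X2 at the same position:
   equal bytes become zero bytes after the XOR.  */
static inline int
has_eq_sketch (uint64_t x1, uint64_t x2)
{
  return has_zero_sketch (x1 ^ x2);
}

/* Nonzero if X1 has a zero byte or a byte equal to the one in X2.  */
static inline int
has_zero_eq_sketch (uint64_t x1, uint64_t x2)
{
  return has_zero_sketch (x1) | has_eq_sketch (x1, x2);
}
```

A strchr-style loop would call has_zero_eq_sketch (word, repeat_byte_sketch (c)) to stop on either the searched character or the terminator.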
    
    The 'string-fzi.h' provides zero byte detection along with its positions:
    
      - index_first_zero: return index of first zero byte within a word.
      - index_first_eq: return index of first byte equal between two words.
      - index_first_zero_eq: return index of first zero byte within a word or
        first byte equal between two words.
      - index_first_zero_ne: return index of first zero byte within a word or
        first byte different between two words.
      - index_last_zero: return index of last zero byte within a word.
      - index_last_eq: return index of last byte equal between two words.
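
One way to turn a zero-byte mask into a byte index is sketched below. The names are hypothetical, the word is fixed at 64 bits, and the code assumes GCC or Clang for the __builtin_ctzll/__builtin_clzll builtins and the __BYTE_ORDER__ macro.

```c
#include <stdint.h>

/* Set bit 7 of every byte of X that is zero, and no other bit.  Unlike
   the cheaper subtract-and-mask test, this has no false positives, so
   it is safe to search from either end.  */
static inline uint64_t
find_zero_all_sketch (uint64_t x)
{
  uint64_t m = UINT64_C (0x7f7f7f7f7f7f7f7f);
  return ~(((x & m) + m) | x | m);
}

/* Index, in memory order, of the first zero byte of X.  X must contain
   at least one zero byte, because the builtins are undefined for 0.  */
static inline unsigned int
index_first_zero_sketch (uint64_t x)
{
  uint64_t z = find_zero_all_sketch (x);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return (unsigned int) __builtin_ctzll (z) / 8;
#else
  return (unsigned int) __builtin_clzll (z) / 8;
#endif
}
```

On little-endian the first byte in memory is the least significant one, hence count-trailing-zeros; on big-endian it is the most significant one, hence count-leading-zeros.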
    
    Also, to avoid libcalls in the '__builtin_c{t,l}z{l}' calls (which may
    degrade performance), inline implementations based on De Bruijn
    sequences are added (enabled by a configure check).
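
For readers unfamiliar with the technique, a count-trailing-zeros based on a De Bruijn sequence can be sketched as follows. This is the widely known 32-bit variant with a hypothetical function name; the constants and word widths used by the patch itself may differ.

```c
#include <stdint.h>

/* Count trailing zero bits of X without a hardware instruction or a
   library call.  X must be non-zero.  */
static inline unsigned int
ctz32_debruijn_sketch (uint32_t x)
{
  static const unsigned char table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };
  /* Isolate the lowest set bit; multiplying by it shifts the De Bruijn
     constant left by the bit's position, and the top five bits of the
     product then identify that position uniquely.  */
  uint32_t low = x & (~x + 1u);
  return table[(uint32_t) (low * UINT32_C (0x077CB531)) >> 27];
}
```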
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* config.h.in (HAVE_BUILTIN_CTZ, HAVE_BUILTIN_CLZ): New defines.
    	* configure.ac: Check for __builtin_ctz{l} with no external
    	dependencies
    	* sysdeps/generic/string-extbyte.h: New file.
    	* sysdeps/generic/string-fza.h: Likewise.
    	* sysdeps/generic/string-fzb.h: Likewise.
    	* sysdeps/generic/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=040bb0502a08765e01ba31708719e657a8c113b1

commit 040bb0502a08765e01ba31708719e657a8c113b1
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 23 18:45:54 2017 -0300

    Add string-maskoff.h generic header
    
    Macros to operate on unaligned access for string operations:
    
      - create_mask: create a mask based on pointer alignment that sets up
        non-zero bytes before the beginning of the word so a following
        operation (such as find zero) might ignore these bytes.
    
      - highbit_mask: create a mask with high bit of each byte being 1,
        and the low 7 bits being all the opposite of the input.
    
    These macros are meant to be used on optimized vectorized string
    implementations.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-maskoff.h: New file.
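
The purpose of the alignment mask can be illustrated with a small sketch. The helper below is hypothetical and is not the create_mask from the patch: it builds the mask through a byte array so that "before" means memory order on any endianness, and it assumes a 64-bit word.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: a word whose first SKIP bytes, in memory order,
   are 0xff and whose remaining bytes are 0.  SKIP must be less than 8.
   OR-ing it into an aligned load makes the bytes that precede the real
   start of the data non-zero.  */
static inline uint64_t
mask_before_sketch (unsigned int skip)
{
  unsigned char bytes[sizeof (uint64_t)] = { 0 };
  uint64_t mask;
  memset (bytes, 0xff, skip);
  memcpy (&mask, bytes, sizeof mask);
  return mask;
}
```

With the mask applied, a following zero-byte search cannot report a stray NUL that happens to sit in front of the string inside the same aligned word.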

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=69030c746991d2d12d8b2c37ff84d350d6c412ee

commit 69030c746991d2d12d8b2c37ff84d350d6c412ee
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:15:27 2017 -0200

    Parameterize OP_T_THRES from memcopy.h
    
    This moves OP_T_THRES out of memcopy.h to its own header and
    adjusts each architecture that redefines it.
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/generic/string-opthr.h: ... here; new file.
    	* sysdeps/i386/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/i386/string-opthr.h: ... here; new file.
    	* sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
    	* string/memcmp.c (OP_T_THRES): Remove definition.
    	* sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=aa7c9fa3a21562f9abdacac629b6fcef64551d6a

commit aa7c9fa3a21562f9abdacac629b6fcef64551d6a
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:14:09 2017 -0200

    Parameterize op_t from memcopy.h
    
    This moves the op_t definition out to a specific header, adds
    the attribute 'may_alias', and cleans up its duplicated definitions.
    It leads to the inclusion of a tilegx32 gmp-mparam.h, similar to x32,
    so op_t can be defined as a long long (from _LONG_LONG_LIMB).
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-optype.h: New file.
    	* sysdeps/generic/memcopy.h: Include it.
    	* string/memcmp.c (op_t): Remove define.
    	* sysdeps/tile/memcmp.c (op_t): Likewise.
    	* sysdeps/tile/memcopy.h (op_t): Likewise.
    	* sysdeps/tile/tilegx32/gmp-mparam.h: New file.

-----------------------------------------------------------------------
Comment 9 Sourceware Commits 2018-01-15 10:55:42 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, azanella/generic-strings has been created
        at  776bf884cdc6d8b7560311db143d954da2e969bf (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=776bf884cdc6d8b7560311db143d954da2e969bf

commit 776bf884cdc6d8b7560311db143d954da2e969bf
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 21 17:14:16 2017 -0300

    sh: Add string-fzb.h
    
    Use the SH cmp/str instruction for has_{zero,eq,zero_eq}.
    
    Checked on sh4-linux-gnu.
    
    	Adhemerval Zanella <adhemerval.zanella@linaro.org>
    
    	* sysdeps/sh/string-fzb.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3f4eb93d500527cb988bca89b3ca1c55eaa7528b

commit 3f4eb93d500527cb988bca89b3ca1c55eaa7528b
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:26:18 2017 -0200

    powerpc: Add string-fza.h
    
    While ppc has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the Power 6 CMPB insn for testing of zeros.
    
    Checked on powerpc64le-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/powerpc/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=faf34eb8e865d099db877561b20d6bd82793ee5d

commit faf34eb8e865d099db877561b20d6bd82793ee5d
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:24:23 2017 -0200

    arm: Add string-fza.h
    
    While arm has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the UQSUB8 insn for testing of zeros.
    
    Checked on armv7-linux-gnueabihf
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/arm/armv6t2/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5115c06c369958195d7200f1fefadec6a7194c16

commit 5115c06c369958195d7200f1fefadec6a7194c16
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:23:27 2017 -0200

    alpha: Add string-fzb.h and string-fzi.h
    
    While alpha has the more important string functions in assembly,
    there are still a few for which the generic routines are used.
    
    Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a
    simplified expansion of __builtin_ctz when the insn isn't available.
    
    Checked on alpha-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/alpha/string-fzb.h: New file.
    	* sysdeps/alpha/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eb974666a0cb7f1c834e23c598bd70ea5467d85b

commit eb974666a0cb7f1c834e23c598bd70ea5467d85b
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:39 2017 -0200

    hppa: Add string-fzb.h and string-fzi.h
    
    Use UXOR,SBZ to test for a zero byte within a word.  While we can
    get semi-decent code out of asm-goto, we would do slightly better
    with a compiler builtin.
    
    For index_zero et al, sequential testing of bytes is less expensive than
    any tricks that involve a count-leading-zeros insn that we don't have.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/string-fzb.h: New file.
    	* sysdeps/hppa/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=44c7540c69f6c01d19e34b8c6d06dc8eabef8ef9

commit 44c7540c69f6c01d19e34b8c6d06dc8eabef8ef9
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:02 2017 -0200

    hppa: Add memcopy.h
    
    GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
    double-word shift unless (1) the subtract is in the same basic block
    and (2) the result of the subtract is used exactly once.  Neither
    condition is true for any use of MERGE.
    
    By forcing the use of a double-word shift, we not only reduce
    contention on SAR, but also allow the setting of SAR to be hoisted
    outside of a loop.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/memcopy.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fc8a77c5b25196847a405781ffe472ff9e8eb6ce

commit fc8a77c5b25196847a405781ffe472ff9e8eb6ce
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Wed Mar 8 16:56:17 2017 +0100

    string: Improve generic strcpy
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned accesses plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
            string-fzi.h.
    	* string/test-strcpy.c (test_main): Add more coverage.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0c653b42a1f2ef3b52be833b56596aefa8ad5736

commit 0c653b42a1f2ef3b52be833b56596aefa8ad5736
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:21:26 2017 -0200

    string: Improve generic strcmp
    
    The new generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (it still uses aligned accesses plus a merge operation
    to get a correct word-by-word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
    	string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c79804d028a9ebd21de6ae1442ce706fa3c1d7c2

commit c79804d028a9ebd21de6ae1442ce706fa3c1d7c2
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:40 2017 -0200

    string: Improve generic strchrnul
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/strchrnul.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1751ee1c285284a28725b820c18f8d2b7b5b9258

commit 1751ee1c285284a28725b820c18f8d2b7b5b9258
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:48 2017 -0200

    string: Improve generic strchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64 and powerpc.
    
      - Uses the string-fz{b,i} and string-extbyte functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.
    	* sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6d2690c0cb8b6b73ff6eb1309d231467c688aafd

commit 6d2690c0cb8b6b73ff6eb1309d231467c688aafd
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 16 16:21:03 2017 -0200

    string: Improve generic strnlen
    
    With an optimized memchr, the new strnlen implementation basically calls
    memchr and adjusts the result pointer value.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias and libc_hidden_def.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strnlen.c: Rewrite in terms of memchr.
    	* sysdeps/i386/i686/multiarch/strnlen-c.c: Redefine weak_alias
    	and libc_hidden_def.
    	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c:
    	Likewise.
    	* sysdeps/s390/multiarch/strnlen-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6702dd345ea55ccb468684bd809835a6c719c198

commit 6702dd345ea55ccb468684bd809835a6c719c198
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:20:35 2017 -0200

    string: Improve generic memrchr
    
    The new algorithm has the following key differences:
    
      - Uses the string-fz{b,i} functions.
    
    It also cleans up the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/memrchr.c: Use string-fzb.h, string-fzi.h.
    	* sysdeps/i386/i686/multiarch/memrchr-c.c: Redefined weak_alias.
    	* sysdeps/s390/multiarch/memrchr-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1c12833514864aba99b01b2f173e1a98ec1f9658

commit 1c12833514864aba99b01b2f173e1a98ec1f9658
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:12 2017 -0200

    string: Improve generic memchr
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff function
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} and string-opthr functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=96419fb9b7feee1c05dd99ba4afdc89d94ef4aad

commit 96419fb9b7feee1c05dd99ba4afdc89d94ef4aad
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:24 2017 -0200

    string: Improve generic strlen
    
    The new algorithm has the following key differences:
    
      - Reads the first word unaligned and uses the string-maskoff functions
        to remove unwanted data.  This strategy follows the assembly-optimized
        implementations for powerpc, sparc, and SH.
    
      - Use of has_zero and index_first_zero parametrized functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
        	* string/strlen.c: Use them.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=438e5fcc0aff82752229bd88bbaffafc63ec6b81

commit 438e5fcc0aff82752229bd88bbaffafc63ec6b81
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Jan 8 16:41:43 2018 -0200

    Add string vectorized find and detection functions
    
    This patch adds a generic string find and detection implementation meant
    to be used in generic vectorized string implementations.  The idea is to
    decompose the basic string operations so each architecture can reimplement
    them if it provides any specialized hardware instruction.
    
    The 'string-fza.h' provides zero byte detection functions (find_zero_low,
    find_zero_all, find_eq_low, find_eq_all, find_zero_eq_low, find_zero_eq_all,
    find_zero_ne_low, and find_zero_ne_all).  They are used by the functions
    provided by both 'string-fzb.h' and 'string-fzi.h'.
    
    The 'string-fzb.h' provides boolean zero byte detection with the
    functions:
    
      - has_zero: determine if any byte within a word is zero.
      - has_eq: determine byte equality between two words.
      - has_zero_eq: determine if any byte within a word is zero along with
        byte equality between two words.
    
    The 'string-fzi.h' provides zero byte detection along with its positions:
    
      - index_first_zero: return index of first zero byte within a word.
      - index_first_eq: return index of first byte different between two words.
      - index_first_zero_eq: return index of first zero byte within a word or
        first byte different between two words.
      - index_first_zero_ne: return index of first zero byte within a word or
        first byte equal between two words.
      - index_last_zero: return index of last zero byte within a word.
      - index_last_eq: return index of last byte different between two words.
    
    Also, to avoid libcalls in the '__builtin_c{t,l}z{l}' calls (which may
    add performance degradation), inline implementation based on De Bruijn
    sequences are added (enabled by a configure check).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* config.h.in (HAVE_BUILTIN_CTZ, HAVE_BUILTIN_CLZ): New defines.
    	* configure.ac: Check for __builtin_ctz{l} with no external
    	dependencies
    	* sysdeps/generic/string-extbyte.h: New file.
    	* sysdeps/generic/string-fza.h: Likewise.
    	* sysdeps/generic/string-fzb.h: Likewise.
    	* sysdeps/generic/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=447f15451d1761d9d1aad2b8440b19eb189e8811

commit 447f15451d1761d9d1aad2b8440b19eb189e8811
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 23 18:45:54 2017 -0300

    Add string-maskoff.h generic header
    
    Macros to operate on unaligned access for string operations:
    
      - create_mask: create a mask based on pointer alignment to sets up
        non-zero bytes before the beginning of the word so a following
        operation (such as find zero) might ignore these bytes.
    
      - highbit_mask: create a mask with high bit of each byte being 1,
        and the low 7 bits being all the opposite of the input.
    
    These macros are meant to be used on optimized vectorized string
    implementations.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-maskoff.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c43ccd7a9f279b15f3e7241ecc35e49ea6330a81

commit c43ccd7a9f279b15f3e7241ecc35e49ea6330a81
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:15:27 2017 -0200

    Parameterize OP_T_THRES from memcopy.h
    
    Basically it moves OP_T_THRES out of memcopy.h to its own header
    and adjust each architecture that redefines it.
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/generic/string-opthr.h: ... here; new file.
    	* sysdeps/i386/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/i386/string-opthr.h: ... here; new file.
    	* sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
    	* string/memcmp.c (OP_T_THRES): Remove definition.
    	* sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1cca35adf040a1a22f48138569709ab71fd5bb32

commit 1cca35adf040a1a22f48138569709ab71fd5bb32
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:14:09 2017 -0200

    Parameterize op_t from memcopy.h
    
    Basically moves op_t definition out to an specific header, adds
    the attribute 'may-alias', and cleanup its duplicated definitions.
    It lead to inclusion of tilegx32 gmp-mparam.h similar to x32 so
    op_t can be define as a long long (from _LONG_LONG_LIMB).
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-optype.h: New file.
    	* sysdeps/generic/memcopy.h: Include it.
    	* string/memcmp.c (op_t): Remove define.
    	* sysdeps/tile/memcmp.c (op_t): Likewise.
    	* sysdeps/tile/memcopy.h (op_t): Likewise.
    	* sysdeps/tile/tilegx32/gmp-mparam.h: New file.

-----------------------------------------------------------------------
Comment 10 Marc Mongenet 2018-09-30 01:12:59 UTC
The comment is still wrong, but for another reason:
In 71a5bd3e177e7748cf8993b0577d65d8986b44bc Ulrich Drepper <drepper@redhat.com>  2009-03-15 10:03:38 replaced the implementation, but not the comments.

Today, the code is the hack taken from Alan Mycroft's HAKMEMC postings, but the comments describe an old implementation.

It could be fixed by doing what Stas Yakovlev proposed on 2008-04-17 19:55:59 IST in https://sourceware.org/bugzilla/attachment.cgi?id=2703&action=diff.
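
For readers following along, the zero-byte test usually attributed to Alan Mycroft can be sketched as below. This is only an illustration for 32-bit words; the function name word_has_zero_byte is hypothetical and is not taken from the glibc sources.

```c
#include <stdint.h>

/* Hypothetical helper, not a glibc function.  Returns nonzero if any
   byte of X is zero.  Subtracting 1 from every byte borrows through a
   zero byte and sets its high bit; "& ~x" discards bytes whose high
   bit was already set in the input.  */
static int
word_has_zero_byte (uint32_t x)
{
  const uint32_t ones = 0x01010101u;   /* low bit of every byte */
  const uint32_t highs = 0x80808080u;  /* high bit of every byte */
  return ((x - ones) & ~x & highs) != 0;
}
```

With this sketch, 0x11223344 and 0x80808080 yield 0, while 0x11003344 yields 1. The boolean answer is exact, but the individual flagged bits are not: a borrow can mark a 0x01 byte that sits above a real zero byte, so only the lowest flagged byte can be trusted when locating the zero.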
Comment 11 jsm-csl@polyomino.org.uk 2018-09-30 13:13:58 UTC
In my view, the right approach for this issue is a general overhaul of the 
generic string functions as in Richard Henderson's / Adhemerval Zanella's 
patchset <https://sourceware.org/ml/libc-alpha/2018-01/msg00318.html> 
(that patchset may not address all instances of that comment, but it 
provides the infrastructure for doing so).
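
To make the idea of shared infrastructure concrete, here is a rough sketch of a bounded length function layered on a single zero-detection primitive. It is not the patchset's code: sketch_has_zero and sketch_strnlen are hypothetical names, and the sketch assumes that all maxlen bytes of the buffer are readable, which is stricter than what strnlen requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a has_zero-style primitive: nonzero if any
   byte of X is zero.  */
static int
sketch_has_zero (uint64_t x)
{
  const uint64_t ones = UINT64_C (0x0101010101010101);
  const uint64_t highs = UINT64_C (0x8080808080808080);
  return ((x - ones) & ~x & highs) != 0;
}

/* Hypothetical bounded length function.  Assumes S points to at least
   MAXLEN readable bytes.  */
static size_t
sketch_strnlen (const char *s, size_t maxlen)
{
  size_t i = 0;

  /* Word-at-a-time phase: skip whole words that contain no zero byte.
     memcpy avoids alignment and aliasing problems.  */
  while (maxlen - i >= sizeof (uint64_t))
    {
      uint64_t w;
      memcpy (&w, s + i, sizeof w);
      if (sketch_has_zero (w))
        break;
      i += sizeof w;
    }

  /* Byte phase: locate the zero inside the flagged word, or finish the
     tail.  Scanning bytes here keeps the sketch endian-independent.  */
  while (i < maxlen && s[i] != '\0')
    i++;
  return i;
}
```

An architecture with a dedicated instruction would only need to replace the primitive, which is the decomposition the patchset aims for. A real implementation would typically also replace the byte phase with an index computation on the flagged word.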
Comment 12 Sourceware Commits 2018-10-02 22:46:58 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, azanella/generic-strings has been created
        at  af69c5ba72b80b2bc937243801349eb197ad5553 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=af69c5ba72b80b2bc937243801349eb197ad5553

commit af69c5ba72b80b2bc937243801349eb197ad5553
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 21 17:14:16 2017 -0300

    sh: Add string-fzb.h
    
    Use the SH cmp/str on has_{zero,eq,zero_eq}.
    
    Checked on sh4-linux-gnu.
    
    	Adhemerval Zanella <adhemerval.zanella@linaro.org>
    
    	* sysdeps/sh/string-fzb.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=16f899cfa103a17c08c3b2f61a8d40f4f5e914a6

commit 16f899cfa103a17c08c3b2f61a8d40f4f5e914a6
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:26:18 2017 -0200

    powerpc: Add string-fza.h
    
    While ppc has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the Power 6 CMPB insn for testing of zeros.
    
    Checked on powerpc64le-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/powerpc/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4de86538e47d0b7fe56678b22f05550900e3b87a

commit 4de86538e47d0b7fe56678b22f05550900e3b87a
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:24:23 2017 -0200

    arm: Add string-fza.h
    
    While arm has the more important string functions in assembly,
    there are still a few generic routines used.
    
    Use the UQSUB8 insn for testing of zeros.
    
    Checked on armv7-linux-gnueabihf
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/arm/armv6t2/string-fza.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=38cd3552d8b6df18b14b4625ef29ae9e27f9a704

commit 38cd3552d8b6df18b14b4625ef29ae9e27f9a704
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:23:27 2017 -0200

    alpha: Add string-fzb.h and string-fzi.h
    
    While alpha has the more important string functions in assembly,
    there are still a few for which the generic routines are used.
    
    Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a
    simplified expansion of __builtin_ctz when the insn isn't available.
    
    Checked on alpha-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/alpha/string-fzb.h: New file.
    	* sysdeps/alpha/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=70ea2db81287ba26a33960568f325ebfbd7479ef

commit 70ea2db81287ba26a33960568f325ebfbd7479ef
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:39 2017 -0200

    hppa: Add string-fzb.h and string-fzi.h
    
    Use UXOR,SBZ to test for a zero byte within a word.  While we can
    get semi-decent code out of asm-goto, we would do slightly better
    with a compiler builtin.
    
    For index_zero et al, sequential testing of bytes is less expensive than
    any tricks that involve a count-leading-zeros insn that we don't have.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/string-fzb.h: New file.
    	* sysdeps/hppa/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7f84429390ba38906bc7438f0035f81d207f2ab0

commit 7f84429390ba38906bc7438f0035f81d207f2ab0
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:22:02 2017 -0200

    hppa: Add memcopy.h
    
    GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
    double-word shift unless (1) the subtract is in the same basic block
    and (2) the result of the subtract is used exactly once.  Neither
    condition is true for any use of MERGE.
    
    By forcing the use of a double-word shift, we not only reduce
    contention on SAR, but also allow the setting of SAR to be hoisted
    outside of a loop.
    
    Checked on hppa-linux-gnu.
    
    	Richard Henderson  <rth@twiddle.net>
    
    	* sysdeps/hppa/memcopy.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c9181a2ed5c336e080c6751a117e2f5e533ca06a

commit c9181a2ed5c336e080c6751a117e2f5e533ca06a
Author: Adhemerval Zanella <adhemerval.zanella@linaro.com>
Date:   Wed Mar 8 16:56:17 2017 +0100

    string: Improve generic strcpy
    
    New generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (which still uses aligned access plus a merge operation
    to get a correct word by word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
            string-fzi.h.
    	* string/test-strcpy.c (test_main): Add more coverage.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=550081c1e4d6263b1b641a0ef50eb6002ada4019

commit 550081c1e4d6263b1b641a0ef50eb6002ada4019
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:21:26 2017 -0200

    string: Improve generic strcmp
    
    New generic implementation tries to use word operations along with
    the new string-fz{b,i} functions even for inputs with different
    alignments (which still uses aligned access plus a merge operation
    to get a correct word by word comparison).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
    	string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b9121635b413aa6584b385b7135803f868cf6673

commit b9121635b413aa6584b385b7135803f868cf6673
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:40 2017 -0200

    string: Improve generic strchrnul
    
    New algorithm have the following key differences:
    
      - Reads first word unaligned and use string-maskoff function to
        remove unwanted data.  This strategy follow assemble optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/strchrnul.c: Use string-fzb.h, string-fzi.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5e8d99f2a619b38f27092f286ad06ccf690f5d0b

commit 5e8d99f2a619b38f27092f286ad06ccf690f5d0b
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:48 2017 -0200

    string: Improve generic strchr
    
    New algorithm have the following key differences:
    
      - Reads first word unaligned and use string-maskoff function to
        remove unwanted data.  This strategy follow assemble optimized
        ones for aarch64 and powerpc.
    
      - Use string-fz{b,i} and string-extbyte function.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.
    	* sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=df07959c3721e083e10aaa4c3c1b9a483f9955fa

commit df07959c3721e083e10aaa4c3c1b9a483f9955fa
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 16 16:21:03 2017 -0200

    string: Improve generic strnlen
    
    With an optimized memchr, new strnlen implementation basically calls
    memchr and adjust the result pointer value.
    
    It also cleanups the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias and libc_hidden_def.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/strnlen.c: Rewrite in terms of memchr.
    	* sysdeps/i386/i686/multiarch/strnlen-c.c: Redefine weak_alias
    	and libc_hidden_def.
    	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c:
    	Likewise.
    	* sysdeps/s390/multiarch/strnlen-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bdd041c15306e91a9cc4d26975e73a7a7d84742b

commit bdd041c15306e91a9cc4d26975e73a7a7d84742b
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:20:35 2017 -0200

    string: Improve generic memrchr
    
    New algorithm have the following key differences:
    
      - Use string-fz{b,i} functions.
    
    It also cleanups the multiple inclusion by leaving the ifunc
    implementation to undef the weak_alias.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	[BZ #5806]
    	* string/memrchr.c: Use string-fzb.h, string-fzi.h.
    	* sysdeps/i386/i686/multiarch/memrchr-c.c: Redefined weak_alias.
    	* sysdeps/s390/multiarch/memrchr-c.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=01c324107db27294a46141414fdc1094686f5508

commit 01c324107db27294a46141414fdc1094686f5508
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:19:12 2017 -0200

    string: Improve generic memchr
    
    New algorithm have the following key differences:
    
      - Reads first word unaligned and use string-maskoff function to
        remove unwanted data.  This strategy follow assemble optimized
        ones for aarch64, powerpc and tile.
    
      - Use string-fz{b,i} and string-opthr functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
    	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c1dd2be8155f7277d5604f5d91fc2d4a0fc03fd4

commit c1dd2be8155f7277d5604f5d91fc2d4a0fc03fd4
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:18:24 2017 -0200

    string: Improve generic strlen
    
    New algorithm have the following key differences:
    
      - Reads first word unaligned and use string-maskoff functions to
        remove unwanted data.  This strategy follow assemble optimized
        ones for powerpc, sparc, and SH.
    
      - Use of has_zero and index_first_zero parametrized functions.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
    and sparcv9-linux-gnu by removing the arch-specific assembly
    implementation and disabling multi-arch (it covers both LE and BE
    for 64 and 32 bits).
    
    	[BZ #5806]
        	* string/strlen.c: Use them.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4725ea128cef7de4faf023898b2b04f9176a2604

commit 4725ea128cef7de4faf023898b2b04f9176a2604
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Jan 8 16:41:43 2018 -0200

    Add string vectorized find and detection functions
    
    This patch adds generic string find and detection implementation meant
    to be used in generic vectorized string implementation.  The idea is to
    decompose the basic string operation so each architecture can reimplement
    if it provides any specialized hardware instruction.
    
    The 'string-fza.h' provides zero byte detection functions (find_zero_low,
    find_zero_all, find_eq_low, find_eq_all, find_zero_eq_low, find_zero_eq_all,
    find_zero_ne_low, and find_zero_ne_all).  They are used on both functions
    provided by 'string-fzb.h' and 'string-fzi.h'.
    
    The 'string-fzb.h' provides boolean zero byte detection with the
    functions:
    
      - has_zero: determine if any byte within a word is zero.
      - has_eq: determine byte equality between two words.
      - has_zero_eq: determine if any byte within a word is zero along with
        byte equality between two words.
    
    The 'string-fzi.h' provides zero byte detection along with its positions:
    
      - index_first_zero: return index of first zero byte within a word.
      - index_first_eq: return index of first byte different between two words.
      - index_first_zero_eq: return index of first zero byte within a word or
        first byte different between two words.
      - index_first_zero_ne: return index of first zero byte within a word or
        first byte equal between two words.
      - index_last_zero: return index of last zero byte within a word.
      - index_last_eq: return index of last byte different between two words.
    
    Also, to avoid libcalls in the '__builtin_c{t,l}z{l}' calls (which may
    add performance degradation), inline implementation based on De Bruijn
    sequences are added (enabled by a configure check).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* config.h.in (HAVE_BUILTIN_CTZ, HAVE_BUILTIN_CLZ): New defines.
    	* configure.ac: Check for __builtin_ctz{l} with no external
    	dependencies
    	* sysdeps/generic/string-extbyte.h: New file.
    	* sysdeps/generic/string-fza.h: Likewise.
    	* sysdeps/generic/string-fzb.h: Likewise.
    	* sysdeps/generic/string-fzi.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eed029d7eeafd01c8488b4cbadef4ec9dad10164

commit eed029d7eeafd01c8488b4cbadef4ec9dad10164
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Feb 23 18:45:54 2017 -0300

    Add string-maskoff.h generic header
    
    Macros to operate on unaligned access for string operations:
    
      - create_mask: create a mask based on pointer alignment to sets up
        non-zero bytes before the beginning of the word so a following
        operation (such as find zero) might ignore these bytes.
    
      - highbit_mask: create a mask with high bit of each byte being 1,
        and the low 7 bits being all the opposite of the input.
    
    These macros are meant to be used on optimized vectorized string
    implementations.
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-maskoff.h: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=aedf9dd6eb0995c4f805b0e1a76509ad0379a46d

commit aedf9dd6eb0995c4f805b0e1a76509ad0379a46d
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:15:27 2017 -0200

    Parameterize OP_T_THRES from memcopy.h
    
    Basically it moves OP_T_THRES out of memcopy.h to its own header
    and adjust each architecture that redefines it.
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/generic/string-opthr.h: ... here; new file.
    	* sysdeps/i386/memcopy.h (OP_T_THRES): Move...
    	* sysdeps/i386/string-opthr.h: ... here; new file.
    	* sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
    	* string/memcmp.c (OP_T_THRES): Remove definition.
    	* sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e5364d387c6a5436015f9083a2b26d68eab0b2ec

commit e5364d387c6a5436015f9083a2b26d68eab0b2ec
Author: Richard Henderson <rth@twiddle.net>
Date:   Thu Feb 16 16:14:09 2017 -0200

    Parameterize op_t from memcopy.h
    
    Basically moves op_t definition out to an specific header, adds
    the attribute 'may-alias', and cleanup its duplicated definitions.
    It lead to inclusion of tilegx32 gmp-mparam.h similar to x32 so
    op_t can be define as a long long (from _LONG_LONG_LIMB).
    
    Checked with a build and check with run-built-tests=no for all major
    Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
    mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64,
    tilegx, and x86_64).
    
    	Richard Henderson  <rth@twiddle.net>
    	Adhemerval Zanella  <adhemerval.zanella@linaro.org>
    
    	* sysdeps/generic/string-optype.h: New file.
    	* sysdeps/generic/memcopy.h: Include it.
    	* string/memcmp.c (op_t): Remove define.
    	* sysdeps/tile/memcmp.c (op_t): Likewise.
    	* sysdeps/tile/memcopy.h (op_t): Likewise.
    	* sysdeps/tile/tilegx32/gmp-mparam.h: New file.

-----------------------------------------------------------------------
Comment 13 Adhemerval Zanella 2018-10-02 22:48:30 UTC
(In reply to joseph@codesourcery.com from comment #11)
> In my view, the right approach for this issue is a general overhaul of the 
> generic string functions as in Richard Henderson's / Adhemerval Zanella's 
> patchset <https://sourceware.org/ml/libc-alpha/2018-01/msg00318.html> 
> (that patchset may not address all instances of that comment, but it 
> provides the infrastructure for doing so).

I updated my personal branch, rebased against master (the rebase removed the tile changes and fixed some build issues).
Comment 14 Adhemerval Zanella 2023-07-30 12:56:18 UTC
The original comment is no longer present in the generic implementation, only in some asm optimizations for i386 (which I think we can just remove).