This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S


https://sourceware.org/bugzilla/show_bug.cgi?id=19776

--- Comment #10 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, hjl/erms/hybrid has been created
        at  6ce43b9abbfa089e80ae8fa3508b7ecdfb56c265 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6ce43b9abbfa089e80ae8fa3508b7ecdfb56c265

commit 6ce43b9abbfa089e80ae8fa3508b7ecdfb56c265
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 12:36:03 2016 -0700

    Add Hybrid_ERMS and use it in memset.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9ad76210d60ab512bd329824ccf24add63c6a4b7

commit 9ad76210d60ab512bd329824ccf24add63c6a4b7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:34:07 2016 -0700

    Add __memset_avx2_erms and __memset_chk_avx2_erms

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eeedc55e73367131c49cc84749006fd7cf0d2f41

commit eeedc55e73367131c49cc84749006fd7cf0d2f41
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:27:58 2016 -0700

    Add avx_unaligned_erms versions of memcpy/mempcpy

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=21a60509c4d164f7c2ac21add165e9868da0aab0

commit 21a60509c4d164f7c2ac21add165e9868da0aab0
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:07:48 2016 -0700

    Remove mempcpy-*.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=dce73065103d970a518ea7a869e218864c946198

commit dce73065103d970a518ea7a869e218864c946198
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 13:37:31 2016 -0800

    Merge memcpy with mempcpy

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ad023b590b32dcde6d1b641d41758e531290b6fd

commit ad023b590b32dcde6d1b641d41758e531290b6fd
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 09:22:56 2016 -0700

    Add __memset_sse2_erms and __memset_chk_sse2_erms

        * sysdeps/x86_64/memset.S (__memset_chk_sse2_erms): New
        function.
        (__memset_sse2_erms): Likewise.
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memset_chk_sse2_erms and
        __memset_sse2_erms.
        * sysdeps/x86_64/sysdep.h (REP_STOSB_THRESHOLD): New.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f0bc5d68ff40abe2fe3ea1f2515811f9636f8ac9

commit f0bc5d68ff40abe2fe3ea1f2515811f9636f8ac9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 08:32:05 2016 -0700

    Add sse2_unaligned_erms versions of memcpy/mempcpy

        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned_erms,
        __memcpy_sse2_unaligned_erms, __mempcpy_chk_sse2_unaligned_erms
        and __mempcpy_sse2_unaligned_erms.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned_erms): New function.
        (__mempcpy_sse2_unaligned_erms): Likewise.
        (__memcpy_chk_sse2_unaligned_erms): Likewise.
        (__memcpy_sse2_unaligned_erms): Likewise.
        * sysdeps/x86_64/sysdep.h (REP_MOVSB_THRESHOLD): New.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=438ed4b59ada763de874ba3be10bb18ea52d5643

commit 438ed4b59ada763de874ba3be10bb18ea52d5643
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:47:26 2016 -0800

    Enable __memcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __memcpy_chk_sse2 if SSSE3 isn't available.
    4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __memcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Check
        Fast_Unaligned_Load to enable __memcpy_chk_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fe0ea5b0874b4ae4255c6ad3b2e0f1aaf94a97ff

commit fe0ea5b0874b4ae4255c6ad3b2e0f1aaf94a97ff
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:44:58 2016 -0800

    Enable __mempcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __mempcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_chk_sse2 if SSSE3 isn't available.
    4. __mempcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
        Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=175e5a4dcf431f9820dd483b0ba93d165d8652b2

commit 175e5a4dcf431f9820dd483b0ba93d165d8652b2
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:42:46 2016 -0800

    Enable __mempcpy_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_sse2_unaligned.  The new
    selection order is:

    1. __mempcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_sse2 if SSSE3 isn't available.
    4. __mempcpy_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Check
        Fast_Unaligned_Load to enable __mempcpy_sse2_unaligned.
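
The selection order listed in the commit above can be sketched in C as a chain of checks. This is only an illustration: the real selector is written in assembly in mempcpy.S, and the `cpu_flags` struct and `select_mempcpy` function are hypothetical names introduced here, with fields that mirror the feature bits named in the commit message.

```c
#include <stdbool.h>

/* Hypothetical flag set; each field mirrors a bit named in the commit
   message.  This is not glibc's cpu_features structure.  */
struct cpu_flags
{
  bool avx_fast_unaligned_load;
  bool fast_unaligned_load;
  bool ssse3;
  bool fast_copy_backward;
};

/* Return the name of the variant the listed order would pick.
   The first matching rule wins, as in the numbered list.  */
const char *
select_mempcpy (const struct cpu_flags *f)
{
  if (f->avx_fast_unaligned_load)
    return "__mempcpy_avx_unaligned";
  if (f->fast_unaligned_load)
    return "__mempcpy_sse2_unaligned";
  if (!f->ssse3)
    return "__mempcpy_sse2";
  if (f->fast_copy_backward)
    return "__mempcpy_ssse3_back";
  return "__mempcpy_ssse3";
}
```

Under this sketch, a CPU with Fast_Unaligned_Load set gets the sse2_unaligned variant even when SSSE3 is available, which is the behavior change the commit describes.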

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=32920004ade2b9fd8ac3b3a0e177c0687b536a04

commit 32920004ade2b9fd8ac3b3a0e177c0687b536a04
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 17:06:41 2016 -0800

    Add entry points for __mempcpy_sse2_unaligned and _chk functions

    Add entry points for __mempcpy_chk_sse2_unaligned,
    __mempcpy_sse2_unaligned and __memcpy_chk_sse2_unaligned.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned,
        __mempcpy_chk_sse2_unaligned and __mempcpy_sse2_unaligned.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned): New.
        (__mempcpy_sse2_unaligned): Likewise.
        (__memcpy_chk_sse2_unaligned): Likewise.
        (L(start)): New label.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=12cf609fa5ffb50af921f634c6e548b50960948d

commit 12cf609fa5ffb50af921f634c6e548b50960948d
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 16:52:53 2016 -0800

    Remove L(overlapping) from memcpy-sse2-unaligned.S

    Since memcpy doesn't need to handle overlapping source and
    destination, we can remove L(overlapping).

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (L(overlapping)): Removed.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=849e297b6803c989f33c05f5be8f93dbb3274bb1

commit 849e297b6803c989f33c05f5be8f93dbb3274bb1
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 13:46:54 2016 -0800

    Don't use RAX as scratch register

    To prepare for sharing code with mempcpy, don't use RAX as a scratch
    register, so that RAX can be set to the return value on entry.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Don't use
        RAX as scratch register.
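
A rough C analogue of why the return register must stay untouched: memcpy returns the destination pointer while mempcpy returns the destination plus the length, so if the return value is fixed on entry, both can run the same copy body. The names `copy_body`, `sketch_memcpy` and `sketch_mempcpy` are hypothetical, and the byte loop only stands in for the real SSE2 code.

```c
#include <stddef.h>

/* Hypothetical shared body.  RET plays the role of RAX: it is set by
   the caller before the copy starts and is never modified here.  */
static void *
copy_body (void *ret, void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return ret;
}

/* memcpy-style entry: the return value is the destination.  */
void *
sketch_memcpy (void *dst, const void *src, size_t n)
{
  return copy_body (dst, dst, src, n);
}

/* mempcpy-style entry: the return value is one past the last byte
   written.  */
void *
sketch_mempcpy (void *dst, const void *src, size_t n)
{
  return copy_body ((unsigned char *) dst + n, dst, src, n);
}
```

The two entry points differ only in the value they compute before entering the shared body, which is the property the commit prepares for.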

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4891daae057485dfaa319971afd1d4fe52fbc929

commit 4891daae057485dfaa319971afd1d4fe52fbc929
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 14:16:32 2016 -0800

    Remove dead code from memcpy-sse2-unaligned.S

    The function starts with

    ENTRY(__memcpy_sse2_unaligned)
       movq  %rsi, %rax
       leaq  (%rdx,%rdx), %rcx
       subq  %rdi, %rax
       subq  %rdx, %rax
       cmpq  %rcx, %rax
       jb L(overlapping)

    When the branch to L(overlapping) is taken, the branch in

       cmpq  %rsi, %rdi
       jae   .L3

    will never be taken.  We can remove the dead code.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S (.L3): Removed.
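
The dead-code argument above can be checked with a small C model of the two comparisons. The entry check computes src - dst - n and compares it, unsigned, against 2n, so the branch is taken only when src - dst lies in [n, 3n), which implies src > dst. The function names below are hypothetical, and the exhaustive loop only covers small values, on the assumption that real addresses are far from wrapping around 2^64.

```c
#include <stdint.h>

/* Models: rax = rsi - rdi - rdx; rcx = 2 * rdx; jb taken if rax < rcx
   (unsigned).  dst is RDI, src is RSI, n is RDX.  */
int
entry_branch_taken (uint64_t dst, uint64_t src, uint64_t n)
{
  uint64_t rax = src - dst - n;
  uint64_t rcx = n + n;
  return rax < rcx;
}

/* Models: cmpq %rsi, %rdi; jae .L3, taken if dst >= src (unsigned).  */
int
l3_branch_taken (uint64_t dst, uint64_t src)
{
  return dst >= src;
}

/* Count inputs below LIMIT for which both branches would be taken.
   A nonzero result would contradict the dead-code claim.  */
unsigned int
count_counterexamples (uint64_t limit)
{
  unsigned int bad = 0;
  for (uint64_t dst = 0; dst < limit; dst++)
    for (uint64_t src = 0; src < limit; src++)
      for (uint64_t n = 0; n < limit; n++)
        if (entry_branch_taken (dst, src, n) && l3_branch_taken (dst, src))
          bad++;
  return bad;
}
```

Calling `count_counterexamples (64)` finds no case where both branches are taken, consistent with the commit's reasoning.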

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=735a17465774fc651dfe26cf27e3cf6cdf1883f6

commit 735a17465774fc651dfe26cf27e3cf6cdf1883f6
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Apr 11 08:51:16 2014 -0700

    Test 32-bit ERMS memcpy/memset

        * sysdeps/i386/i686/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Add __bcopy_erms, __bzero_erms,
        __memmove_chk_erms, __memmove_erms, __memset_chk_erms,
        __memset_erms, __memcpy_chk_erms, __memcpy_erms,
        __mempcpy_chk_erms and __mempcpy_erms.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=665d19ea1eaa11d1a37d2e8fe679f92075583bf8

commit 665d19ea1eaa11d1a37d2e8fe679f92075583bf8
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Apr 11 08:25:17 2014 -0700

    Test 64-bit ERMS memcpy/memset

        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Add __memmove_chk_erms,
        __memmove_erms, __memset_erms, __memset_chk_erms,
        __memcpy_chk_erms, __memcpy_erms, __mempcpy_chk_erms and
        __mempcpy_erms.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ef0a94fe0d1f51db1532b7f530841bf5ad2e2ba5

commit ef0a94fe0d1f51db1532b7f530841bf5ad2e2ba5
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Sep 21 15:21:28 2011 -0700

    Add 32-bit ERMS memcpy/memset

        * sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
        bcopy-erms, memcpy-erms, memmove-erms, mempcpy-erms, bzero-erms
        and memset-erms.
        * sysdeps/i386/i686/multiarch/bcopy-erms.S: New file.
        * sysdeps/i386/i686/multiarch/bzero-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/ifunc-defines.sym: Add
        COMMON_CPUID_INDEX_7.
        * sysdeps/i386/i686/multiarch/bcopy.S: Enable ERMS optimization
        for Fast_ERMS.
        * sysdeps/i386/i686/multiarch/bzero.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset_chk.S: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ef81a3a93a95dd0017fcefe4207306d6875c3794

commit ef81a3a93a95dd0017fcefe4207306d6875c3794
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Sep 15 16:16:10 2011 -0700

    Add 64-bit ERMS memcpy and memset

        * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
        memcpy-erms, mempcpy-erms, memmove-erms and memset-erms.
        * sysdeps/x86_64/multiarch/memcpy-erms.S: New.
        * sysdeps/x86_64/multiarch/memmove-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/memset-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/memcpy.S: Enable ERMS optimization
        for Fast_ERMS.
        * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
        * sysdeps/x86_64/multiarch/memmove.c: Likewise.
        * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
        * sysdeps/x86_64/multiarch/memset.S: Likewise.
        * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=cd8fa6c74a5f122a242c412b5ea526a31f13855a

commit cd8fa6c74a5f122a242c412b5ea526a31f13855a
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Sep 15 15:47:01 2011 -0700

    Initial ERMS support

        * sysdeps/x86/cpu-features.h (bit_arch_Fast_ERMS): New.
        (bit_cpu_ERMS): Likewise.
        (index_cpu_ERMS): Likewise.
        (index_arch_Fast_ERMS): Likewise.
        (reg_ERMS): Likewise.
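
For readers unfamiliar with the ERMS feature bit: it is reported by CPUID leaf 7, subleaf 0, in EBX bit 9. The sketch below queries it from user code through the compiler's `<cpuid.h>` helpers (GCC and Clang). It is not glibc's cpu-features code; `erms_bit_from_ebx` and `cpu_has_erms` are hypothetical names, and how glibc derives Fast_ERMS from the raw bit is not shown in this log.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>
#endif

/* ERMS: CPUID.(EAX=7,ECX=0):EBX bit 9.  */
#define ERMS_EBX_BIT (1u << 9)

/* Extract the ERMS bit from an EBX value returned by CPUID leaf 7.  */
int
erms_bit_from_ebx (uint32_t ebx)
{
  return (ebx & ERMS_EBX_BIT) != 0;
}

/* Return 1 if the running CPU reports ERMS, 0 otherwise.  On non-x86
   targets this always returns 0.  */
int
cpu_has_erms (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  /* Leaf 7 must be supported before it can be queried.  */
  if (__get_cpuid_max (0, 0) < 7)
    return 0;
  __cpuid_count (7, 0, eax, ebx, ecx, edx);
  return erms_bit_from_ebx (ebx);
#else
  return 0;
#endif
}
```

Keeping the bit test separate from the CPUID query makes the mask easy to verify without depending on the host CPU.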

-----------------------------------------------------------------------

-- 
You are receiving this mail because:
You are on the CC list for the bug.