This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S


https://sourceware.org/bugzilla/show_bug.cgi?id=19776

--- Comment #9 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, hjl/erms/hybrid has been created
        at  0debc67b0128dab2a524b44387ff848bc01480bb (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0debc67b0128dab2a524b44387ff848bc01480bb

commit 0debc67b0128dab2a524b44387ff848bc01480bb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 09:22:56 2016 -0700

    Add __memset_sse2_erms and __memset_chk_sse2_erms

        * sysdeps/x86_64/memset.S (__memset_chk_sse2_erms): New
        function.
        (__memset_sse2_erms): Likewise.
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memset_chk_sse2_erms and
        __memset_sse2_erms.
        * sysdeps/x86_64/sysdep.h (REP_STOSB_THRESHOLD): New.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f47ec52f2506c944b514c7da14b54d6aee485350

commit f47ec52f2506c944b514c7da14b54d6aee485350
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 08:32:05 2016 -0700

    Add sse2_unaligned_erms versions of memcpy/mempcpy

        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned_erms,
        __memcpy_sse2_unaligned_erms, __mempcpy_chk_sse2_unaligned_erms
        and __mempcpy_sse2_unaligned_erms.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned_erms): New function.
        (__mempcpy_sse2_unaligned_erms): Likewise.
        (__memcpy_chk_sse2_unaligned_erms): Likewise.
        (__memcpy_sse2_unaligned_erms): Likewise.
        * sysdeps/x86_64/sysdep.h (REP_MOVSB_THRESHOLD): New.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=53363d1b76a45543f1ac9c1854a17d0a90bb3cba

commit 53363d1b76a45543f1ac9c1854a17d0a90bb3cba
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:47:26 2016 -0800

    Enable __memcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __memcpy_chk_sse2 if SSSE3 isn't available.
    4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __memcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Check
        Fast_Unaligned_Load to enable __memcpy_chk_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9615edc8835cb28b09d2941df19919ac7da29a38

commit 9615edc8835cb28b09d2941df19919ac7da29a38
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:44:58 2016 -0800

    Enable __mempcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __mempcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_chk_sse2 if SSSE3 isn't available.
    4. __mempcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
        Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e497a68bd68d08a2fffadbd9a5ed0c082cdc62e9

commit e497a68bd68d08a2fffadbd9a5ed0c082cdc62e9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:42:46 2016 -0800

    Enable __mempcpy_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_sse2_unaligned.  The new
    selection order is:

    1. __mempcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_sse2 if SSSE3 isn't available.
    4. __mempcpy_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Check
        Fast_Unaligned_Load to enable __mempcpy_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f3397461af3990d1b147c7110b3fba6449b36400

commit f3397461af3990d1b147c7110b3fba6449b36400
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 17:06:41 2016 -0800

    Add entry points for __mempcpy_sse2_unaligned and _chk functions

    Add entry points for __mempcpy_chk_sse2_unaligned,
    __mempcpy_sse2_unaligned and __memcpy_chk_sse2_unaligned.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned,
        __mempcpy_chk_sse2_unaligned and __mempcpy_sse2_unaligned.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned): New.
        (__mempcpy_sse2_unaligned): Likewise.
        (__memcpy_chk_sse2_unaligned): Likewise.
        (L(start)): New label.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e923b259dfd17705954d098370b399c24dcef2cf

commit e923b259dfd17705954d098370b399c24dcef2cf
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 16:52:53 2016 -0800

    Remove L(overlapping) from memcpy-sse2-unaligned.S

    Since memcpy doesn't need to check for overlapping source and
    destination, we can remove L(overlapping).

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (L(overlapping)): Removed.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=af82e9a0269e61b459aaf071d3f93b35ecb11e9e

commit af82e9a0269e61b459aaf071d3f93b35ecb11e9e
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 13:46:54 2016 -0800

    Don't use RAX as scratch register

    To prepare for sharing code with mempcpy, don't use RAX as a scratch
    register so that RAX can be set to the return value on entry.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Don't use
        RAX as scratch register.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d700853c5270817df932f319917467931f433c41

commit d700853c5270817df932f319917467931f433c41
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 14:16:32 2016 -0800

    Remove dead code from memcpy-sse2-unaligned.S

    memcpy-sse2-unaligned.S contains

    ENTRY(__memcpy_sse2_unaligned)
       movq  %rsi, %rax
       leaq  (%rdx,%rdx), %rcx
       subq  %rdi, %rax
       subq  %rdx, %rax
       cmpq  %rcx, %rax
       jb L(overlapping)

    When the branch to L(overlapping) is taken, the subsequent

       cmpq  %rsi, %rdi
       jae   .L3

    will never be taken.  We can remove the dead code.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S (.L3): Removed.

-----------------------------------------------------------------------

-- 
You are receiving this mail because:
You are on the CC list for the bug.
