This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.
[Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
- From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Fri, 18 Mar 2016 16:35:02 +0000
- Subject: [Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
- Auto-submitted: auto-generated
- References: <bug-19776-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=19776
--- Comment #9 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".
The branch, hjl/erms/hybrid has been created
at 0debc67b0128dab2a524b44387ff848bc01480bb (commit)
- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0debc67b0128dab2a524b44387ff848bc01480bb
commit 0debc67b0128dab2a524b44387ff848bc01480bb
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 09:22:56 2016 -0700
Add __memset_sse2_erms and __memset_chk_sse2_erms
* sysdeps/x86_64/memset.S (__memset_chk_sse2_erms): New
function.
(__memset_sse2_erms): Likewise.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memset_chk_sse2_erms and
__memset_sse2_erms.
* sysdeps/x86_64/sysdep.h (REP_STOSB_THRESHOLD): New.
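The __memset_sse2_erms variant added here relies on ERMS (Enhanced REP MOVSB/STOSB): for buffers above REP_STOSB_THRESHOLD the whole fill can be one "rep stosb". A minimal sketch in C with GCC inline assembly (x86-64 only; the function name and threshold value below are illustrative stand-ins, not glibc's actual code, and the real REP_STOSB_THRESHOLD value is not quoted in this mail):

```c
#include <stddef.h>

/* Hypothetical threshold; the real one is REP_STOSB_THRESHOLD in
   sysdeps/x86_64/sysdep.h.  */
#define REP_STOSB_THRESHOLD_SKETCH 2048

/* Fill N bytes at DST with byte C using "rep stosb": stores AL to
   [RDI] RCX times, incrementing RDI.  The return value is saved
   before the asm clobbers DST.  */
static void *memset_erms_sketch(void *dst, int c, size_t n)
{
    void *ret = dst;
    __asm__ volatile ("rep stosb"
                      : "+D" (dst), "+c" (n)
                      : "a" ((unsigned char) c)
                      : "memory");
    return ret;
}
```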
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f47ec52f2506c944b514c7da14b54d6aee485350
commit f47ec52f2506c944b514c7da14b54d6aee485350
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 08:32:05 2016 -0700
Add sse2_unaligned_erms versions of memcpy/mempcpy
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned_erms,
__memcpy_sse2_unaligned_erms, __mempcpy_chk_sse2_unaligned_erms
and __mempcpy_sse2_unaligned_erms.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(__mempcpy_chk_sse2_unaligned_erms): New function.
(__mempcpy_sse2_unaligned_erms): Likewise.
(__memcpy_chk_sse2_unaligned_erms): Likewise.
(__memcpy_sse2_unaligned_erms): Likewise.
* sysdeps/x86_64/sysdep.h (REP_MOVSB_THRESHOLD): New.
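The _erms copy variants follow the same idea for memcpy/mempcpy: above REP_MOVSB_THRESHOLD the copy collapses to one "rep movsb". A hedged sketch (x86-64 only; the function name is hypothetical and the real threshold value is not quoted in this mail):

```c
#include <stddef.h>

/* Hypothetical threshold; the real one is REP_MOVSB_THRESHOLD in
   sysdeps/x86_64/sysdep.h.  */
#define REP_MOVSB_THRESHOLD_SKETCH 2048

/* Copy N bytes from SRC to DST with "rep movsb": moves RCX bytes from
   [RSI] to [RDI], incrementing both.  DST is saved first so the
   memcpy-style return value survives the asm.  */
static void *memcpy_erms_sketch(void *dst, const void *src, size_t n)
{
    void *ret = dst;
    __asm__ volatile ("rep movsb"
                      : "+D" (dst), "+S" (src), "+c" (n)
                      :
                      : "memory");
    return ret;
}
```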
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=53363d1b76a45543f1ac9c1854a17d0a90bb3cba
commit 53363d1b76a45543f1ac9c1854a17d0a90bb3cba
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:47:26 2016 -0800
Enable __memcpy_chk_sse2_unaligned
Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
selection order is:
1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_chk_sse2 if SSSE3 isn't available.
4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
5. __memcpy_chk_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Check
Fast_Unaligned_Load to enable __memcpy_chk_sse2_unaligned.
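The five-step order above can be transcribed as a small C selector. The struct, enum, and function names below are hypothetical stand-ins for glibc's cpu-features bits and IFUNC resolver, shown only to make the priority order explicit:

```c
/* Hypothetical feature flags standing in for glibc's cpu-features bits.  */
struct cpu_features_sketch {
    int avx_fast_unaligned_load;   /* AVX_Fast_Unaligned_Load */
    int fast_unaligned_load;       /* Fast_Unaligned_Load */
    int ssse3;                     /* SSSE3 available */
    int fast_copy_backward;        /* Fast_Copy_Backward */
};

enum impl_sketch {
    IMPL_AVX_UNALIGNED,    /* __memcpy_chk_avx_unaligned */
    IMPL_SSE2_UNALIGNED,   /* __memcpy_chk_sse2_unaligned */
    IMPL_SSE2,             /* __memcpy_chk_sse2 */
    IMPL_SSSE3_BACK,       /* __memcpy_chk_ssse3_back */
    IMPL_SSSE3,            /* __memcpy_chk_ssse3 */
};

/* Mirror of the 5-step selection order quoted in the commit message.  */
static enum impl_sketch select_memcpy_chk(const struct cpu_features_sketch *f)
{
    if (f->avx_fast_unaligned_load)
        return IMPL_AVX_UNALIGNED;
    if (f->fast_unaligned_load)
        return IMPL_SSE2_UNALIGNED;
    if (!f->ssse3)
        return IMPL_SSE2;
    if (f->fast_copy_backward)
        return IMPL_SSSE3_BACK;
    return IMPL_SSSE3;
}
```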
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9615edc8835cb28b09d2941df19919ac7da29a38
commit 9615edc8835cb28b09d2941df19919ac7da29a38
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:44:58 2016 -0800
Enable __mempcpy_chk_sse2_unaligned
Check Fast_Unaligned_Load for __mempcpy_chk_sse2_unaligned. The new
selection order is:
1. __mempcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __mempcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __mempcpy_chk_sse2 if SSSE3 isn't available.
4. __mempcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
5. __mempcpy_chk_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e497a68bd68d08a2fffadbd9a5ed0c082cdc62e9
commit e497a68bd68d08a2fffadbd9a5ed0c082cdc62e9
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:42:46 2016 -0800
Enable __mempcpy_sse2_unaligned
Check Fast_Unaligned_Load for __mempcpy_sse2_unaligned. The new
selection order is:
1. __mempcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __mempcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __mempcpy_sse2 if SSSE3 isn't available.
4. __mempcpy_ssse3_back if Fast_Copy_Backward bit is set.
5. __mempcpy_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Check
Fast_Unaligned_Load to enable __mempcpy_sse2_unaligned.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f3397461af3990d1b147c7110b3fba6449b36400
commit f3397461af3990d1b147c7110b3fba6449b36400
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 17:06:41 2016 -0800
Add entry points for __mempcpy_sse2_unaligned and _chk functions
Add entry points for __mempcpy_chk_sse2_unaligned,
__mempcpy_sse2_unaligned and __memcpy_chk_sse2_unaligned.
[BZ #19776]
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned,
__mempcpy_chk_sse2_unaligned and __mempcpy_sse2_unaligned.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(__mempcpy_chk_sse2_unaligned): New.
(__mempcpy_sse2_unaligned): Likewise.
(__memcpy_chk_sse2_unaligned): Likewise.
(L(start)): New label.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e923b259dfd17705954d098370b399c24dcef2cf
commit e923b259dfd17705954d098370b399c24dcef2cf
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 16:52:53 2016 -0800
Remove L(overlapping) from memcpy-sse2-unaligned.S
Since memcpy is not required to handle overlapping source and destination
buffers, we can remove L(overlapping).
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(L(overlapping)): Removed.
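The rationale: ISO C makes memcpy with overlapping buffers undefined, so only memmove must handle overlap and the implementation never needs an overlap check. A small standalone illustration (standard C, nothing glibc-specific):

```c
#include <string.h>

/* memcpy's contract (C11 7.24.2.1) forbids overlapping buffers, so an
   implementation may skip any overlap check; callers that need an
   overlapping copy must use memmove, which is defined for it.  */
int overlap_demo(void)
{
    char buf[] = "abcdef";
    /* Overlapping one-byte shift: only memmove is defined here.  */
    memmove(buf + 1, buf, 5);   /* buf becomes "aabcde" */
    return buf[1] == 'a' && buf[5] == 'e';
}
```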
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=af82e9a0269e61b459aaf071d3f93b35ecb11e9e
commit af82e9a0269e61b459aaf071d3f93b35ecb11e9e
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 13:46:54 2016 -0800
Don't use RAX as scratch register
To prepare for sharing code with mempcpy, stop using RAX as a scratch
register so that RAX can be set to the return value on entry.
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Don't use
RAX as scratch register.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d700853c5270817df932f319917467931f433c41
commit d700853c5270817df932f319917467931f433c41
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 14:16:32 2016 -0800
Remove dead code from memcpy-sse2-unaligned.S
There are

    ENTRY(__memcpy_sse2_unaligned)
        movq    %rsi, %rax
        leaq    (%rdx,%rdx), %rcx
        subq    %rdi, %rax
        subq    %rdx, %rax
        cmpq    %rcx, %rax
        jb      L(overlapping)

When the branch to L(overlapping) is taken, the subsequent

    cmpq    %rsi, %rdi
    jae     .L3

branch is never taken, so .L3 is dead code and we can remove it.
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S (.L3): Removed.
-----------------------------------------------------------------------
--
You are receiving this mail because:
You are on the CC list for the bug.