This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [COMMITED] faster memcpy on x64.
- From: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: Andreas Jaeger <aj at suse dot com>, GNU C Library <libc-alpha at sourceware dot org>, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Thu, 29 Aug 2013 18:45:13 +0400
- Subject: Re: [COMMITED] faster memcpy on x64.
- Authentication-results: sourceware.org; auth=none
- References: <20130427221620 dot GA16537 at domone dot kolej dot mff dot cuni dot cz> <518BB251 dot 7040602 at suse dot com> <CAHjhQ93bxAexzbymP6GN-08wLiu9mdxf2MoCXgqA1v-ONYJdFw at mail dot gmail dot com> <CAHjhQ90BGDy1XVWUQui8Tx7PzO0Y6pUAEBXHxTqXH8NAbBGvHw at mail dot gmail dot com> <51927FB1 dot 1070904 at suse dot com> <20130520081458 dot GB814 at domone dot kolej dot mff dot cuni dot cz>
It looks like there is some confusion in the merged patch. I think it is
supposed to test a different flag (at least that looks more logical); you
then only need to turn that flag on for Bulldozer, or whichever AMD
machines this version is also good for.
diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
index a1e5031..f6a44d2 100644
--- a/sysdeps/x86_64/multiarch/memcpy.S
+++ b/sysdeps/x86_64/multiarch/memcpy.S
@@ -33,8 +33,8 @@ ENTRY(__new_memcpy)
 jne 1f
 call __init_cpu_features
 1: leaq __memcpy_sse2(%rip), %rax
- testl $bit_Slow_BSF, __cpu_features+FEATURE_OFFSET+index_Slow_BSF(%rip)
- jnz 2f
+ testl $bit_Fast_Unaligned_Load, __cpu_features+FEATURE_OFFSET+index_Fast_Unaligned_Load(%rip)
+ jz 2f
 leaq __memcpy_sse2_unaligned(%rip), %rax
 ret
 2: testl $bit_SSSE3, __cpu_features+CPUID_OFFSET+index_SSSE3(%rip)
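
To make the intended selection order easier to follow, here is a rough C
sketch of what the selector would do with the flag changed as above. It is
only an illustration: struct cpu_flags, enum memcpy_impl and select_memcpy
are hypothetical names, not glibc code, and the branch taken after the SSSE3
test is not visible in the hunk, so the sketch assumes it picks
__memcpy_ssse3 and otherwise keeps the __memcpy_sse2 default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits the selector reads from __cpu_features. */
struct cpu_flags {
    bool fast_unaligned_load;  /* models bit_Fast_Unaligned_Load */
    bool ssse3;                /* models bit_SSSE3 */
};

/* Hypothetical labels for the three implementations named in the patch. */
enum memcpy_impl {
    MEMCPY_SSE2,            /* __memcpy_sse2 (default loaded at label 1) */
    MEMCPY_SSE2_UNALIGNED,  /* __memcpy_sse2_unaligned */
    MEMCPY_SSSE3            /* __memcpy_ssse3 (assumed result of the SSSE3 test) */
};

enum memcpy_impl select_memcpy(const struct cpu_flags *f)
{
    assert(f != NULL);

    /* testl $bit_Fast_Unaligned_Load, ...; jz 2f
       Flag set: fall through and return the unaligned SSE2 version. */
    if (f->fast_unaligned_load)
        return MEMCPY_SSE2_UNALIGNED;

    /* 2: testl $bit_SSSE3, ...
       Assumption: with SSSE3 available the SSSE3 version is chosen. */
    if (f->ssse3)
        return MEMCPY_SSSE3;

    /* Neither flag set: keep the default loaded before the first test. */
    return MEMCPY_SSE2;
}
```

With this ordering a CPU that reports Fast_Unaligned_Load gets
__memcpy_sse2_unaligned regardless of SSSE3, which is why the flag would
need to be set for every machine where that version wins.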
You also forgot to remove the version that is now never used as memcpy
from the ifunc-impl-list:
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 28d3579..d6a7f4f 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -224,8 +224,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   /* Support sysdeps/x86_64/multiarch/memcpy.S. */
   IFUNC_IMPL (i, name, memcpy,
-              IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3,
-                              __memcpy_ssse3_back)
               IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3, __memcpy_ssse3)
               IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2_unaligned)
               IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2))
--
Liubov
On Mon, May 20, 2013 at 12:14 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> Commited.