This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [COMMITED] faster memcpy on x64.
- From: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: Andreas Jaeger <aj at suse dot com>, GNU C Library <libc-alpha at sourceware dot org>, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Thu, 29 Aug 2013 18:54:21 +0400
- Subject: Re: [COMMITED] faster memcpy on x64.
- Authentication-results: sourceware.org; auth=none
- References: <20130427221620 dot GA16537 at domone dot kolej dot mff dot cuni dot cz> <518BB251 dot 7040602 at suse dot com> <CAHjhQ93bxAexzbymP6GN-08wLiu9mdxf2MoCXgqA1v-ONYJdFw at mail dot gmail dot com> <CAHjhQ90BGDy1XVWUQui8Tx7PzO0Y6pUAEBXHxTqXH8NAbBGvHw at mail dot gmail dot com> <51927FB1 dot 1070904 at suse dot com> <20130520081458 dot GB814 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ916H83byoyeNnSzMvd7nHqeUp=TqyMuQE0j8hjcAx7_tg at mail dot gmail dot com>
It also looks very confusing that we don't have the same unaligned
version for mempcpy and still use the ssse3 version.
It would be very easy to support mempcpy in the memcpy-sse2-unaligned.S file.
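
To illustrate why this should be easy: mempcpy copies exactly as memcpy does
and differs only in its return value, which points just past the last byte
written instead of at the start of the destination. A minimal C sketch of that
relationship follows. The name sketch_mempcpy is hypothetical and is not a
glibc symbol; the real change would presumably be made in the assembly file
itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a glibc symbol.  It copies N bytes like
   memcpy, then returns DST + N instead of DST.  */
static void *
sketch_mempcpy (void *dst, const void *src, size_t n)
{
  return (char *) memcpy (dst, src, n) + n;
}
```

Because the returned pointer is the end of the copied data, calls can be
chained to append several buffers without recomputing offsets.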
--
Liubov
On Thu, Aug 29, 2013 at 6:45 PM, Liubov Dmitrieva
<liubov.dmitrieva@gmail.com> wrote:
> It looks like there is some confusion in the merged patch. I think it is
> supposed to use a different flag (at least that looks more logical); you
> only need to turn it on for Bulldozer or whatever other AMD machines the
> version is also good for.
>
> diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
> index a1e5031..f6a44d2 100644
> --- a/sysdeps/x86_64/multiarch/memcpy.S
> +++ b/sysdeps/x86_64/multiarch/memcpy.S
> @@ -33,8 +33,8 @@ ENTRY(__new_memcpy)
>  	jne	1f
>  	call	__init_cpu_features
>  1:	leaq	__memcpy_sse2(%rip), %rax
> -	testl	$bit_Slow_BSF, __cpu_features+FEATURE_OFFSET+index_Slow_BSF(%rip)
> -	jnz	2f
> +	testl	$bit_Fast_Unaligned_Load, __cpu_features+FEATURE_OFFSET+index_Fast_Unaligned_Load(%rip)
> +	jz	2f
>  	leaq	__memcpy_sse2_unaligned(%rip), %rax
>  	ret
>  2:	testl	$bit_SSSE3, __cpu_features+CPUID_OFFSET+index_SSSE3(%rip)
>
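
The selection order proposed in the diff above can be modelled in C as a rough
sketch. The struct, its field names, and select_memcpy are hypothetical
stand-ins for illustration; the real resolver is the assembly shown in the
diff. What happens after label 2 is not shown there, so the sketch simply
assumes that SSSE3 leads to __memcpy_ssse3 and that __memcpy_sse2 remains the
default otherwise.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the feature bits tested in the diff.  */
struct cpu_flags
{
  bool fast_unaligned_load;	/* models bit_Fast_Unaligned_Load */
  bool ssse3;			/* models bit_SSSE3 */
};

/* Return the name of the implementation the proposed resolver would
   pick.  The SSSE3 branch is an assumption, since the code after
   label 2 is not part of the quoted hunk.  */
static const char *
select_memcpy (struct cpu_flags f)
{
  if (f.fast_unaligned_load)
    return "__memcpy_sse2_unaligned";
  if (f.ssse3)
    return "__memcpy_ssse3";	/* assumed continuation */
  return "__memcpy_sse2";	/* default loaded at label 1 */
}
```

With this ordering the unaligned version is chosen only on machines that
explicitly report fast unaligned loads, rather than on every machine that
merely lacks the Slow_BSF bit.
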
>
> You also forgot to remove the version that is now never selected as memcpy
> from the ifunc-impl-list:
>
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index 28d3579..d6a7f4f 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -224,8 +224,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>
>    /* Support sysdeps/x86_64/multiarch/memcpy.S.  */
>    IFUNC_IMPL (i, name, memcpy,
> -	      IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3,
> -			      __memcpy_ssse3_back)
>  	      IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3, __memcpy_ssse3)
>  	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2_unaligned)
>  	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2))
>
>
> --
> Liubov
>
> On Mon, May 20, 2013 at 12:14 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
>> Committed.