This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Andreas Jaeger <aj at suse dot com>
- Cc: ling dot ma dot program at gmail dot com, libc-alpha at sourceware dot org, liubov dot dmitrieva at gmail dot com, Ma Ling <ling dot ml at alibaba-inc dot com>
- Date: Mon, 29 Jul 2013 12:05:19 +0200
- Subject: Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
- References: <1375090855-8312-1-git-send-email-ling dot ma dot program at gmail dot com> <51F63B77 dot 7020003 at suse dot com>
On Mon, Jul 29, 2013 at 11:52:55AM +0200, Andreas Jaeger wrote:
> On 07/29/2013 11:40 AM, ling.ma.program@gmail.com wrote:
> > From: Ma Ling <ling.ml@alibaba-inc.com>
> >
> > We manage to avoid branch instructions and force the destination to be aligned
> > with AVX instructions, then modified gcc.403 so that we measure only the memcpy function.
> > The gcc.403 benchmarks indicate this version improves performance by 4% to 14%
> > compared with memcpy_sse2_unaligned on a Haswell machine.
> >
> > case     avx_unaligned  sse2_unaligned  AVX vs SSE2
> > 200i     146833745      168384142       1.146767332
> > g23      1431207341     1557405243      1.088175835
> > 166i     350901531      379068674       1.08027079
> > cp-decl  370750774      395890196       1.067806796
> > c-type   763780824      810806468       1.061569553
> > expr2    986698539      1067232192      1.081619309
> > expr     727016829      758953883       1.043928906
> > s04      1117900758     1185159528      1.060165242
> > scilab   63309111       66893431        1.05661618
> > (We will send test patch on memcpy for above cases)
>
> Is memcpy_sse2_unaligned really the right function to compare with?
> Isn't __memcpy_ssse3 used on Haswell today?
>
It should be the correct one unless the ifunc selection is wrong. Does
Haswell have the bit_Slow_BSF bit set?
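
To make the question concrete, here is a rough sketch of the kind of
selection I mean. It is not the glibc selector: struct cpu_flags, the
copy_* placeholders and select_memcpy are hypothetical names, the real
code is an assembly ifunc resolver that reads glibc's internal CPU
feature data, and the fallback order after the Slow_BSF check is an
assumption. The only point is that the unaligned SSE2 variant is picked
when the Slow_BSF bit is clear.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for glibc's internal CPU feature data. */
struct cpu_flags {
    int slow_bsf;   /* models the bit_Slow_BSF feature bit */
    int has_ssse3;  /* models CPUID reporting SSSE3 support */
};

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Placeholder variants: each simply forwards to memcpy so the sketch
   compiles anywhere; the real variants are hand-written assembly. */
static void *copy_sse2(void *d, const void *s, size_t n)
{
    return memcpy(d, s, n);
}

static void *copy_sse2_unaligned(void *d, const void *s, size_t n)
{
    return memcpy(d, s, n);
}

static void *copy_ssse3(void *d, const void *s, size_t n)
{
    return memcpy(d, s, n);
}

/* Resolver sketch (assumed ordering): prefer the unaligned SSE2
   variant unless Slow_BSF is set, otherwise fall back to SSSE3 when
   available, and to plain SSE2 as the last resort. */
static memcpy_fn select_memcpy(const struct cpu_flags *f)
{
    if (!f->slow_bsf)
        return copy_sse2_unaligned;
    if (f->has_ssse3)
        return copy_ssse3;
    return copy_sse2;
}
```

Under this sketch, a CPU with Slow_BSF clear gets copy_sse2_unaligned
even when it supports SSSE3, which is why the state of that bit on
Haswell decides which baseline is the right one to compare against.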
> Andreas
> --
> Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
> GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
--
ether leak