This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH 2/2] Improve 64bit memcpy/memmove for Corei7 with avx2 instruction

From: Ling Ma <ling dot ma dot program at gmail dot com>
To: Ondřej Bílka <neleai at seznam dot cz>
Cc: Nix <nix at esperi dot org dot uk>, libc-alpha at sourceware dot org, hongjiu dot lu at intel dot com
Date: Sat, 8 Jun 2013 00:12:56 +0800
Subject: Re: [PATCH 2/2] Improve 64bit memcpy/memmove for Corei7 with avx2 instruction
References: <1370424188-4259-1-git-send-email-ling dot ml at alibaba-inc dot com> <20130605121816 dot GA11269 at domone dot kolej dot mff dot cuni dot cz> <CAOGi=dMiD=_Qf1EJ=F3hfyQDtQubDEC5pjpXKDCHrUQwhr=vzg at mail dot gmail dot com> <20130605161954 dot GA26401 at domone dot kolej dot mff dot cuni dot cz> <CAOGi=dPWPaX5prcL-uAaqS6=_ehzKeBmAFMdwV6aU34jZ0eHtQ at mail dot gmail dot com> <20130606125511 dot GA28565 at domone dot kolej dot mff dot cuni dot cz> <CAOGi=dPs9geCtrWhU1L_0DEfOWOknpzFSLmYs4gbYzGX8Zn5Hg at mail dot gmail dot com> <20130607104613 dot GA6343 at domone dot kolej dot mff dot cuni dot cz> <8761xqru5w dot fsf at spindle dot srvr dot nix> <CAOGi=dMV5jaS2597cksd0mW84UDd06SovsBkL5=WPez-jZWg4g at mail dot gmail dot com> <20130607160749 dot GA28961 at domone dot kolej dot mff dot cuni dot cz>

> First, it does not randomize size in any way. This will cause branches to
> be predicted, and as branch prediction can account for 20% of the time, the
> results you get will be 20% off.
Ling: A widely held rule of thumb is that "a program spends 90% of its
execution time in only 10% of the code", which is why hardware
implements a branch prediction mechanism. With a stable pattern
history, benchmarks (SPEC 2000) get about 95% correct prediction on
average; fully random code would make the predictor useless.
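
To make the two positions concrete, here is a minimal sketch of a timing loop that can run with either one fixed size or pseudo-random sizes. It is not taken from the patch or from either benchmark in this thread: the names fill_sizes and time_copies, the BUF_SIZE bound, and the xorshift generator are all assumptions chosen for illustration, and the empty asm barrier assumes GCC or a compatible compiler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE 4096   /* hypothetical upper bound on the copy size */

/* Small xorshift PRNG so size generation is cheap and reproducible. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill sizes[] with one fixed size, or with pseudo-random sizes in
   [1, BUF_SIZE] when randomize is non-zero. */
static void fill_sizes(size_t *sizes, size_t n, int randomize,
                       size_t fixed, uint32_t seed)
{
    assert(seed != 0);                       /* xorshift never leaves zero */
    assert(fixed >= 1 && fixed <= BUF_SIZE);
    for (size_t i = 0; i < n; i++)
        sizes[i] = randomize ? 1 + next_random(&seed) % BUF_SIZE : fixed;
}

/* Time n memcpy calls using precomputed sizes; returns elapsed nanoseconds.
   Sizes are generated up front so the PRNG cost stays out of the timing. */
static double time_copies(const size_t *sizes, size_t n)
{
    static unsigned char src[BUF_SIZE], dst[BUF_SIZE];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < n; i++) {
        memcpy(dst, src, sizes[i]);
        /* Compiler barrier (GNU extension) so the copy is not elided. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
}
```

Calling fill_sizes once with randomize set to 0 and once with it set to 1, then timing both arrays with time_copies, would show how much of the measured time depends on the size-dispatch branches being predictable on a given machine. Whether the fixed or the randomized pattern is closer to real workloads is exactly the open question in this thread.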

> For example, when you ran
> ./memcpy-test-avx2-bench
> the cpu frequency could be 800MHz;
> then in
> ./memcpy-test-new-bench
> a governor can decide to switch to 2.5GHz, making the results above three
> times worse than they are.
Ling: I can confirm this is not an issue in my compare.html, but I
would like to send out a double-checked result.
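
One way to double-check the frequency concern is to read the active cpufreq governor before each run. The sketch below assumes a Linux system that exposes the cpufreq sysfs interface; the helper names read_governor and governor_is_performance are hypothetical and not part of any benchmark discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Read the scaling governor of one CPU from sysfs into out.
   Returns 0 on success, -1 if the file is missing or unreadable
   (for example when cpufreq is not available). */
static int read_governor(int cpu, char *out, size_t out_len)
{
    char path[128];
    FILE *f;

    assert(out_len > 0);
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor", cpu);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(out, (int)out_len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}

/* Non-zero when the governor is "performance", which requests the
   highest available frequency instead of scaling with load. */
static int governor_is_performance(const char *gov)
{
    return strcmp(gov, "performance") == 0;
}
```

A benchmark driver could call read_governor(0, buf, sizeof buf) before each binary and warn when governor_is_performance(buf) is false, since a governor such as ondemand may change the clock between the two runs being compared.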

Ondra, if we can test with a real benchmark, the results will better
approximate our real-world usage. If anyone knows good memcpy benchmarks
that represent real-world applications, could you please tell us?

Thanks & Best Regards
Ling

