This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/2] Improve 64bit memcpy/memmove for Corei7 with avx2 instruction


> First, it does not randomize sizes in any way. This lets branches be
> predicted, and since branch prediction can account for 20% of the time,
> the results you get will be 20% off.
Ling: A widely held rule of thumb is that "a program spends 90% of
its execution time in only 10% of the code", so hardware implements
branch prediction. With a stable pattern history, it gives benchmarks
such as SPEC 2000 about 95% correct prediction on average. Fully
random code would make the predictor useless.
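
To make the two positions concrete, here is a minimal sketch of a timing
loop whose copy sizes come from a table. It is not the benchmark from the
patch; the helper names (xorshift32, fill_random_sizes, time_copies) are
hypothetical, and the empty asm statement is a GCC/Clang-specific compiler
barrier. Filling the table with one constant gives the stable, predictable
pattern; filling it from the PRNG gives the randomized case Ondra describes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Small xorshift PRNG so the size sequence is reproducible across runs.
   The state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill sizes[0..n) with pseudo-random values in [1, max_size]. */
static void fill_random_sizes(size_t *sizes, size_t n, size_t max_size,
                              uint32_t seed)
{
    assert(max_size > 0 && seed != 0);
    uint32_t state = seed;
    for (size_t i = 0; i < n; i++)
        sizes[i] = 1 + xorshift32(&state) % max_size;
}

/* Time n memcpy calls, one per entry in sizes; returns elapsed nanoseconds.
   dst and src must each hold at least the largest size in the table. */
static double time_copies(char *dst, const char *src, const size_t *sizes,
                          size_t n)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < n; i++) {
        memcpy(dst, src, sizes[i]);
        /* Compiler barrier (GCC/Clang) so the copy is not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
}
```

Running time_copies once on a constant table and once on a table from
fill_random_sizes, with the same total byte count, would show how much of
the measured time depends on predictable size-dispatch branches.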

> For example, when you ran
> ./memcpy-test-avx2-bench
> the cpu frequency could be 800MHz
> then in
> ./memcpy-test-new-bench
> a governor can decide to switch to 2.5GHz, making the results above look
> three times worse than they are.
Ling: I can confirm this is not an issue in my compare.html, but I would
like to send out a double-checked result.
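
One common way to reduce the frequency-scaling effect, sketched here as an
assumption and not as what compare.html does, is to run both
implementations in the same process, warm up first, alternate between them
every round, and keep the fastest round of each. The names copy_fn,
time_once and compare_interleaved are hypothetical, and the empty asm
statement is a GCC/Clang-specific compiler barrier.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <float.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Signature shared by memcpy, memmove and any candidate implementation. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Time iters calls of fn copying len bytes; returns elapsed nanoseconds. */
static double time_once(copy_fn fn, char *dst, const char *src, size_t len,
                        int iters)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        fn(dst, src, len);
        /* Compiler barrier (GCC/Clang) so the copy is not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
}

/* Alternate between a and b every round and keep the fastest round of each,
   so a clock change during the run affects both candidates alike. */
static void compare_interleaved(copy_fn a, copy_fn b, char *dst,
                                const char *src, size_t len, int iters,
                                int rounds, double *best_a, double *best_b)
{
    assert(iters > 0 && rounds > 0);

    /* Warm-up: give the governor a chance to raise the clock first. */
    time_once(a, dst, src, len, iters);
    time_once(b, dst, src, len, iters);

    *best_a = DBL_MAX;
    *best_b = DBL_MAX;
    for (int r = 0; r < rounds; r++) {
        double ta = time_once(a, dst, src, len, iters);
        double tb = time_once(b, dst, src, len, iters);
        if (ta < *best_a)
            *best_a = ta;
        if (tb < *best_b)
            *best_b = tb;
    }
}
```

Pinning the frequency through the system's cpufreq settings is the more
direct fix where it is available; interleaving only makes the comparison
less sensitive to when the governor switches.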

Ondra, if we can test with a real benchmark, that will better approximate
our real-world usage. If anyone knows of good memcpy benchmarks that
represent real-world applications, could you please tell us?

Thanks & Best Regards
Ling

