On Thu, Mar 07, 2013 at 04:09:27PM +0100, Andreas Jaeger wrote:
On 03/07/2013 09:38 AM, Dmitrieva Liubov wrote:
Hello.
I've reproduced the performance measurements using Ondrej's benchmarks on
the IA CPUs that are most important to us and can confirm that this patch
is fine.
It seems that this conversation took place in private messages and
was never made public.
I'm sorry for that.
Thanks for confirming that the patches are a real benefit - now
let's get them into good (readable) shape...
Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
Here is a new version, both as a standalone file and as a diff from the
previous one.
I made one functional change: I replaced the 64-bit bsf with a 32-bit one,
which is definitely faster.
The rest of the changes are added comments.
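To illustrate why a 32-bit bit scan can be enough, here is a minimal C
sketch of the general SSE2 technique (compare 16 bytes against zero, turn
the result into a bit mask, scan for the lowest set bit). It is not the
code from the patch: the name sketch_strlen is hypothetical, the loop
handles only one 16-byte block per iteration, and __builtin_ctz is a
GCC/Clang builtin assumed here as the C-level counterpart of a 32-bit bsf.
The mask from one 16-byte compare has at most 16 significant bits, so it
fits in a 32-bit register.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the glibc implementation.
   Returns a 16-bit mask with bit i set when byte i of the 16-byte
   aligned block at p is zero.  */
static unsigned
zero_mask (const char *p)
{
  __m128i block = _mm_load_si128 ((const __m128i *) p);
  __m128i eq = _mm_cmpeq_epi8 (block, _mm_setzero_si128 ());
  return (unsigned) _mm_movemask_epi8 (eq);
}

size_t
sketch_strlen (const char *s)
{
  /* Round down to a 16-byte boundary.  An aligned 16-byte load never
     crosses a page, which is why assembly versions read this way; in
     strict ISO C reading bytes before s is not permitted, so treat
     this as an illustration only.  */
  const char *p = (const char *) ((uintptr_t) s & ~(uintptr_t) 15);
  unsigned skip = (unsigned) (s - p);

  /* Drop the mask bits that belong to bytes before the string.  */
  unsigned mask = zero_mask (p) >> skip;
  if (mask != 0)
    return (size_t) __builtin_ctz (mask);   /* 32-bit bit scan */

  for (;;)
    {
      p += 16;
      mask = zero_mask (p);
      if (mask != 0)
        return (size_t) (p - s) + (size_t) __builtin_ctz (mask);
    }
}
```

Calling sketch_strlen on a NUL-terminated string should give the same
result as strlen, whatever the alignment of the pointer passed in.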
* sysdeps/x86_64/strlen.S: Replace with new SSE2 based
implementation which is faster on all x86_64 architectures.
Tested on AMD, Intel Nehalem, SNB, IVB.
* sysdeps/x86_64/strnlen.S: Likewise.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines):
Remove all multiarch strlen and strnlen versions.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Update.
Remove strlen and strnlen related parts.
* sysdeps/x86_64/multiarch/strcat-ssse3.S: Likewise.
* sysdeps/x86_64/strcat.S: Add comment.
* sysdeps/x86_64/multiarch/strlen.S: Remove file.
* sysdeps/x86_64/multiarch/strlen-sse2-no-bsf.S: Likewise.
* sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: Likewise.
* sysdeps/x86_64/multiarch/strlen-sse4.S: Likewise.
* sysdeps/x86_64/multiarch/strnlen.S: Likewise.
* sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S: Likewise.