This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v3] faster strlen on x64
- From: Dmitrieva Liubov <liubov dot dmitrieva at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: libc-alpha at sourceware dot org
- Date: Thu, 31 Jan 2013 15:40:44 +0400
- Subject: Re: [PATCH v3] faster strlen on x64
- References: <20130131095215.GA31998@domone.kolej.mff.cuni.cz>
Looks good to me.
I don't see format issues for this version.
Do you have strnlen performance data, since your patch also affects strnlen?
Could you please extract a short performance summary, such as the average
gain in % for AMD, Atom, SNB, IVB, and Haswell?
--
Liubov Dmitrieva
Software Engineer
Intel Corporation
2013/1/31 Ondřej Bílka <neleai@seznam.cz>:
> Hi,
>
> After testing by Liuba, I prepared the final version of my patch
> (attached and on the neleai/strlen branch).
>
> I used hooking to examine the behaviour of implementations in the wild;
> it can be downloaded from http://kam.mff.cuni.cz/~ondra/strlen_profile.tar.bz2
> (Run ./benchmarks for unit tests; read TODO, as it is not complete.)
>
> No additional failures on x64.
>
> Uses of strlen_* in strcat are inlined for now; optimizations will come
> after I deal with strcpy.
>
> It could also be used in the linker; I split this functionality into
> an additional patch.
>
> Ondra
>
> 2013-01-31 Ondrej Bilka <neleai@seznam.cz>
>
> * sysdeps/x86_64/strlen.S: Replace with new SSE2 based
> implementation which is faster on all x86_64 architectures.
> Tested on AMD, Intel Nehalem, Atom, SNB, IVB, Haswell.
> * sysdeps/x86_64/strnlen.S: Likewise.
>
> * sysdeps/x86_64/multiarch/Makefile (sysdep_routines):
> Remove all multiarch strlen and strnlen versions.
> * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Update.
> Remove strlen and strnlen related parts.
>
> * sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S: Update.
> Inline strlen part.
> * sysdeps/x86_64/multiarch/strcat-ssse3.S: Likewise.
>
> * sysdeps/x86_64/multiarch/strlen.S: Remove.
> * sysdeps/x86_64/multiarch/strlen-sse2-no-bsf.S: Remove.
> * sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: Remove.
> * sysdeps/x86_64/multiarch/rtld-strlen.S: Remove.
> * sysdeps/x86_64/multiarch/strlen-sse4.S: Remove.
> * sysdeps/x86_64/multiarch/strnlen.S: Remove.
> * sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S: Remove.
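
For readers unfamiliar with the technique the ChangeLog names, here is a minimal
sketch in C of an SSE2-style strlen. It is not the code from the attached patch,
which is assembly and is not reproduced in this message. The function name
sketch_strlen_sse2 is hypothetical, __builtin_ctz assumes GCC or Clang, and the
sketch falls back to the library strlen when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical illustration; not the glibc implementation. */
static size_t sketch_strlen_sse2(const char *s)
{
    assert(s != NULL);
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();

    /* Round down to a 16-byte boundary. An aligned 16-byte load never
       crosses a page, so it cannot fault even though it may read a few
       bytes before s. (This is outside strict ISO C, as is usual for
       this trick.) */
    const char *p = (const char *)((uintptr_t)s & ~(uintptr_t)15);
    unsigned misalign = (unsigned)((uintptr_t)s & 15);

    /* Compare 16 bytes against zero; each matching byte sets one mask bit. */
    unsigned mask = (unsigned)_mm_movemask_epi8(
        _mm_cmpeq_epi8(_mm_load_si128((const __m128i *)p), zero));

    /* Discard matches that lie before the start of the string. */
    mask >>= misalign;
    if (mask != 0)
        return (size_t)__builtin_ctz(mask);

    /* Main loop: one aligned 16-byte block per iteration. */
    for (;;) {
        p += 16;
        mask = (unsigned)_mm_movemask_epi8(
            _mm_cmpeq_epi8(_mm_load_si128((const __m128i *)p), zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz(mask);
    }
#else
    /* Portable fallback when SSE2 intrinsics are unavailable. */
    return strlen(s);
#endif
}
```

A production version would typically unroll the main loop and handle the
strnlen length bound as well; neither is attempted here.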