Re: [PATCH v3] faster strlen on x64

Looks good to me.
I don't see format issues for this version.

Do you have strnlen performance data, since your patch also affects strnlen?

Could you please provide a short performance summary, e.g. the average gain
in % on AMD, Atom, SNB, IVB, and Haswell?

Liubov Dmitrieva
Software Engineer
Intel Corporation

2013/1/31 Ondřej Bílka <>:
> Hi,
> After testing by Liuba I prepared the final version of my patch
> (attached and on the neleai/strlen branch).
> I used hooking to examine the behaviour of implementations in the wild; it can be
> downloaded on
> (Run ./benchmarks for unit tests, read TODO as it is not complete.)
> No additional failures on x64.
> Uses of strlen_* in strcat are inlined for now; optimizations will come
> after I deal with strcpy.
> It could also be used in the linker; I split this functionality into
> an additional patch.
> Ondra
> 2013-01-31  Ondrej Bilka  <>
>         * sysdeps/x86_64/strlen.S: Replace with new SSE2 based
>         implementation which is faster on all x86_64 architectures.
>         Tested on AMD, Intel Nehalem, Atom, SNB, IVB, Haswell.
>         * sysdeps/x86_64/strnlen.S: Likewise.
>         * sysdeps/x86_64/multiarch/Makefile (sysdep_routines):
>         Remove all multiarch strlen and strnlen versions.
>         * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Update.
>         Remove strlen and strnlen related parts.
>         * sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S: Update.
>         Inline strlen part.
>         * sysdeps/x86_64/multiarch/strcat-ssse3.S: Likewise.
>         * sysdeps/x86_64/multiarch/strlen.S: Remove.
>         * sysdeps/x86_64/multiarch/strlen-sse2-no-bsf.S: Remove.
>         * sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: Remove.
>         * sysdeps/x86_64/multiarch/rtld-strlen.S: Remove.
>         * sysdeps/x86_64/multiarch/strlen-sse4.S: Remove.
>         * sysdeps/x86_64/multiarch/strnlen.S: Remove.
>         * sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S: Remove.
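
For readers unfamiliar with the technique named in the ChangeLog, here is a
minimal C sketch of how an SSE2 based strlen can work. It is not the code from
the patch, which is written in assembly; the name strlen_sse2_sketch is
hypothetical, and the sketch assumes GCC or Clang on x86_64 (for
__builtin_ctz). It compares 16 bytes at a time against zero and uses the
resulting bit mask to locate the terminator.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; not the implementation from the patch. */
static size_t strlen_sse2_sketch(const char *s)
{
    const __m128i zero = _mm_setzero_si128();

    /* Round s down to a 16-byte boundary. An aligned 16-byte load never
       crosses a page, so reading the few bytes before s cannot fault.
       (This is outside strict ISO C, as is usual for such routines.) */
    const char *p = (const char *)((uintptr_t)s & ~(uintptr_t)15);
    unsigned offset = (unsigned)(s - p);

    /* One bit per byte: bit i is set when byte i of the chunk is zero. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));

    /* Discard the bits that belong to bytes located before s. */
    mask >>= offset;
    if (mask != 0)
        return (size_t)__builtin_ctz(mask);

    /* Main loop: aligned 16-byte chunks until a zero byte shows up. */
    for (;;) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz(mask);
    }
}
```

A production version would typically unroll the main loop and handle the
first bytes more carefully; the sketch only shows the compare, movemask, and
bit scan steps.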

