This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86-64: Optimize strrchr/wcsrchr with AVX2
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Thu, 1 Jun 2017 15:42:55 -0300
- Subject: Re: [PATCH] x86-64: Optimize strrchr/wcsrchr with AVX2
- References: <20170601181401.GD28627@lucon.org>
On 01/06/2017 15:14, H.J. Lu wrote:
> Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
> instructions. It is as fast as SSE2 version for small data sizes
> and up to 1X faster for large data sizes on Haswell. Select AVX2
> version on AVX2 machines where vzeroupper is preferred and AVX
> unaligned load is fast.
>
> Any comments?
>
> H.J.
> --
> * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
> strrchr-avx2 and wcsrchr-avx2.
> * sysdeps/x86_64/multiarch/ifunc-impl-list.c
> (__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
> __strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
> * sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
> * sysdeps/x86_64/multiarch/strrchr.S: Likewise.
I think this could be an opportunity to avoid adding more ifunc
selectors written in assembly and to use a C implementation instead.
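
To illustrate what a C selector could look like, here is a rough,
standalone sketch. It is not the glibc-internal mechanism: it assumes
GCC (or a compatible compiler) on an ELF/GNU target and uses the
compiler's ifunc attribute plus __builtin_cpu_supports. All my_* names
are hypothetical, and the "AVX2" variant is only a placeholder that
forwards to the generic loop.

```c
#include <stddef.h>

/* Hypothetical baseline variant, standing in for an SSE2 version. */
static char *my_strrchr_generic(const char *s, int c)
{
    const char *last = NULL;
    do {
        if (*s == (char)c)
            last = s;           /* remember the most recent match */
    } while (*s++ != '\0');     /* the terminator is compared as well */
    return (char *)last;
}

/* Hypothetical placeholder for an AVX2 variant. A real one would scan
   32 bytes per iteration; here it only forwards to the generic loop. */
static char *my_strrchr_avx2(const char *s, int c)
{
    return my_strrchr_generic(s, c);
}

typedef char *(*my_strrchr_fn)(const char *, int);

/* Selector written in C. It runs once, when the dynamic loader
   resolves the my_strrchr symbol. */
static my_strrchr_fn my_strrchr_resolver(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* GCC requires this call before __builtin_cpu_supports when the
       check is made inside an ifunc resolver. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return my_strrchr_avx2;
#endif
    return my_strrchr_generic;
}

/* Public symbol, bound to whichever variant the resolver returns. */
char *my_strrchr(const char *s, int c)
    __attribute__((ifunc("my_strrchr_resolver")));
```

Callers just use my_strrchr (s, c) as usual. A selector inside glibc
would presumably also need to check the other conditions the patch
mentions (vzeroupper preferred, fast AVX unaligned load), which this
sketch leaves out.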