This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] x86-64: Optimize memrchr with AVX2

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: GNU C Library <libc-alpha at sourceware dot org>
Date: Fri, 9 Jun 2017 04:34:36 -0700
Subject: Re: [PATCH] x86-64: Optimize memrchr with AVX2
Authentication-results: sourceware.org; auth=none
References: <20170601181331.GC28627@lucon.org> <CAMe9rOoioz=Yxb4Hr-HBpGwgYEMkLqpCsXJ1Tbj3dC7vf1MN=A@mail.gmail.com>

On Fri, Jun 2, 2017 at 12:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jun 1, 2017 at 11:13 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> Optimize memrchr with AVX2 to search 32 bytes with a single vector
>> compare instruction.  It is as fast as SSE2 memrchr for small data
>> sizes and up to 1X faster for large data sizes on Haswell.  Select
>> AVX2 memrchr on AVX2 machines where vzeroupper is preferred and AVX
>> unaligned load is fast.
>>
>> Any comments?
>>
>> H.J.
>> --
>>         * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
>>         memrchr-avx2.
>>         * sysdeps/x86_64/multiarch/ifunc-impl-list.c
>>         (__libc_ifunc_impl_list): Add tests for __memrchr_avx2 and
>>         __memrchr_sse2.
>>         * sysdeps/x86_64/multiarch/memrchr-avx2.S: New file.
>>         * sysdeps/x86_64/multiarch/memrchr.S: Likewise.
>
> Updated patch with IFUNC selector in C.
>

I will check it in.


-- 
H.J.
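
Editorial sketch (not taken from the patch): the quoted description says
the AVX2 version searches 32 bytes with a single vector compare and is
selected only on suitable machines. The C code below illustrates that idea
with compiler intrinsics. It is a minimal sketch under stated assumptions:
the names memrchr_scalar, memrchr_avx2_sketch and memrchr_dispatch are
hypothetical, the run-time check uses the GCC/Clang builtin
__builtin_cpu_supports rather than glibc's IFUNC machinery, and it does not
model the vzeroupper or fast-unaligned-load conditions mentioned above. The
actual patch is hand-written assembly in memrchr-avx2.S and is not
reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_AVX2_SKETCH 1
#endif

/* Byte-at-a-time fallback (hypothetical name, not a glibc symbol). */
static void *memrchr_scalar(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    while (n--) {
        if (*--p == (unsigned char)c)
            return (void *)p;
    }
    return NULL;
}

#ifdef HAVE_AVX2_SKETCH
/* Search backwards, 32 bytes per vector compare (hypothetical name).
   The target attribute lets this compile without -mavx2. */
__attribute__((target("avx2")))
static void *memrchr_avx2_sketch(const void *s, int c, size_t n)
{
    const unsigned char *base = (const unsigned char *)s;
    const __m256i needle = _mm256_set1_epi8((char)c);

    while (n >= 32) {
        /* Unaligned load of the last 32 bytes not yet searched. */
        __m256i chunk =
            _mm256_loadu_si256((const __m256i *)(base + n - 32));
        /* One compare tests all 32 bytes; movemask yields one bit
           per byte, bit i set when byte i matches. */
        uint32_t mask =
            (uint32_t)_mm256_movemask_epi8(_mm256_cmpeq_epi8(chunk, needle));
        if (mask != 0) {
            /* The highest set bit is the last match in this chunk. */
            unsigned int last = 31u - (unsigned int)__builtin_clz(mask);
            return (void *)(base + n - 32 + last);
        }
        n -= 32;
    }
    /* Fewer than 32 bytes remain at the start of the buffer. */
    return memrchr_scalar(base, c, n);
}
#endif

/* Run-time selection (assumption: a plain CPU feature check stands in
   for the IFUNC selector described in the thread). */
static void *memrchr_dispatch(const void *s, int c, size_t n)
{
#ifdef HAVE_AVX2_SKETCH
    if (__builtin_cpu_supports("avx2"))
        return memrchr_avx2_sketch(s, c, n);
#endif
    return memrchr_scalar(s, c, n);
}
```

Calling memrchr_dispatch(buf, 'x', len) returns a pointer to the last 'x'
in the first len bytes of buf, or NULL if there is none. The sketch handles
the short tail with the scalar loop for simplicity; an optimized
implementation would normally treat small sizes and alignment separately.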

References:
- [PATCH] x86-64: Optimize memrchr with AVX2
  - From: H.J. Lu
- Re: [PATCH] x86-64: Optimize memrchr with AVX2
  - From: H.J. Lu
