This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
RE: [PATCH][AArch64] Add optimized memchr
- From: "Wilco Dijkstra" <wdijkstr at arm dot com>
- To: 'Ondřej Bílka' <neleai at seznam dot cz>
- Cc: "'GNU C Library'" <libc-alpha at sourceware dot org>
- Date: Mon, 28 Sep 2015 10:23:19 +0100
- Subject: RE: [PATCH][AArch64] Add optimized memchr
- Authentication-results: sourceware.org; auth=none
- References: <002d01d0f795$0ce77eb0$26b67c10$ at com> <20150926084544 dot GA31280 at domone>
> Ondřej Bílka wrote:
> On Fri, Sep 25, 2015 at 02:21:13PM +0100, Wilco Dijkstra wrote:
> > An optimized memchr was missing for AArch64. This version is similar to strchr and is
> > significantly faster than the C version. Passes GLIBC tests.
> >
> > OK for commit?
> >
> > ChangeLog:
> > 2015-09-25 Wilco Dijkstra <wdijkstr@arm.com>
> > 2015-09-25 Kevin Petit <kevin.petit@arm.com>
> >
> > * sysdeps/aarch64/memchr.S (__memchr): New file.
>
> How did you test performance? I think that loading the first 32
> bytes unaligned should be better here too. Could you use dryrun to verify?
>
> The same optimization could also be used for memrchr.
I haven't tuned this at all; it is an existing implementation that
was added to Newlib last year but had not yet been ported to GLIBC.
For maximum performance, handling the first 16/32 bytes unaligned will
likely be fastest, just as it was for strlen.
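As a rough sketch of the "first chunk unaligned" idea in portable C (the actual patch is AArch64 assembly using vector loads; the function name and the scalar loops here are purely illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: scan the first 16 bytes with an unaligned pass, then
   align the pointer and scan the rest.  A real implementation would
   replace the byte loops with aligned SIMD loads; re-checking a few
   bytes after alignment is harmless for correctness. */
static void *memchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    /* Unaligned head: handles short inputs without any alignment setup. */
    size_t head = n < 16 ? n : 16;
    for (size_t i = 0; i < head; i++)
        if (p[i] == ch)
            return (void *)(p + i);
    if (n <= 16)
        return NULL;

    /* Round p + 16 down to a 16-byte boundary; the resulting offset is
       at most 16, so every byte before it was already checked above. */
    const unsigned char *aligned =
        (const unsigned char *)((uintptr_t)(p + 16) & ~(uintptr_t)15);
    for (size_t i = (size_t)(aligned - p); i < n; i++)
        if (p[i] == ch)
            return (void *)(p + i);
    return NULL;
}
```

The point of the unaligned head is that short strings (the common case for memchr) never pay the cost of computing and branching on alignment, which is the same trade-off the strlen tuning exploited.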
Wilco