This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 2/*] Optimize generic strchrnul and strchr
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: Wilco Dijkstra <wdijkstr at arm dot com>, libc-alpha at sourceware dot org
- Date: Thu, 28 May 2015 19:54:12 +0200
- Subject: Re: [PATCH 2/*] Optimize generic strchrnul and strchr
- Authentication-results: sourceware.org; auth=none
- References: <000d01d09879$ae9c2d80$0bd48880$ at com> <alpine dot DEB dot 2 dot 10 dot 1505281733001 dot 16930 at digraph dot polyomino dot org dot uk>
On Thu, May 28, 2015 at 05:36:04PM +0000, Joseph Myers wrote:
> On Wed, 27 May 2015, Wilco Dijkstra wrote:
>
> > Finally first_nonzero_byte should just use __builtin_ffsl (yet another
> > function that should be inlined by default in the generic string.h...).
>
> Will GCC always inline __builtin_ffsl (or call a libgcc function) rather
> than generating a call to ffsl (user namespace) on some architectures? If
> it can ever call ffsl you need to do something similar to how we handle
> __mempcpy calling __builtin_mempcpy (include/string.h redeclares mempcpy
> with __asm__ ("__mempcpy"), so that libc-internal calls to __mempcpy
> really do call that function at the assembler level if not inlined, rather
> than calling mempcpy and having namespace issues).
>
However, GCC doesn't do that very well; I reported a bug about it somewhere.
So in the end you need to make an assembly dump and fix GCC's mistakes.
It doesn't eliminate the unreachable zero check even when zero has already
been ruled out, as in:
int bar (void);

int
foo (int x)
{
  if (!x)
    return bar ();
  return __builtin_ffsl (x);
}
You get the following assembly:
.cfi_startproc
testl %edi, %edi
je .L4
movslq %edi, %rax
movq $-1, %rdx
bsfq %rax, %rax
cmove %rdx, %rax
addq $1, %rax
ret
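
For reference, here is a rough C sketch of what that instruction sequence
computes, next to a variant that relies on the caller having already
excluded zero. The names ffsl_as_expanded and ffsl_nonzero are made up for
illustration and are not glibc or GCC functions. The second one assumes
__builtin_ctzl, which is undefined for a zero argument, so it is only valid
after a check like the one in foo above.

```c
/* Mirrors the emitted sequence: bsfq finds the lowest set bit, cmove
   substitutes -1 when the input was zero, and addq $1 converts the
   0-based bit index to the 1-based result of ffsl.  */
static inline int
ffsl_as_expanded (long x)
{
  long idx = x != 0 ? __builtin_ctzl ((unsigned long) x) : -1;
  return (int) (idx + 1);
}

/* Hypothetical variant for callers that already know x != 0.
   No zero handling is needed, so no cmove equivalent appears here.
   Calling this with x == 0 is undefined.  */
static inline int
ffsl_nonzero (long x)
{
  return __builtin_ctzl ((unsigned long) x) + 1;
}
```

Whether a given GCC version produces better code for the second form is
something to confirm from the assembly dump rather than assume.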