This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
RE: [PATCH 4/* v2] Optimize strchrnul more
- From: "Wilco Dijkstra" <wdijkstr at arm dot com>
- To: 'OndÅej BÃlka' <neleai at seznam dot cz>, <libc-alpha at sourceware dot org>
- Date: Tue, 26 May 2015 13:23:36 +0100
- Subject: RE: [PATCH 4/* v2] Optimize strchrnul more
- Authentication-results: sourceware.org; auth=none
- References: <20150524150715 dot GA31589 at domone> <20150524163214 dot GA28053 at domone> <20150524171036 dot GA20947 at domone>
> OndÅej BÃlka wrote:
> On Sun, May 24, 2015 at 06:32:14PM +0200, OndÅej BÃlka wrote:
> > Hi,
> > this is nontrivial optimization of string inlines.
> > First it decreases icache pressure as you don't need strchr.
It's not obvious to me that is the right thing to do. Generally it is best
to use the standard C90/C99 functions rather than infrequently used
non-standard ones. A quick grep of GLIBC shows strchr is used a lot more
than strchrnul, and 9 targets have an optimized strchr vs 5 for strchrnul.
Using strchrnul also means extra work compared to calling strchr at the
callsite (the *__r == c) as well as in strchrnul if the character is not
found (returning NULL is easier than computing the pointer to the
terminating nul character).
(note for string/strchr.c it seems reasonable to defer to strchrnul)
> Just realized that optimization there is silly way to find terminating
> zero. On x64 rawmemchr is around 50% slower than strlen so add rawmemchr
> special case that does just that.
Yes, given GCC doesn't optimize strchr (s, 0) into s + strlen (s), I agree
we should add this to the GLIBC headers.
> * string/bits/string2.h (strchrnul, rawmemchr): Add inline
> (strchr): Optimize.
>
>
> diff --git a/string/bits/string2.h b/string/bits/string2.h
> index 2fe67b3..8f1eb04 100644
> --- a/string/bits/string2.h
> +++ b/string/bits/string2.h
> @@ -108,18 +108,39 @@ __STRING2_COPY_TYPE (8);
> #endif
>
>
> -/* Return pointer to C in S. */
> -#ifndef _HAVE_STRING_ARCH_strchr
> +#ifndef _HAVE_STRING_ARCH_rawmemchr
> extern void *__rawmemchr (const void *__s, int __c);
> # if __GNUC_PREREQ (3, 2)
> -# define strchr(s, c) \
> +# define __rawmemchr(s, c) \
> + (__extension__ (__builtin_constant_p (c) && !__builtin_constant_p (s) \
> + && (c) == '\0' \
> + ? s + strlen (s) \
> + : __rawmemchr (s, c)))
> +# endif
> +#endif
> +
> +
> +
> +#ifndef _HAVE_STRING_ARCH_strchrnul
> +# if __GNUC_PREREQ (3, 2)
> +# define strchrnul(s, c) \
> (__extension__ (__builtin_constant_p (c) && !__builtin_constant_p (s) \
> && (c) == '\0' \
> ? (char *) __rawmemchr (s, c) \
Should use s + strlen (s) here too rather than rely on above define.
> - : __builtin_strchr (s, c)))
> + : strchrnul (s, c)))
> # endif
> #endif
>
> +/* Return pointer to C in S. */
> +#ifndef _HAVE_STRING_ARCH_strchr
> +# if __GNUC_PREREQ (3, 2)
> +# define strchr(s, c) \
> + (__extension__ ({ char *__r = strchrnul (s, c); \
> + *__r == c ? __r : NULL; }))
> +# endif
> +#endif
I think this should do the strlen optimization and only change into strchrnul
on targets that do have an optimized strchrnul implementation but not strchr.
Wilco