This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH]: Optimization for strpbrk on PowerPC
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Thu, 27 Feb 2014 17:53:38 -0300
- Subject: Re: [PATCH]: Optimization for strpbrk on PowerPC
- Authentication-results: sourceware.org; auth=none
- References: <530F7D06 dot 4090309 at linux dot vnet dot ibm dot com> <20140227202420 dot GA20898 at domone dot podge>
On 27-02-2014 17:24, OndÅej BÃlka wrote:
> On Thu, Feb 27, 2014 at 11:29:34PM +0530, R Vidya wrote:
>> >From 2784371b29cf426603438fb5cac2b2a1f41940bd Mon Sep 17 00:00:00 2001
>> From: Vidya Ranganathan <vidya@linux.vnet.ibm.com>
>> Date: Thu, 27 Feb 2014 09:34:10 -0500
>> Subject: [PATCH] Optimization for strpbrk() on ppc64 and ppc64le. I have
>> attached the benchtest output to show the performance improvement.
>>
>> The optimization is achieved by following techniques:
>> 1.aligned memory access
>> 2.loop unrolling P7 gain
>> 3.CPU pre-fetch to avoid cache miss
>>
> These optimizations migth not be enough, a generic x64 uses standard
> trick of creating array from needle and then do lookup per charecter
> which should be faster when needle is say abcd...xyz. A bit optimized c
> implementation of that is below, I will send patch with that.
Qualifying as 'not be enough' is misleading, since the optimization is aiming
to be better than the one PowerPC currently uses (string/strpbrk.c). But your
suggestion in indeed valid and I will ask Vidya to take a look if it is worth
for PowerPC as well. But I'd like to address it in a future thread, since
this optimization is not to change the default algorithm, but rather to optimize
it for POWER.
>
> A main problem of optimizing strbrk is that you often spend more time in
> preprocessing than in actual matches. For small constant needles a separate
> implementation gets called.
>
> I have in my todo list to cache preprocessed needle information, it
> would allow match needles that consist from up to three character ranges
> quite quickly but is not feasible now due of long startup time.
>
> #include <stdlib.h>
> #include <stdint.h>
> #include <string.h>
> #undef strpbrk
>
> char *
> strpbrk (const char *_s, const char *n)
> {
> unsigned char *s = (unsigned char *) _s;
> size_t i;
> unsigned char chars[256];
> memset (chars, 0, 256);
> for (i = 0; n[i]; i++)
> chars[(unsigned char) n[i]] = 1;
> chars[0] = 1;
> i = 0;
> while (1)
> {
> if (chars[s[i++]] == 1)
> return s[i - 1] ? s + i - 1 : NULL;
>
> if (chars[s[i++]] == 1)
> return s[i - 1] ? s + i - 1 : NULL;
>
> if (chars[s[i++]] == 1)
> return s[i - 1] ? s + i - 1 : NULL;
>
> if (chars[s[i++]] == 1)
> return s[i - 1] ? s + i - 1 : NULL;
> }
> }
>