This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: strstr optimization
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: munroesj at linux dot vnet dot ibm dot com
- Cc: Rajalakshmi Srinivasaraghavan <raji at linux dot vnet dot ibm dot com>, libc-alpha at sourceware dot org, sjmunroe at us dot ibm dot com
- Date: Tue, 12 May 2015 11:08:56 +0200
- Subject: Re: [PATCH] powerpc: strstr optimization
- Authentication-results: sourceware.org; auth=none
- References: <1429254584-124997-1-git-send-email-raji at linux dot vnet dot ibm dot com> <20150501120510 dot GB2203 at domone> <554704A1 dot 3010007 at linux dot vnet dot ibm dot com> <20150505211414 dot GA18085 at domone> <554A1703 dot 5090707 at linux dot vnet dot ibm dot com> <1431367848 dot 14012 dot 3 dot camel at sjmunroe-ThinkPad-W500>
On Mon, May 11, 2015 at 01:10:48PM -0500, Steven Munroe wrote:
> On Wed, 2015-05-06 at 18:58 +0530, Rajalakshmi Srinivasaraghavan wrote:
> >
> > On 05/06/2015 02:44 AM, OndÅej BÃlka wrote:
> > >
> > > It does. AFAIR for practical strings on x64 its only around 3 times slower than
> > > theoretical lower bound.
> > >
> > > A lower bound is strlen call as you must find terminating null when
> > > needle isn't present.
> > >
> > > I hope that main idea was clear and you will
> > > use it. Code I send is complex as it contains ten different speedups
> > > that improve performance.
> > >
> > >
> > >> In my optimization there are totally three branches.
> > >> case 1# if needle len >= 8 (common case)
> > >> case 2# if needle < 8 (common case)
> > >> case 3# For page boundary cross haystack. (rare case)
> > >>
> > > Checking if its case 1 or 2 is relatively costy as in practice haystacks
> > > tend to be short. Most apps that use strstr have 90% call haystack less than 64 bytes long.
> > >
> > > You don't need to know needle size if you know that it doesn't contain
> > > first pair. Also its rare that it contains first pair of characters twice.
> > strlen of needle is needed in all the cases so as to make sure haystack
> > is longer than needle
> > as part of initial check before proceeding to search. Once this check is
> > passed, strchr is
> > called to find the position of first character.
> >
> I think we should move on with (commit) Raji's patch as is.
>
> While we appreciate you suggestions Ondrej, not all optimization apply
> equally to all platforms.
No as you failed a basic item of contributor checklist. All performance
related patches need a benchmark to show its performance improvement and
not regression. A algorithm overview mentioned there is garbage. It
would likely cause performance regression in average case but I don't
have a powerpc machine to run a test and present numbers.