This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: strcasestr optimization
- From: Steven Munroe <munroesj at linux dot vnet dot ibm dot com>
- To: OndÅej BÃlka <neleai at seznam dot cz>
- Cc: Rajalakshmi Srinivasaraghavan <raji at linux dot vnet dot ibm dot com>, GNU C Library <libc-alpha at sourceware dot org>, Steve Munroe <sjmunroe at us dot ibm dot com>
- Date: Tue, 02 Jun 2015 13:47:03 -0500
- Subject: Re: [PATCH] powerpc: strcasestr optimization
- Authentication-results: sourceware.org; auth=none
- References: <55687597 dot 1060101 at linux dot vnet dot ibm dot com> <556C36D8 dot 2070208 at linux dot vnet dot ibm dot com> <20150601122830 dot GA14649 at domone> <1433169407 dot 10235 dot 5 dot camel at sjmunroe-ThinkPad-W500> <20150601162215 dot GA8955 at domone>
- Reply-to: munroesj at linux dot vnet dot ibm dot com
On Mon, 2015-06-01 at 18:22 +0200, OndÅej BÃlka wrote:
> On Mon, Jun 01, 2015 at 09:36:47AM -0500, Steven Munroe wrote:
> > On Mon, 2015-06-01 at 14:28 +0200, OndÅej BÃlka wrote:
> > > On Mon, Jun 01, 2015 at 04:11:28PM +0530, Rajalakshmi Srinivasaraghavan wrote:
> > > >
> > > > This patch optimizes strcasestr function for power >= 7 systems.
> > > > This patch uses optimized strlen and strnlen for calculating
> > > > string length and the average improvement of this optimization is ~40%.
> > > > This patch is tested on powerpc64 and powerpc64le.
> > > > Attached the benchresults with this new patch.
> > > >
> > > Thats not enough. As strcasestr that I submited is around three times
> > > slower your implementation would likely be regression over generic one.
> > >
> > > A problem here is that you use moronic algorithm. Fix algorithm first
> > > before trying to optimize it.
> > >
> >
> > This is not very helpful. You are demanding changes without clear
> > explanation and justification.
> >
> > What is wrong with Raja's algorithm? What is insufficient in the
> > benchmark data she has provided? And why do you think your specific
> > design applies to PowerISA and POWER7/POWER8 micro-architecture.
> >
> > What data do you have that justified this objection?
>
> I replied on strstr patch thread on why what she submitted is
> performance regression. So I will repeat arguments from other thread
> which still apply.
>
> First was problem with quadratic behaviour. She tried to fix it but it
> isn't a fix at all. Just benchmark
Which is fixed in the latest patch
>
> strcasestr ("aaa...(4000 times)...aaa", "aaa...(2000 times)...aab")
>
> That call would take around hundred times than before which is
> unacceptable.
>
> If we ignore that red flag second problem was that benchmark she used is
> bogus. It test with periodic haystacks, needle is copy of first bytes of
> haystack with last byte set to something else.
>
Have you looked at he latest patch? If the benchmark provided does not
cover your assertion, then it is your responsibility to provide and
upstream the appropriate benchmark. And convince the community that
benchmark covers a valid issue.
> snip
>
> As microarchitecture one doesn't need to know details to see that
> something is fundamentally wrong. A algorithm here is essentially
> byte-by-byte check which essentially does following
>
That assertion is at odds with my experience. Every day I see code that
somebody thinks is optimized but in reality is not. Current processors
are vastly more complicated then most realize and RISC processors have
stronger interactions between compiler and micro-architecture.
So it is on you have to prove that your generic optimization actually
works for more then one platform.
> snip
>
> As that was covered my final comment was that it isn't substantial
> speedup. My implemantation is generic and gives large boost
> so why should be powerpc different.
>
> As I don't have powerpc access now apply my patches
>
> [PATCH v5] Generic string skeleton
> [PATCH v5 4*] Generic string search functions (strstr, strcasestr, memmem)
>
Every one has access to the the GCC Compile farm
https://gcc.gnu.org/wiki/CompileFarm
So there is no excuse to not have data to back you assertion.