This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: strcasestr optimization


On Mon, 2015-06-01 at 18:22 +0200, OndÅej BÃlka wrote:
> On Mon, Jun 01, 2015 at 09:36:47AM -0500, Steven Munroe wrote:
> > On Mon, 2015-06-01 at 14:28 +0200, OndÅej BÃlka wrote:
> > > On Mon, Jun 01, 2015 at 04:11:28PM +0530, Rajalakshmi Srinivasaraghavan wrote:
> > > > 
> > > > This patch optimizes strcasestr function for power >= 7 systems.
> > > > This patch uses optimized strlen and strnlen for calculating
> > > > string length and the average improvement of this optimization is ~40%.
> > > > This patch is tested on powerpc64 and powerpc64le.
> > > > Attached the benchresults with this new patch.
> > > > 
> > > Thats not enough. As strcasestr that I submited is around three times
> > > slower your implementation would likely be regression over generic one.
> > > 
> > > A problem here is that you use moronic algorithm. Fix algorithm first
> > > before trying to optimize it.
> > > 
> > 
> > This is not very helpful. You are demanding changes without clear
> > explanation and justification.
> > 
> > What is wrong with Raja's algorithm? What is insufficient in the
> > benchmark data she has provided? And why do you think your specific
> > design applies to PowerISA and POWER7/POWER8 micro-architecture.
> > 
> > What data do you have that justified this objection?
> 
> I replied on strstr patch thread on why what she submitted is
> performance regression. So I will repeat arguments from other thread
> which still apply.
> 
> First was problem with quadratic behaviour. She tried to fix it but it
> isn't a fix at all. Just benchmark 

Which is fixed in the latest patch
> 
> strcasestr ("aaa...(4000 times)...aaa", "aaa...(2000 times)...aab")
> 
> That call would take around hundred times than before which is
> unacceptable.
> 
> If we ignore that red flag second problem was that benchmark she used is
> bogus. It test with periodic haystacks, needle is copy of first bytes of
> haystack with last byte set to something else.
> 
Have you looked at he latest patch? If the benchmark provided does not
cover your assertion, then it is your responsibility to provide and
upstream the appropriate benchmark. And convince the community that
benchmark covers a valid issue.

> snip
> 
> As microarchitecture one doesn't need to know details to see that
> something is fundamentally wrong. A algorithm here is essentially
> byte-by-byte check which essentially does following
> 

That assertion is at odds with my experience. Every day I see code that
somebody thinks is optimized but in reality is not. Current processors
are vastly more complicated then most realize and RISC processors have
stronger interactions between compiler and micro-architecture.

So it is on you have to prove that your generic optimization actually
works for more then one platform.

> snip

> 
> As that was covered my final comment was that it isn't substantial
> speedup. My implemantation is generic and gives large boost 
> so why should be powerpc different.
> 
> As I don't have powerpc access now apply my patches
> 
> [PATCH v5] Generic string skeleton
> [PATCH v5 4*] Generic string search functions (strstr, strcasestr, memmem)
> 
Every one has access to the the GCC Compile farm 

https://gcc.gnu.org/wiki/CompileFarm

So there is no excuse to not have data to back you assertion.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]