This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines


On Fri, 2018-12-07 at 18:17 +0100, Florian Weimer wrote:
> * Leonardo Sandoval:
> 
> > On Thu, 2018-12-06 at 10:05 +0100, Florian Weimer wrote:
> > > * leonardo sandoval gonzalez:
> > > 
> > > > Optimize strcat/strcpy/stpcpy routines and their max-offset versions
> > > > with AVX2. It uses vector comparison as much as possible. Observed
> > > > speedups compared to sse2_unaligned:
> > > 
> > > Shouldn't we keep the stpcpy specialization from the original patch?
> > > Or at least call the new strcpy from a wrapper function?  The generic
> > > function uses strlen plus memcpy, which I believe is slower than
> > > calling strcpy and recomputing the return value.
> > 
> > Benchmark numbers for the old and new versions are basically the same,
> > so I would rather stick with this version. The same applies to str?cat.
> > 
> > > How does this new strcpy compare against the old one for short strings?
> > 
> > strcpy did not change between versions, so we have the same numbers.
> > strcpy code density decreased considerably in this version because all
> > preprocessor macros for non-strcpy functions were removed, but the
> > strcpy code itself remains the same.
> 
> Okay, the question then is: Why not use strlen + memcpy for strcpy,
> too?
> 

Right, I have not benchmarked that. Let me work on it.


> Thanks,
> Florian
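
For reference, the alternatives discussed in this thread can be sketched
in plain C roughly as follows. This is only an illustration of the two
strategies (strlen plus memcpy versus a strcpy wrapper that recomputes the
return value), not the glibc implementation or the code from the patch.
The function names are hypothetical and chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name.  stpcpy built from strlen plus memcpy, the strategy
   the thread attributes to the generic function.  Returns a pointer to
   the terminating null byte written into DEST.  */
static char *
stpcpy_via_strlen_memcpy (char *dest, const char *src)
{
  size_t len = strlen (src);
  memcpy (dest, src, len + 1);	/* Copy the null terminator as well.  */
  return dest + len;
}

/* Hypothetical name.  stpcpy as a wrapper around strcpy that recomputes
   the return value afterwards by measuring the copied string.  */
static char *
stpcpy_via_strcpy (char *dest, const char *src)
{
  strcpy (dest, src);
  return dest + strlen (dest);
}

/* Hypothetical name.  The variant raised in the last question: strcpy
   itself expressed as strlen plus memcpy.  memcpy returns DEST, which is
   also what strcpy must return.  */
static char *
strcpy_via_strlen_memcpy (char *dest, const char *src)
{
  return memcpy (dest, src, strlen (src) + 1);
}
```

All three variants walk the string twice in some form, so which one wins
presumably depends on how fast the underlying strlen, memcpy and strcpy
are on the target and on the string lengths involved. That is what the
pending benchmark would have to show.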

