This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines

From: Leonardo Sandoval <leonardo dot sandoval dot gonzalez at linux dot intel dot com>
To: Florian Weimer <fweimer at redhat dot com>
Cc: libc-alpha at sourceware dot org
Date: Fri, 07 Dec 2018 12:27:59 -0600
Subject: Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
References: <20181205225839.25056-1-leonardo.sandoval.gonzalez@linux.intel.com> <874lbrdnza.fsf@oldenburg2.str.redhat.com> <6b7c5a9bd3541ab70176b1e56ce04acf783f45f8.camel@linux.intel.com> <874lbpcl4b.fsf@oldenburg2.str.redhat.com>

On Fri, 2018-12-07 at 18:17 +0100, Florian Weimer wrote:
> * Leonardo Sandoval:
> 
> > On Thu, 2018-12-06 at 10:05 +0100, Florian Weimer wrote:
> > > * leonardo sandoval gonzalez:
> > > 
> > > > Optimize strcat/strcpy/stpcpy routines and its max-offset versions
> > > > with AVX2. It uses vector comparison as much as possible. Observed
> > > > speedups compared to sse2_unaligned:
> > > 
> > > > Shouldn't we keep the stpcpy specialization from the original
> > > > patch?  Or at least call the new strcpy from a wrapper function?
> > > > The generic function uses strlen plus memcpy, which I believe is
> > > > slower than calling strcpy and recomputing the return value.
> > 
> > Benchmark numbers for the old and new versions are basically the
> > same, so I would rather stick with this version. The same applies
> > to str?cat.
> > 
> > > How does this new strcpy compare against the old one for short
> > > strings?
> > 
> > strcpy did not change between versions, so we have the same numbers.
> > strcpy code density decreased considerably in this version because
> > all preprocessor macros for non-strcpy functions were removed, but
> > the strcpy code remains the same.
> 
> Okay, the question then is: Why not use strlen + memcpy for strcpy,
> too?
> 

Right, I have not benchmarked that. Let me work on it.
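
For reference, the alternatives discussed above can be sketched in
plain C roughly as follows. This is only an illustration of the three
shapes (stpcpy via strlen + memcpy, stpcpy as a wrapper around strcpy,
and strcpy via strlen + memcpy), not the code in the patch or in the
glibc generic implementation. All function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name: stpcpy built from strlen + memcpy.  */
static char *
stpcpy_via_memcpy (char *dst, const char *src)
{
  size_t len = strlen (src);
  memcpy (dst, src, len + 1);	/* Copy the terminating NUL as well.  */
  return dst + len;		/* Pointer to the NUL in dst.  */
}

/* Hypothetical name: stpcpy as a wrapper around strcpy, recomputing
   the return value afterwards with a second pass over dst.  */
static char *
stpcpy_via_strcpy (char *dst, const char *src)
{
  strcpy (dst, src);
  return dst + strlen (dst);
}

/* Hypothetical name: strcpy built from strlen + memcpy.  */
static char *
strcpy_via_memcpy (char *dst, const char *src)
{
  return memcpy (dst, src, strlen (src) + 1);
}

/* Small self-check: all variants must agree on contents and result.  */
void
check_copy_variants (void)
{
  char a[16], b[16], c[16];
  assert (stpcpy_via_memcpy (a, "glibc") == a + 5);
  assert (stpcpy_via_strcpy (b, "glibc") == b + 5);
  assert (strcpy_via_memcpy (c, "glibc") == c);
  assert (strcmp (a, b) == 0 && strcmp (b, c) == 0);
}
```

Which shape is faster depends on how well the underlying strlen,
memcpy and strcpy are optimized for the target, which is what the
benchmark would have to show.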


> Thanks,
> Florian

References:
- [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
  - From: leonardo . sandoval . gonzalez
- Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
  - From: Florian Weimer
- Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
  - From: Leonardo Sandoval
- Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
  - From: Florian Weimer
