This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
- From: Florian Weimer <fweimer at redhat dot com>
- To: leonardo dot sandoval dot gonzalez at linux dot intel dot com
- Cc: libc-alpha at sourceware dot org
- Date: Thu, 06 Dec 2018 10:05:29 +0100
- Subject: Re: [PATCH v2] x86-64: AVX2 Optimize strcat/strcpy/stpcpy routines
- References: <20181205225839.25056-1-leonardo.sandoval.gonzalez@linux.intel.com>
* leonardo sandoval gonzalez:
> Optimize strcat/strcpy/stpcpy routines and its max-offset versions with
> AVX2. It uses vector comparison as much as possible. Observed speedups
> compared to sse2_unaligned:
Shouldn't we keep the stpcpy specialization from the original patch? Or
at least call the new strcpy from a wrapper function? The generic
stpcpy uses strlen plus memcpy, which I believe is slower than calling
the optimized strcpy and recomputing the return value afterwards.
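
To make the two alternatives concrete, here is a rough sketch in plain C.
The function names are hypothetical and are not glibc symbols. The first
function mirrors the strlen-plus-memcpy shape of a generic stpcpy, and the
second is the kind of wrapper I have in mind. It assumes the usual stpcpy
contract (non-overlapping buffers, dst large enough) and says nothing
about which variant is actually faster; that needs benchmarking.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: generic approach.  Measure SRC first, then copy
   len + 1 bytes (including the terminating NUL) with memcpy.  */
static char *
stpcpy_via_memcpy (char *dst, const char *src)
{
  size_t len = strlen (src);
  return (char *) memcpy (dst, src, len + 1) + len;
}

/* Hypothetical helper: wrapper approach.  Let strcpy do the copy, then
   recompute the pointer to the terminating NUL in DST.  */
static char *
stpcpy_via_strcpy (char *dst, const char *src)
{
  strcpy (dst, src);
  return dst + strlen (dst);
}
```

Both variants scan the string twice; the difference is only whether the
copy itself goes through memcpy or through the optimized strcpy.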
How does this new strcpy compare against the old one for short strings?
Thanks,
Florian