This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH 1/3] aarch64: Optimized memset specific to AmpereComputing emag


On 20/12/18 2:45 PM, Siddhesh Poyarekar wrote:
> On 18/12/18 3:33 PM, Feng Xue wrote:
>> This version uses general register based memory store instead of
>> vector register based, for the former is faster than the latter
>> in emag.
>
> Barring a couple of instances that show a 5%+ difference from __memset_generic (which also seems sporadic, maybe due to noise?), everything else seems to be in the 1-2% range.  Is that a significant enough difference to warrant a new variant?
>
> It may not be worth adding another variant for a mere 1-2% overall gain for string routines but maybe I've misread the results and you have a better justification for this.  Please let me know if you do.

Ugh, I have in fact misread the results; sorry about that. The compare_strings.py script has a -b flag that changes the baseline to compare against; you can use it to show results relative to __memset_generic, which makes the comparison clearer.
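
To illustrate what "relative to __memset_generic" means, here is a minimal Python sketch of a baseline-relative comparison. It is not the compare_strings.py implementation; the function name, the __memset_emag variant name, and the timings are all hypothetical and chosen only for illustration.

```python
def relative_to_baseline(timings, baseline):
    """Return each variant's percentage difference from the baseline timing.

    A negative value means the variant took less time than the baseline.
    """
    base = timings[baseline]
    return {name: (t - base) * 100.0 / base for name, t in timings.items()}


# Hypothetical mean timings for a single input size (not measured data).
timings = {"__memset_generic": 100.0, "__memset_emag": 95.0}

diffs = relative_to_baseline(timings, "__memset_generic")
for name, pct in diffs.items():
    print(f"{name}: {pct:+.2f}%")
```

With the baseline fixed to the generic variant, every other column reads directly as a gain or loss against it, which is easier to judge than comparing against whichever variant happens to come first.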

Looks fine to me.

Siddhesh