[PATCH v3 2/2] powerpc: Add optimized stpncpy for POWER9

Adhemerval Zanella adhemerval.zanella@linaro.org
Wed Sep 30 13:42:31 GMT 2020



On 29/09/2020 12:21, Raphael Moreira Zinsly via Libc-alpha wrote:
> Add stpncpy support into the POWER9 strncpy.

The benchmark numbers you provided [1] seem to show it is slightly worse than
the generic_strncpy, which uses the same strategy as string/strncpy.c
(which would use VSX instructions through memset/memcpy).  Did you compare this
optimization against an implementation that just calls the power8/9
memset/memcpy instead?

It should result in a smaller implementation, which reduces i-cache usage, and
the code is much simpler and more maintainable.  The same applies to stpncpy.
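
To make the composable approach concrete, here is a rough sketch in plain C.
It is not the glibc source: the names bounded_len, sketch_strncpy and
sketch_stpncpy are hypothetical, and memchr stands in for strnlen only to
keep the example within ISO C.  The point is that all the heavy lifting goes
through memcpy/memset, so whatever optimized versions the platform selects
are reused as-is.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of SRC, capped at N, never reading
   more than N bytes.  Stands in for strnlen.  */
static size_t
bounded_len (const char *src, size_t n)
{
  const char *end = memchr (src, '\0', n);
  return end != NULL ? (size_t) (end - src) : n;
}

/* Hypothetical name; intended to follow the strncpy contract:
   copy at most N bytes and zero-fill the remainder of DEST.  */
char *
sketch_strncpy (char *dest, const char *src, size_t n)
{
  size_t len = bounded_len (src, n);
  if (len != n)
    memset (dest + len, '\0', n - len);
  return memcpy (dest, src, len);
}

/* Hypothetical name; intended to follow the stpncpy contract:
   like strncpy, but return a pointer to the first NUL written,
   or DEST + N if the result is not NUL-terminated.  */
char *
sketch_stpncpy (char *dest, const char *src, size_t n)
{
  size_t len = bounded_len (src, n);
  memcpy (dest, src, len);
  dest += len;
  if (len == n)
    return dest;
  return memset (dest, '\0', n - len);
}
```

Whether this beats a dedicated assembly routine is exactly what the
benchmark comparison would need to show; the sketch only illustrates how
little code the composed version requires.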

I tried to convince Intel developers that such micro-optimizations are not
a real gain, and that instead we should optimize only a handful of string
operations (memcpy/memset/etc.) and use composable implementations for the
rest (as the generic strncpy does).  It still resulted in 1a153e47fcc, but I
think we might do better for powerpc.

[1] https://sourceware.org/pipermail/libc-alpha/2020-September/118049.html

