[PATCH v3 2/2] powerpc: Add optimized stpncpy for POWER9

Raphael M Zinsly rzinsly@linux.ibm.com
Wed Sep 30 14:21:09 GMT 2020


Hi Adhemerval,

On 30/09/2020 10:42, Adhemerval Zanella wrote:
> 
> 
> On 29/09/2020 12:21, Raphael Moreira Zinsly via Libc-alpha wrote:
>> Add stpncpy support into the POWER9 strncpy.
> 
> The benchmark numbers you provided [1] seem to show it is slightly worse than
> the generic_strncpy, which uses the same strategy as string/strncpy.c
> (which would use VSX instructions through memset/memcpy).

My implementation is always better than generic_strncpy, almost
three times better on average, and it calls memset as well.

Are you talking about __strncpy_ppc? For some reason it is using
strnlen_ppc instead of strnlen_power8, but I didn't touch that.

> Did you compare this
> optimization against an implementation that just calls the power8/9
> memset/memcpy instead?
> 

I'm not sure I understand. Isn't that exactly what generic_strncpy and
strncpy_ppc are?
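
For reference, a minimal sketch of the composable strategy being discussed,
where strncpy and stpncpy are built only from a bounded length scan plus
memcpy and memset. The sketch_* names are hypothetical and not taken from the
patch or from glibc, and memchr stands in for strnlen so that the example
needs only ISO C. It illustrates the approach and is not the code that was
benchmarked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of src capped at n (the job strnlen does),
   written with memchr so the sketch needs only ISO C.  */
static size_t
sketch_strnlen (const char *src, size_t n)
{
  const char *end = memchr (src, '\0', n);
  return end != NULL ? (size_t) (end - src) : n;
}

/* strncpy semantics: copy at most n bytes, zero-fill the remainder,
   and return dest.  */
char *
sketch_strncpy (char *dest, const char *src, size_t n)
{
  size_t len = sketch_strnlen (src, n);
  if (len != n)
    memset (dest + len, '\0', n - len);
  return memcpy (dest, src, len);
}

/* stpncpy semantics: same copy and padding, but return a pointer to the
   first padding NUL, or dest + n when src did not fit.  */
char *
sketch_stpncpy (char *dest, const char *src, size_t n)
{
  size_t len = sketch_strnlen (src, n);
  memcpy (dest, src, len);
  if (len != n)
    memset (dest + len, '\0', n - len);
  return dest + len;
}
```

With this shape, any speedup comes from the underlying memcpy/memset/strnlen
variants selected for the platform, which is the trade-off against a
dedicated assembly routine.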


> It should result in a smaller implementation, which reduces the i-cache
> footprint, and the code is much simpler and more maintainable.  The same
> applies to stpncpy.
> 
> I tried to convince Intel developers that such micro-optimizations are not
> a real gain and that we should instead optimize only a handful of string
> operations (memcpy/memset/etc.) and use composable implementations
> (such as the generic strncpy).  It still resulted in 1a153e47fcc, but I
> think we might do better for powerpc.
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2020-September/118049.html
> 

Best Regards,
-- 
Raphael Moreira Zinsly
IBM
Linux on Power Toolchain

