[PATCH v3 2/2] powerpc: Add optimized stpncpy for POWER9
Raphael M Zinsly
rzinsly@linux.ibm.com
Wed Sep 30 14:21:09 GMT 2020
Hi Adhemerval,
On 30/09/2020 10:42, Adhemerval Zanella wrote:
>
>
> On 29/09/2020 12:21, Raphael Moreira Zinsly via Libc-alpha wrote:
>> Add stpncpy support into the POWER9 strncpy.
>
> The benchmark numbers you provided [1] seem to show it is slightly worse than
> the generic_strncpy, which uses the same strategy as string/strncpy.c
> (which would use VSX instructions through memset/memcpy).
My implementation is always better than generic_strncpy, almost
three times better on average. And it calls memset as well.
Are you talking about __strncpy_ppc? For some reason it is using
strnlen_ppc instead of the strnlen_power8, but I didn't touch it.
> Did you compare this
> optimization against an implementation that just calls power8/9 memset/memcpy
> instead?
>
I'm not sure I understand. Isn't that what generic_strncpy and strncpy_ppc do?
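
For reference, a minimal sketch of the composable strategy being discussed
(find the length, then memcpy the string and memset the padding). This is an
illustration only, not the glibc source: the names sketch_strnlen and
sketch_strncpy are hypothetical, and memchr stands in for strnlen so that the
example stays in standard C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bounded length, standing in for strnlen.  */
static size_t
sketch_strnlen (const char *s, size_t maxlen)
{
  const char *end = memchr (s, '\0', maxlen);
  return end != NULL ? (size_t) (end - s) : maxlen;
}

/* Hypothetical name: strncpy built only from memcpy and memset, so it
   inherits whatever optimized versions of those the platform selects.  */
char *
sketch_strncpy (char *dst, const char *src, size_t n)
{
  size_t len = sketch_strnlen (src, n);
  /* Zero-fill the tail only when src is shorter than n.  */
  if (len != n)
    memset (dst + len, '\0', n - len);
  /* memcpy returns dst, which is also what strncpy returns.  */
  return memcpy (dst, src, len);
}
```

With this shape, any speedup in memcpy/memset carries over to strncpy without
a separate assembly implementation.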
> It should result in a smaller implementation, which reduces the i-cache
> footprint, and the code is much simpler and more maintainable. The same
> applies to stpncpy.
>
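
A composable stpncpy could look roughly like the sketch below. It is only an
illustration under the same assumptions as the discussion above (length scan,
then memcpy and memset); the names sketch_bounded_len and sketch_stpncpy are
hypothetical, and memchr stands in for strnlen to keep it in standard C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bounded length, standing in for strnlen.  */
static size_t
sketch_bounded_len (const char *s, size_t maxlen)
{
  const char *end = memchr (s, '\0', maxlen);
  return end != NULL ? (size_t) (end - s) : maxlen;
}

/* Hypothetical name: stpncpy built only from memcpy and memset.  */
char *
sketch_stpncpy (char *dst, const char *src, size_t n)
{
  size_t len = sketch_bounded_len (src, n);
  memcpy (dst, src, len);
  dst += len;
  /* No terminator fits: return the end of the copied bytes.  */
  if (len == n)
    return dst;
  /* memset returns its first argument, i.e. the first null written.  */
  return memset (dst, '\0', n - len);
}
```

The only difference from the strncpy case is the return value, which points
at the first null byte written (or at dst + n when src is not shorter than n).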
> I tried to convince Intel developers that such micro-optimizations are not
> a real gain and that we should instead optimize only a handful of string
> operations (memcpy/memset/etc.) and use composable implementations
> (as generic strncpy does). It still resulted in 1a153e47fcc, but I think we
> might do better for powerpc.
>
> [1] https://sourceware.org/pipermail/libc-alpha/2020-September/118049.html
>
Best Regards,
--
Raphael Moreira Zinsly
IBM
Linux on Power Toolchain