This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Thanks Zanella, I have added my comments below. > > Hi > > Patch looks good in general, however you need to rebase against master > (although the differences are minimal). Some comments below: > > On 19-08-2015 09:51, Dharmakan Rohit Arul Raj wrote: > > Thanks Zanella. > > > > Please find below, patch for optimized implementation of 'memcmp' for > PowerPC e6500 (32-bit & 64-bit) target using Altivec instructions. > > > > * sysdeps/powerpc/bits/hwcap.h: Add macro to identify e6500 target. > > * sysdeps/powerpc/powerpc32/e6500/memcmp.S: New File > > * sysdeps/powerpc/powerpc32/e6500/multiarch/Implies: New File. > > * sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c: > > Add > > memcmp multiarch implementation. > > You need to add the function you are referring to (__libc_ifunc_impl_list). Ok. > > > * sysdeps/powerpc/powerpc32/power4/multiarch/Makefile: Likewise. > > * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c: > Likewise. > > * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp-e6500.S: > New File. > > * sysdeps/powerpc/powerpc64/e6500/memcmp.S: New File. > > * sysdeps/powerpc/powerpc64/e6500/multiarch/Implies: New File > > * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Add > > memcmp multiarch implementation. > > Same as before. Ok. > > * sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise. > > * sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise. > > * sysdeps/powerpc/powerpc64/multiarch/memcmp-e6500.S: New File. > > > > diff -Naur glibc-2.20/sysdeps/powerpc/bits/hwcap.h > > glibc-2.20-e6500-mcmp/sysdeps/powerpc/bits/hwcap.h > > --- glibc-2.20/sysdeps/powerpc/bits/hwcap.h 2014-09-07 > > 03:09:09.000000000 -0500 > > +++ glibc-2.20-e6500-mcmp/sysdeps/powerpc/bits/hwcap.h 2015-08-19 > > +++ 05:48:43.688000596 -0500 > > @@ -64,3 +64,7 @@ > > #define PPC_FEATURE2_HAS_TAR 0x04000000 /* Target Address > >Register */ > > #define PPC_FEATURE2_HAS_VEC_CRYPTO 0x02000000 /* Target > supports > >vector > > instruction. */ > > +/* Identify Freescale Processors. */ #define PPC_FEATURE_E6500 > > +(PPC_FEATURE_64 | PPC_FEATURE_BOOKE | \ > > + PPC_FEATURE_HAS_ALTIVEC | \ > > + PPC_FEATURE_HAS_FPU | > > +PPC_FEATURE_HAS_MMU) <snip> > > diff -Naur > > glibc-2.20/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list. > > c > > glibc-2.20-e6500- > mcmp/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc > > -impl-list.c > > --- > > glibc-2.20/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list. > > c 2014-09-07 03:09:09.000000000 -0500 > > +++ glibc-2.20-e6500- > mcmp/sysdeps/powerpc/powerpc32/power4/multiarch/i > > +++ func-impl-list.c 2015-08-19 05:51:57.389000503 -0500 > > @@ -34,6 +34,7 @@ > > size_t i = 0; > > > > unsigned long int hwcap = GLRO(dl_hwcap); > > + unsigned long int hwcap2 = GLRO(dl_hwcap2); > > /* hwcap contains only the latest supported ISA, the code checks > >which is > > and fills the previous supported ones. */ > > if (hwcap & PPC_FEATURE_ARCH_2_06) > > @@ -107,6 +108,10 @@ > > IFUNC_IMPL (i, name, memcmp, > > IFUNC_IMPL_ADD (array, i, memcmp, hwcap & > >PPC_FEATURE_HAS_VSX, > > __memcmp_power7) > > + IFUNC_IMPL_ADD (array, i, memcmp, > > + (((hwcap & PPC_FEATURE_E6500) == > > +PPC_FEATURE_E6500) > > + && (hwcap2 & PPC_FEATURE2_HAS_ISEL)), > > + __memcmp_e6500) > > Do you really to check for PPC_FEATURE2_HAS_ISEL? There is no 'isel' usage > in implementation. I added that check to differentiate between e6500 and e600 processors, although the check for ' PPC_FEATURE_64' should be enough to differentiate them. If the macro 'PPC_FEATURE_E6500' doesn't overlap with any of the other PowerPC processors, I will remove them and re-submit the patch. Please let me know. <snip> > > I have tested this implementation for powerpc64le (with some modification > to fix the endian issues) and at least for power7/power8 I only see some > gains in the 'small' and 'medium' path, less then 64 bytes, which I think for > power7 memcmp it can be fixed. > > Sometime ago I try to use AVX/VSX instructions on memcmp to get some > more throughput, but I see to no gain in aligned case neither in unaligned > one (which was worse due the fact it required a lot of instruction to isolate > which byte was the different one, like what you do in L(Vwords_Differ)). > > I do not have much knowledge of e6500 core and its contrainst, but did you > try to power7 memcmp version on it to check if it is worth to use? It is > basically the power4 version without the branch prediction instructions. I have rebased the benchtests numbers with glibc v2.20 master and attached the results. Overall, e6500 numbers does seem to be better. Please let me know your comments. Regards, Rohit
Attachment:
benchtest-64bit-e6500-p7-memcmp.txt
Description: benchtest-64bit-e6500-p7-memcmp.txt
Attachment:
benchtest-32bit-e6500-p7-memcmp.txt
Description: benchtest-32bit-e6500-p7-memcmp.txt
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |