This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 4/6] Add new log2 implementation


On 27/06/18 16:47, Joseph Myers wrote:
> On Wed, 27 Jun 2018, Szabolcs Nagy wrote:
>> Improvements on Cortex-A72 compared to current glibc master:
>> latency: 2.0x
>> thruput: 1.9x
>
> Could you clarify this testing more?  If you were testing on AArch64, this
> patch should have resulted in no changes at all to performance, because
> AArch64 uses sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c and you're not
> changing or removing the wordsize-64 version in this patch.


i built the new code outside of the glibc tree (and used static linking
when benchmarking), so the comparison is new log2 code vs current glibc
code, which indeed comes from wordsize-64/. (i did this to be able to
iterate faster on the code during development.)

the comparison is not entirely fair: the new code is measured without
error handling wrappers (since it does not require them), while the
current code is measured with them. removal of the wrappers will come in
separate patches (and may not be ready in time for glibc 2.28). the
wrapper adds around 10% overhead.
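
For readers unfamiliar with the two figures quoted above, here is a minimal sketch of how latency and throughput of a math function can be measured separately. It is not the benchmark used for this patch: the names `now_ns`, `stand_in`, `bench_latency_ns` and `bench_throughput_ns` are hypothetical, and the standard C `clock()` is assumed to be a good enough timer for illustration.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: CPU time in nanoseconds via standard C clock(). */
static double now_ns(void)
{
    return (double)clock() * (1e9 / (double)CLOCKS_PER_SEC);
}

/* Hypothetical stand-in for the function under test, so that the sketch
   does not depend on libm.  Replace with the real function when linking
   against the build being measured. */
double stand_in(double x)
{
    return x * 0.5 + 1.0;
}

/* Latency: the next input depends on the previous result, so successive
   calls cannot overlap.  y * 0.0 keeps the data dependency without
   changing the input (for finite y). */
double bench_latency_ns(double (*f)(double), double x0, size_t n)
{
    double x = x0;
    double t0 = now_ns();
    for (size_t i = 0; i < n; i++) {
        double y = f(x);
        x = x0 + y * 0.0;
    }
    double t1 = now_ns();
    volatile double sink = x;   /* keep the loop from being discarded */
    (void)sink;
    return (t1 - t0) / (double)n;
}

/* Throughput: inputs are independent, so the CPU may overlap calls. */
double bench_throughput_ns(double (*f)(double), const double *in,
                           size_t len, size_t n)
{
    double sum = 0.0;
    double t0 = now_ns();
    for (size_t i = 0; i < n; i++)
        sum += f(in[i % len]);
    double t1 = now_ns();
    volatile double sink = sum; /* keep the loop from being discarded */
    (void)sink;
    return (t1 - t0) / (double)n;
}
```

To compare two implementations in the way described above, one would build this once against each libm (statically linked), pass `log2` instead of `stand_in`, and take the ratio of the reported times per call.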

i ran the glibc test suite, but since the new code was built outside the
tree, that run was not testing the new code. i'll redo that.

> (I expect it would make sense for this patch to remove the wordsize-64
> version.  Generically, it might make sense to see if the dbl-64 functions
> are actually any better for 32-bit systems than the dbl-64/wordsize-64
> versions - if GCC is good enough at generating code for the wordsize-64
> versions on 32-bit systems, reducing the number of variants by using some
> or all of the wordsize-64 versions also on 32-bit systems might make
> sense.)


i'll remove the wordsize-64 variant; the new code should be faster
on both 32-bit and 64-bit targets.

