faster expf128
Paul Zimmermann
Paul.Zimmermann@inria.fr
Wed Jun 24 06:22:22 GMT 2020
Dear Paul,
thank you for your feedback.
> From: Paul E Murphy <murphyp@linux.ibm.com>
> Date: Mon, 22 Jun 2020 08:59:08 -0500
>
> On 6/22/20 6:02 AM, Paul Zimmermann wrote:
> > I have written some expf128 for x86_64 that is more than 10 times faster than
> > the current glibc/libquadmath code [1] (see slide 21 of [2]).
>
> I would highly recommend running the benchmarks against ppc64le or s390x
> before replacing the existing implementation. I think it would improve
> the code to have more explicit separation between implementations
> optimized for soft and hardfp if performance cannot be rectified. I
> think much of the float128 support assumes the underlying machine does
> not natively support binary128.
I forgot to say that my code is intended mainly for machines that do not provide
hardware float128 support. However, I did compare it with the glibc
expf128 on gcc135.fsffrance.org (ppc64le GNU/Linux), and the results are
below. You can reproduce them with the code from [1]. We see that
my implementation takes about 27% less time, but is slightly less accurate
(999585 instead of 999999 correct roundings out of 1000000). One caveat,
though: I did not find a way to set the inexact flag efficiently, so
my code does not set it.
glibc function (with hardware float128):
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DUSE_GLIBC -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ ./a.out
GNU libc version: 2.28
GNU libc release: stable
correct roundings: 999999/1000000 max err=1 ulp(s)
maximal error for
x=-4.2166924211009987727735597908208042e+00
y=1.47473419221889191873789731438093288e-02
z=1.47473419221889191873789731438093303e-02
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DTIMINGS -DUSE_GLIBC -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ time ./a.out
GNU libc version: 2.28
GNU libc release: stable
s=1.09651217175878924483994909720534935e+09
real 0m0.195s
user 0m0.194s
sys 0m0.000s
my implementation:
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ ./a.out
correct roundings: 999585/1000000 max err=1 ulp(s)
maximal error for
x=-9.88703896394271837099996910948152675e+00
y=5.08292305698879224291515174794000669e-05
z=5.08292305698879224291515174794000728e-05
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DTIMINGS -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ time ./a.out
s=1.09651217175878924483994909720534935e+09
real 0m0.143s
user 0m0.142s
sys 0m0.000s
> > Before making a proper patch for glibc, I'd like to make sure it fits the
> > glibc requirements. In particular, the table size is 16kb. Is that ok?
> > If too large, what table size would be ok?
>
> I think that is acceptable. The current tables for expf128 probably
> aren't much smaller, if I recall correctly.
OK, then I will prepare a patch once glibc 2.32 is out.
Best regards,
Paul
[1] https://homepages.loria.fr/PZimmermann/glibc-contrib/