This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] v11 Improves __ieee754_exp() performance by greater than 5x on sparc/x86.
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "patrick dot mcgehearty at oracle dot com" <patrick dot mcgehearty at oracle dot com>, "Szabolcs Nagy" <Szabolcs dot Nagy at arm dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Thu, 8 Feb 2018 11:40:37 +0000
- Subject: Re: [PATCH] v11 Improves __ieee754_exp() performance by greater than 5x on sparc/x86.
> Has there been a serious discussion in the past of to what degree
> of accuracy glibc/libm should support other rounding modes than
> round-to-nearest? If a consensus decision were made that
> other rounding modes were allowed slightly greater ulp diffs,
> we could remove all the rounding mode checks and get
> faster code. Failing that consensus, I don't see how we
> can bypass the rounding mode checks for the generic code.
There have been various discussions, but nothing conclusive. I believe the
rounding mode changes can be removed from all the key math functions if we
accept 1 extra ULP in non-nearest rounding modes. As Szabolcs mentioned
there are some round-to-int idioms used by math functions which rely on a
specific rounding mode, but we can fix those.
If rounding errors in the more complex functions go up (some are very
sensitive to ULP), we could consider adding the rounding mode changes there -
that means you only do it where absolutely necessary, and also in cases where
the relative overhead is much lower.
Or alternatively we could agree that we don't have a requirement to optimize
math functions for absolute best possible ULP with different rounding modes,
and accept larger ULP errors.
> I'll look into comparing removing the slow path on Sparc and
> x86, including running my own "10 million values" test to
> get a sense of how frequently the slow path is triggered
> and what the largest relative error that test observes.
> I'll also run timing tests.
Yes, I noticed that even when the slow path doesn't trigger, it has a significant
overhead (log is 18% faster without the slow paths). Note we'll likely post patches
for removing slow paths in exp and pow as well.