This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Improves __ieee754_exp() performance by 2x-5x on sparc/x86.

From: Patrick McGehearty <patrick dot mcgehearty at oracle dot com>
To: Joseph Myers <joseph at codesourcery dot com>
Cc: libc-alpha at sourceware dot org
Date: Wed, 3 Jan 2018 16:22:26 -0600
Subject: Re: [PATCH] Improves __ieee754_exp() performance by 2x-5x on sparc/x86.
Authentication-results: sourceware.org; auth=none
References: <1515007683-3161-1-git-send-email-patrick.mcgehearty@oracle.com> <alpine.DEB.2.20.1801032051260.17329@digraph.polyomino.org.uk>

On 1/3/2018 2:57 PM, Joseph Myers wrote:

On Wed, 3 Jan 2018, Patrick McGehearty wrote:

Using the glibc perf tests on sparc,
       sparc (nsec)    x86 (nsec)
       old     new     old     new
max   17629   395    5173     144
min     399    54      15      13
mean   5317   200    1349      23

These are exactly the same figures as in the previous version that was
reverted.  Are you *sure* they are the correct results of retesting with
this patch version?  Remember, the patch submission message must be
correct for the patch version being submitted and there must be an
unambiguous division between the content intended for the commit message
(describing the current patch version) and any other content such as
descriptions of differences from previous patch versions (which don't go
in the commit message).

I'd suggest following the conventions used by the Linux kernel, of the
description of the current patch version for the commit message going
*first*, then a line "---", then any other text not for the commit message
(such as information about differences from previous patch versions), then
the diff itself.

In addition, because ieee754_exp() is used by other routines, cexp()
showed test results with very small imaginary input values where the
imaginary portion of the result was off by 3 ulp when in upward
rounding mode, but not in the other rounding modes.  For x86, tgamma
showed a few values where the ulp increased to 6 (max ulp for tgamma
is 5). Sparc tgamma did not show these failures.  I presume the tgamma
differences are due to compiler optimization differences within the
gamma function.The gamma function is known to be difficult to compute
accurately.

Is this still accurate for the current patch version?

If you tested on a platform requiring ulps updates, you should include the
updates from "make regen-ulps" as part of the patch, and explicitly
confirm that, given the patch applied including those ulps updates, all
the math/ tests PASS on platforms that you specify in the patch submission
message.

I have not been updating any of the earlier comments when I responded
to the later comments. I just added the newest comment to the top of
the earlier patch comments to allow easy access to this PATCH history.
If we need another patch, I'll rework the comments as you suggest.

Here is the latest update for the performance tables:

All numbers rerun with the Jan 2, 2018 src code base:
      sparc (nsec)    x86 (nsec)
      old     new     old     new
max   17584   873    5158    1380
min     399    96      15      16
mean   5497   419    1333      24

Modest increases on x86 but roughly doubling on Sparc makes me think
that SET_RESTORE_ROUND has an opportunity for improvement on Sparc.

For the accuracy issue, we no longer see any failures on cexp() or
tgamma(). The tgamma() failure went away with the SET_RESTORE_ROUND
fix. As part of the SET_RESTORE_ROUND fix, I realized I had missed
handling rounding for a special case. When I fixed that, the cexp()
errors went away.

There only test value that fails now is:
Failure: Test: exp (-0x2.c469ep+8)
Result:
 is:          2.1872267636802106e-308   0x0.fba54632dddc0p-1022
 should be:   2.1872267636802101e-308   0x0.fba54632dddbfp-1022
 difference:  4.9406564584124654e-324   0x0.0000000000001p-1022
 ulp       :  1.0000
 max.ulp   :  0.0000

[Side note: this diff is still present with the 256 size table.]

I'm not clear about how "make regen-ulps" is part of the patch process.
I've not seen any discussion of that before.
Do I need to include the changes to the ulp test data as part
of the patch? I had assumed that was handled as a separate
patch after the code changes were merged.

Also, when I run "make check" on the base code for Sparc, I'm getting
some additional ulp failures for cpow, ctan, ctanh and a few others.
These did not occur with my 8th patch submission but did with my
9th patch. They do not occur on the x86 tests.
These ulp failures are unchanged by my exp() code, so I've
concluded they are not related to my work.

- patrick

Follow-Ups:
- Re: [PATCH] Improves __ieee754_exp() performance by 2x-5x on sparc/x86.
  - From: Joseph Myers

References:
- [PATCH] Improves __ieee754_exp() performance by 2x-5x on sparc/x86.
  - From: Patrick McGehearty
- Re: [PATCH] Improves __ieee754_exp() performance by 2x-5x on sparc/x86.
  - From: Joseph Myers

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]