This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 02/10] Improve performance of sincosf
- From: Joseph Myers <joseph at codesourcery dot com>
- To: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, <nd at arm dot com>, Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Date: Tue, 7 Aug 2018 20:11:05 +0000
- Subject: Re: [PATCH 02/10] Improve performance of sincosf
- References: <d4963b4e-0013-adaf-df2f-698cf421a487@arm.com> <725dd4a7-f002-65da-4e5c-370cb78c3e77@arm.com> <e9ca9a7b-05df-715e-0e4b-b5c80be344b2@arm.com>
On Wed, 11 Jul 2018, Szabolcs Nagy wrote:
> +/* Fast sincosf implementation. Worst-case ULP is 0.56072, maximum relative
> + error is 0.5303p-23. A single-step signed range reduction is used for
I don't think this funny mixture of decimal and hex float notation, not a
valid constant in C, is a good idea. (Saying 0.5303*0x1p-23 or similar
would be fine, if that's what you mean.)
> +#if WANT_ERRNO
> + /* Needed to set errno for +-Inf, the add is a hack to work
> + around a gcc register allocation issue: just passing y
> + affects code generation in the fast path. */
> + __math_invalidf (y + y);
Is there a GCC bug filed for this? A reference to a specific bug number
is much better than "a gcc register allocation issue" for anyone trying to
tell in future if the issue is still relevant or not.
> +/* PI * 2^-64. */
> +static const double pi64 = 0x1.921FB54442D18p-62;
That's actually PI * 2^-63, so the comment and variable name are
misleading.
> + Use round/lround if inlined, otherwise convert to int. To avoid inaccuracies
> + introduced by truncating negative values, compute the quadrant * 2^24. */
I think this last statement is actually describing what the
[!TOINT_INTRINSICS] code does, rather than part of the interface of the
function. So it should be inside that part of the #if inside the
function; the comment above the function should be about what the
function's interface is, not avoid such details of the implementation.
However, the comment relates to *another* interface - that is, what the
semantics of the hpi_inv member of sincos_t are. So I think the
definition of the sincos_t struct should have comments explaining the
semantics of its members, and the one on hpi_inv would explain the
dependence on TOINT_INTRINSICS.
> +/* Reduce the range of XI to a multiple of PI/4 using fast integer arithmetic.
> + XI is a reinterpreted float and must be >= 2.0f (the sign bit is ignored).
> + Return the modulo between -PI/4 and PI/4 and store the quadrant in NP.
Aren't you actually reducing to an interval of length PI/2 (not PI/4)
(it's not really clear to me what "to a multiple of PI/4" is meant to
mean), with the result thus in the range -PI/4 and PI/4?
(Whether __inv_pio4 is really a table of 4/PI or 2/PI is a matter of
semantics for how you say the bits of that number are stored in that
table. But ultimately you end up with a signed 62-bit fractional part of
x*2/PI and so need to multiply by PI/2/2^62, which is why the value of
your pi64 constant above is appropriate for this multiplication even
though the name and comment are wrong.)
--
Joseph S. Myers
joseph@codesourcery.com