This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Fix atan / atan2 missing underflows (bug 15319) [committed].
- From: Szabolcs Nagy <nsz at port70 dot net>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 18 Feb 2015 22:47:15 +0100
- Subject: Re: Fix atan / atan2 missing underflows (bug 15319) [committed].
- References: <alpine dot DEB dot 2 dot 10 dot 1502182111160 dot 24016 at digraph dot polyomino dot org dot uk>
* Joseph Myers <joseph@codesourcery.com> [2015-02-18 21:11:57 +0000]:
> I saw without that are similar to those Carlos reported for other
> functions, where I haven't seen a response to
> <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>
> confirming if my diagnosis is correct. Arguably all libm functions
> with float and double returns should remove excess range and
> precision, but that's a separate matter.)
>
i think excess precision on return is a useful feature for x87,
and it is sad that c11 silently changed the semantics in a
backward-incompatible way (annex f now requires that return
removes excess precision; it used to require the opposite, so
this breaks existing code. in musl i'm following c99 for now)
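to make the difference concrete, here is a minimal c sketch (function names are made up for illustration, not taken from glibc or musl). it assumes a compiler that honors casts as the standard requires; on i386, gcc only does that with -fexcess-precision=standard or a strict -std= mode.

```c
#include <float.h>

/* Nonzero when double expressions may be evaluated in a wider format
   (FLT_EVAL_METHOD == 2 on x87) or the method is indeterminate (< 0). */
static int double_may_carry_excess(void)
{
    return FLT_EVAL_METHOD == 2 || FLT_EVAL_METHOD < 0;
}

/* With x87 evaluation the product is computed in extended precision.
   Whether that excess survives the return is exactly the c99 vs c11
   difference discussed above. */
static double mul_plain(double a, double b)
{
    return a * b;
}

/* The explicit cast removes excess range and precision under either
   standard, so the caller always sees a value representable as double. */
static double mul_narrowed(double a, double b)
{
    return (double)(a * b);
}
```

with a = 1 + 2^-30, the exact square is 1 + 2^-29 + 2^-60. mul_narrowed(a, a) rounds the last term away, while mul_plain(a, a) may keep it on x87 under the older semantics.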
> + .p2align 3
> + .type dbl_min,@object
> +dbl_min: .byte 0, 0, 0, 0, 0, 0, 0x10, 0
> + ASM_SIZE_DIRECTIVE(dbl_min)
> +
> +#ifdef PIC
> +# define MO(op) op##@GOTOFF(%ecx)
> +#else
> +# define MO(op) op
> +#endif
> +
> + .text
> ENTRY(__ieee754_atan2)
> +#ifdef PIC
> + LOAD_PIC_REG (cx)
> +#endif
> fldl 4(%esp)
> fldl 12(%esp)
> fpatan
> - ret
> + fldl MO(dbl_min)
> + fld %st(1)
> + fabs
> + fucompp
> + fnstsw
> + sahf
> + jnc 1f
> + subl $8, %esp
> + cfi_adjust_cfa_offset (8)
> + fld %st(0)
> + fmul %st(0)
> + fstpl (%esp)
> + fstpl (%esp)
> + fldl (%esp)
> + addl $8, %esp
> + cfi_adjust_cfa_offset (-8)
> +1: ret
fwiw, i solved the underflow raising in musl libc with
	fsts 4(%esp)
(simply storing the result as a single precision float raises underflow)
(i did the subnormal check differently too, but i don't remember
if there was a performance difference or just fewer instructions)