This is the mail archive of the
`libc-alpha@sourceware.org`
mailing list for the glibc project.


- *From*: Patrick McGehearty <patrick dot mcgehearty at oracle dot com>
- *To*: Joseph Myers <joseph at codesourcery dot com>
- *Cc*: libc-alpha at sourceware dot org
- *Date*: Thu, 30 Nov 2017 18:47:52 -0600
- *Subject*: Re: [PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.
- *Authentication-results*: sourceware.org; auth=none
- *References*: <1510028685-65660-1-git-send-email-patrick.mcgehearty@oracle.com> <alpine.DEB.2.20.1711232107300.28121@digraph.polyomino.org.uk>

Thank you for the continued detailed reviews.

Due to your comments about TBL[2*j] and TBL[2*j+1], I computed exp(x) over 10 million algorithmically generated values for x using both the TBL values used by the Solaris/Studio version of exp() and the TBL values you suggested. There was no case where exp(x) differed. I computed the values for TBL using quad precision and got the same values you recommend.

That got me thinking some more and I realized changing from 32 table entries to 64 table entries was really not that difficult. The values for TBL are generated as you recommend. My next patch submission (coming shortly) will use j/64 with 64 TBL entries for TBL[2*j] and TBL[2*j+1]. That approach gives the same performance with fewer ulp errors.

On that same 10 million value test, I'm seeing roughly 16 differences per 10,000 values instead of 29 differences per 10,000 values with the 32 TBL entry version. In addition, we only see one difference in test-double-exp.out instead of three. The difference is still a single ulp.

I've tested the new version on Sparc and x86.

- patrick

On 11/23/2017 3:19 PM, Joseph Myers wrote:

> On Mon, 6 Nov 2017, Patrick McGehearty wrote:
>
> > @@ -561,8 +561,10 @@ math-CPPFLAGS += -D__NO_MATH_INLINES -D__LIBC_INTERNAL_MATH_INLINES
> >  ifneq ($(long-double-fcts),yes)
> >  # The `double' and `long double' types are the same on this machine.
> >  # We won't compile the `long double' code at all. Tell the `double' code
> > -# to define aliases for the `FUNCl' names.
> > -math-CPPFLAGS += -DNO_LONG_DOUBLE
> > +# to define aliases for the `FUNCl' names. To avoid type conflicts in
> > +# defining those aliases, tell <math.h> to declare the `FUNCl' names with
> > +# `double' instead of `long double'.
> > +math-CPPFLAGS += -DNO_LONG_DOUBLE -D_Mlong_double_=double
> >  endif
> >
> >  # These files quiet sNaNs in a way that is optimized away without
>
> This diff hunk is bogus (reverting a recent change I made) and should
> not be included in this patch.
>
> > +      if (hx < 0x3e300000)
> > +        {
> > +          retval = one + xx.x;
> > +          return (retval);
>
> No parentheses around return value.
>
> > +        }
> > +      retval = one + xx.x * (one + half * xx.x);
> > +      return (retval);
>
> Likewise.
>
> > +      yy.y = xx.x + (t * (half + xx.x * t2) +
> > +                     (t * t) * (t3 + xx.x * t4 + t * t5));
>
> Split lines before an operator, not after.
>
> > +      yy.y = xx.x + (t * (half + xx.x * t2) +
> > +                     (t * t) * (t3 + xx.x * t4 + t * t5));
>
> Likewise.
>
> > +      yy.y = z + (t * (half + (z * t2)) +
> > +                  (t * t) * (t3 + z * t4 + t * t5));
>
> Likewise.
>
> > +      yy.y = z + (t * (half + (z * t2)) +
> > +                  (t * t) * (t3 + z * t4 + t * t5));
>
> Likewise.
>
> > +      return (retval);
>
> Avoid parentheses around return value.
>
> > +      if (ix == 0xfff00000 && xx.i_part[LOW_HALF] == 0)
> > +        return (zero); /* exp(-inf) = 0. */
>
> Likewise.
>
> > +      return (xx.x * xx.x); /* exp(nan/inf) is nan or inf. */
>
> Likewise.
>
> > +      yy.y = z + (t * (half + z * t2) +
> > +                  (t * t) * (t3 + z * t4 + t * t5));
>
> Split line before operator.
>
> > +      yy.y = z + (t * (half + z * t2) +
> > +                  (t * t) * (t3 + z * t4 + t * t5));
>
> Likewise.
>
> > +      return (yy.y);
>
> Remove parentheses.
>
> > /* EXP function tables - for use in ocmputing double precisoin exponential
>
> s/ocmputing/computing/ s/precisoin/precision/
>
> > +/* TBL[2*j] and TBL[2*j+1] are double precision numbers used to
> > +   approximate exp(x) using the formula given in the comments
> > +   for e_exp.c. */
>
> I believe the correct semantics to describe are: TBL[2*j] is 2**(j/32),
> rounded to nearest; TBL[2*j+1] is 2**(j/32) - TBL[2*j], rounded to
> nearest. Now if that's the case, three of the low parts should be
> adjusted by 1ulp because the current values aren't actually rounded to
> nearest (unless you have some concrete reason why the present values,
> that aren't rounded to nearest, are optimal):
>
> > +  0x1.0b5586cf9890fp+0, 0x1.8a62e4adc610ap-54,
>
> 0x1.8a62e4adc610ap-54 should be 0x1.8a62e4adc610bp-54.
>
> > +  0x1.5342b569d4f82p+0, -0x1.07abe1db13cacp-55,
>
> -0x1.07abe1db13cacp-55 should be -0x1.07abe1db13cadp-55.
>
> > +  0x1.d5818dcfba487p+0, 0x1.2ed02d75b3706p-55,
>
> 0x1.2ed02d75b3707p-55 should be 0x1.2ed02d75b3706p-55's replacement:
> 0x1.2ed02d75b3706p-55 should be 0x1.2ed02d75b3707p-55.
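Both the "adjusted by 1ulp" corrections in the review and the "differences per 10,000 values" figures in the reply come down to measuring how many representable doubles separate two values. The following is a minimal sketch of one way to do that, not the test harness actually used for the patch. The names `ordered_bits`, `ulp_distance`, and `count_differing` are hypothetical, and the code assumes `double` is IEEE 754 binary64 stored with the same byte order as `int64_t`.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a non-NaN double onto an integer scale on which neighbouring
   representable values differ by exactly 1 (and -0.0 equals +0.0).
   Assumes IEEE 754 binary64; hypothetical helper.  */
static int64_t
ordered_bits (double d)
{
  int64_t i;

  assert (!isnan (d));
  memcpy (&i, &d, sizeof i);
  /* IEEE doubles are sign-magnitude, so mirror the negative half.  */
  return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles separating A and B.  */
static uint64_t
ulp_distance (double a, double b)
{
  int64_t ia = ordered_bits (a);
  int64_t ib = ordered_bits (b);

  /* Subtract as unsigned so operands of opposite sign cannot overflow.  */
  return (ia > ib
	  ? (uint64_t) ia - (uint64_t) ib
	  : (uint64_t) ib - (uint64_t) ia);
}

/* Count the positions at which two result arrays disagree, for example
   the outputs of two exp implementations on the same N inputs.  */
static size_t
count_differing (const double *a, const double *b, size_t n)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if (ulp_distance (a[i], b[i]) != 0)
      count++;
  return count;
}
```

Under these assumptions, `ulp_distance (0x1.8a62e4adc610ap-54, 0x1.8a62e4adc610bp-54)` evaluates to 1, and the same holds for the other two posted/suggested pairs. Checking that the suggested values are the correctly rounded ones would additionally need a higher-precision evaluation of 2**(j/32), which this sketch does not attempt.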
