This is the mail archive of the
`libc-alpha@sourceware.org`
mailing list for the glibc project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: Patrick McGehearty <patrick dot mcgehearty at oracle dot com>*To*: libc-alpha at sourceware dot org*Date*: Mon, 11 Dec 2017 10:59:54 -0600*Subject*: Re: [PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.*Authentication-results*: sourceware.org; auth=none*References*: <1512774211-43942-1-git-send-email-patrick.mcgehearty@oracle.com> <e4e7a32d-d95f-00f4-3fd2-60fc04a10824@gotplt.org>

On 12/11/2017 2:14 AM, Siddhesh Poyarekar wrote:

On Saturday 09 December 2017 04:33 AM, Patrick McGehearty wrote:+/* IBM exp(x) replaced by following exp(x) in 2017. IBM exp1(x,xx) remains. */ +/* exp(x) + Hybrid algorithm of Peter Tang's Table driven method (for large + arguments) and an accurate table (for small arguments). + Written by K.C. Ng, November 1988. + Revised by Patrick McGehearty, Nov 2017 to use j/64 instead of j/32 + Method (large arguments): + 1. Argument Reduction: given the input x, find r and integer k + and j such that + x = (k+j/64)*(ln2) + r, |r| <= (1/128)*ln2 + + 2. exp(x) = 2^k * (2^(j/64) + 2^(j/64)*expm1(r)) + a. expm1(r) is approximated by a polynomial: + expm1(r) ~ r + t1*r^2 + t2*r^3 + ... + t5*r^6 + Here t1 = 1/2 exactly. + b. 2^(j/64) is represented to twice double precision + as TBL[2j]+TBL[2j+1]. + + Note: If divide were fast enough, we could use another approximation + in 2.a: + expm1(r) ~ (2r)/(2-R), R = r - r^2*(t1 + t2*r^2) + (for the same t1 and t2 as above) + + Special cases: + exp(INF) is INF, exp(NaN) is NaN; + exp(-INF)= 0; + for finite argument, only exp(0)=1 is exact. + + Accuracy: + According to an error analysis, the error is always less than + an ulp (unit in the last place). The largest errors observed + are less than 0.55 ulp for normal results and less than 0.75 ulp + for subnormal results. + + Misc. info. + For IEEE double + if x > 7.09782712893383973096e+02 then exp(x) overflow + if x < -7.45133219101941108420e+02 then exp(x) underflow. */ +Are you planning to work on the log implementation as well? Siddhesh

log, log10, pow, cbrt are on my short list of functions to investigate as these all show significant performance advantage (1.8x or greater) using the Solaris Studio libm functions vs the Linux libm functions in the preliminary testing I did months ago. I intend to take these one at a time, first with a trial port and extensive accuracy and perf tests. Assuming the performance advantage applies across multiple platforms and accuracy does not suffer to an unacceptable degree, I will propose each in turn for patching. "log" is my next target, likely with log10 following close behind. I'll also look at the 32 bit functions to see if they offer similar opportunities, but I haven't written/run those tests yet. I don't have plans to work on 80 bit or 128 bit (long double) functions at this time. As always, the above is not a formal commitment as my management may redirect efforts to meet other corporate goals. My background project right now is supporting Vladimir Mezentsev's work on improving accuracy and range of complex divide. By range, I mean the range of input values which currently cause overflow/underflow but don't need to with appropriate scaling. - patrick

**Follow-Ups**:**Re: [PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.***From:*Siddhesh Poyarekar

**References**:**[PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.***From:*Patrick McGehearty

**Re: [PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.***From:*Siddhesh Poyarekar

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |