This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Improves __ieee754_exp() performance by greater than 5x on sparc/x86.
- From: Siddhesh Poyarekar <siddhesh at gotplt dot org>
- To: Patrick McGehearty <patrick dot mcgehearty at oracle dot com>, libc-alpha at sourceware dot org
- Date: Mon, 11 Dec 2017 13:44:38 +0530
On Saturday 09 December 2017 04:33 AM, Patrick McGehearty wrote:
> +/* IBM exp(x) replaced by following exp(x) in 2017. IBM exp1(x,xx) remains. */
> +/* exp(x)
> +   Hybrid algorithm of Peter Tang's Table driven method (for large
> +   arguments) and an accurate table (for small arguments).
> +   Written by K.C. Ng, November 1988.
> +   Revised by Patrick McGehearty, Nov 2017 to use j/64 instead of j/32
> +   Method (large arguments):
> +     1. Argument Reduction: given the input x, find r and integer k
> +        and j such that
> +          x = (k+j/64)*(ln2) + r, |r| <= (1/128)*ln2
> +     2. exp(x) = 2^k * (2^(j/64) + 2^(j/64)*expm1(r))
> +        a. expm1(r) is approximated by a polynomial:
> +             expm1(r) ~ r + t1*r^2 + t2*r^3 + ... + t5*r^6
> +           Here t1 = 1/2 exactly.
> +        b. 2^(j/64) is represented to twice double precision
> +           as TBL[2j]+TBL[2j+1].
> +   Note: If divide were fast enough, we could use another approximation
> +         in 2.a:
> +           expm1(r) ~ (2r)/(2-R), R = r - r^2*(t1 + t2*r^2)
> +         (for the same t1 and t2 as above)
> +   Special cases:
> +     exp(INF) is INF, exp(NaN) is NaN;
> +     exp(-INF)= 0;
> +     for finite argument, only exp(0)=1 is exact.
> +   Accuracy:
> +     According to an error analysis, the error is always less than
> +     an ulp (unit in the last place). The largest errors observed
> +     are less than 0.55 ulp for normal results and less than 0.75 ulp
> +     for subnormal results.
> +   Misc. info.
> +     For IEEE double
> +       if x > 7.09782712893383973096e+02 then exp(x) overflow
> +       if x < -7.45133219101941108420e+02 then exp(x) underflow. */
Are you planning to work on the log implementation as well?