]>
Commit | Line | Data |
---|---|---|
6e953631 UD |
1 | The following functions for the `long double' versions of the libm |
2 | function have to be written: | |
3 | ||
4 | e_acosl.c | |
5 | e_asinl.c | |
6 | e_atan2l.c | |
6e953631 UD |
7 | e_expl.c |
8 | e_fmodl.c | |
9 | e_hypotl.c | |
10 | e_j0l.c | |
11 | e_j1l.c | |
12 | e_jnl.c | |
13 | e_lgammal_r.c | |
14 | e_logl.c | |
15 | e_log10l.c | |
16 | e_powl.c | |
17 | e_rem_pio2l.c | |
18 | e_sinhl.c | |
19 | e_sqrtl.c | |
20 | ||
21 | k_cosl.c | |
22 | k_rem_pio2l.c | |
23 | k_sinl.c | |
24 | k_tanl.c | |
25 | ||
26 | s_atanl.c | |
6e953631 UD |
27 | s_erfl.c |
28 | s_expm1l.c | |
29 | s_log1pl.c | |
2c6fe0bd UD |
30 | |
31 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
32 | Methods | |
33 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
34 | arcsin | |
35 | ~~~~~~ | |
36 | * Since asin(x) = x + x^3/6 + x^5*3/40 + x^7*15/336 + ... | |
37 | * we approximate asin(x) on [0,0.5] by | |
38 | * asin(x) = x + x*x^2*R(x^2) | |
39 | * where | |
40 | * R(x^2) is a rational approximation of (asin(x)-x)/x^3 | |
41 | * and its remez error is bounded by | |
42 | * |(asin(x)-x)/x^3 - R(x^2)| < 2^(-58.75) | |
43 | * | |
44 | * For x in [0.5,1] | |
45 | * asin(x) = pi/2-2*asin(sqrt((1-x)/2)) | |
46 | * Let y = (1-x), z = y/2, s := sqrt(z), and pio2_hi+pio2_lo=pi/2; | |
47 | * then for x>0.98 | |
48 | * asin(x) = pi/2 - 2*(s+s*z*R(z)) | |
49 | * = pio2_hi - (2*(s+s*z*R(z)) - pio2_lo) | |
50 | * For x<=0.98, let pio4_hi = pio2_hi/2, then | |
51 | * f = hi part of s; | |
52 | * c = sqrt(z) - f = (z-f*f)/(s+f) ...f+c=sqrt(z) | |
53 | * and | |
54 | * asin(x) = pi/2 - 2*(s+s*z*R(z)) | |
55 | * = pio4_hi+(pio4-2s)-(2s*z*R(z)-pio2_lo) | |
56 | * = pio4_hi+(pio4-2f)-(2s*z*R(z)-(pio2_lo+2c)) | |
57 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
58 | arccos | |
59 | ~~~~~~ | |
60 | * Method : | |
61 | * acos(x) = pi/2 - asin(x) | |
62 | * acos(-x) = pi/2 + asin(x) | |
63 | * For |x|<=0.5 | |
64 | * acos(x) = pi/2 - (x + x*x^2*R(x^2)) (see asin.c) | |
65 | * For x>0.5 | |
66 | * acos(x) = pi/2 - (pi/2 - 2asin(sqrt((1-x)/2))) | |
67 | * = 2asin(sqrt((1-x)/2)) | |
68 | * = 2s + 2s*z*R(z) ...z=(1-x)/2, s=sqrt(z) | |
69 | * = 2f + (2c + 2s*z*R(z)) | |
70 | * where f=hi part of s, and c = (z-f*f)/(s+f) is the correction term | |
71 | * for f so that f+c ~ sqrt(z). | |
72 | * For x<-0.5 | |
73 | * acos(x) = pi - 2asin(sqrt((1-|x|)/2)) | |
74 | * = pi - 0.5*(s+s*z*R(z)), where z=(1-|x|)/2,s=sqrt(z) | |
75 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
76 | atan2 | |
77 | ~~~~~ | |
78 | * Method : | |
79 | * 1. Reduce y to positive by atan2(y,x)=-atan2(-y,x). | |
80 | * 2. Reduce x to positive by (if x and y are unexceptional): | |
81 | * ARG (x+iy) = arctan(y/x) ... if x > 0, | |
82 | * ARG (x+iy) = pi - arctan[y/(-x)] ... if x < 0, | |
83 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
84 | atan | |
85 | ~~~~ | |
86 | * Method | |
87 | * 1. Reduce x to positive by atan(x) = -atan(-x). | |
88 | * 2. According to the integer k=4t+0.25 chopped, t=x, the argument | |
89 | * is further reduced to one of the following intervals and the | |
90 | * arctangent of t is evaluated by the corresponding formula: | |
91 | * | |
92 | * [0,7/16] atan(x) = t-t^3*(a1+t^2*(a2+...(a10+t^2*a11)...) | |
93 | * [7/16,11/16] atan(x) = atan(1/2) + atan( (t-0.5)/(1+t/2) ) | |
94 | * [11/16.19/16] atan(x) = atan( 1 ) + atan( (t-1)/(1+t) ) | |
95 | * [19/16,39/16] atan(x) = atan(3/2) + atan( (t-1.5)/(1+1.5t) ) | |
96 | * [39/16,INF] atan(x) = atan(INF) + atan( -1/t ) | |
97 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
98 | exp | |
99 | ~~~ | |
100 | * Method | |
101 | * 1. Argument reduction: | |
102 | * Reduce x to an r so that |r| <= 0.5*ln2 ~ 0.34658. | |
103 | * Given x, find r and integer k such that | |
104 | * | |
105 | * x = k*ln2 + r, |r| <= 0.5*ln2. | |
106 | * | |
107 | * Here r will be represented as r = hi-lo for better | |
108 | * accuracy. | |
109 | * | |
110 | * 2. Approximation of exp(r) by a special rational function on | |
111 | * the interval [0,0.34658]: | |
112 | * Write | |
113 | * R(r**2) = r*(exp(r)+1)/(exp(r)-1) = 2 + r*r/6 - r**4/360 + ... | |
114 | * We use a special Reme algorithm on [0,0.34658] to generate | |
115 | * a polynomial of degree 5 to approximate R. The maximum error | |
116 | * of this polynomial approximation is bounded by 2**-59. In | |
117 | * other words, | |
118 | * R(z) ~ 2.0 + P1*z + P2*z**2 + P3*z**3 + P4*z**4 + P5*z**5 | |
119 | * (where z=r*r, and the values of P1 to P5 are listed below) | |
120 | * and | |
121 | * | 5 | -59 | |
122 | * | 2.0+P1*z+...+P5*z - R(z) | <= 2 | |
123 | * | | | |
124 | * The computation of exp(r) thus becomes | |
125 | * 2*r | |
126 | * exp(r) = 1 + ------- | |
127 | * R - r | |
128 | * r*R1(r) | |
129 | * = 1 + r + ----------- (for better accuracy) | |
130 | * 2 - R1(r) | |
131 | * where | |
132 | * 2 4 10 | |
133 | * R1(r) = r - (P1*r + P2*r + ... + P5*r ). | |
134 | * | |
135 | * 3. Scale back to obtain exp(x): | |
136 | * From step 1, we have | |
137 | * exp(x) = 2^k * exp(r) | |
138 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
139 | hypot | |
140 | ~~~~~ | |
141 | * If (assume round-to-nearest) z=x*x+y*y | |
142 | * has error less than sqrt(2)/2 ulp, than | |
143 | * sqrt(z) has error less than 1 ulp (exercise). | |
144 | * | |
145 | * So, compute sqrt(x*x+y*y) with some care as | |
146 | * follows to get the error below 1 ulp: | |
147 | * | |
148 | * Assume x>y>0; | |
149 | * (if possible, set rounding to round-to-nearest) | |
150 | * 1. if x > 2y use | |
151 | * x1*x1+(y*y+(x2*(x+x1))) for x*x+y*y | |
152 | * where x1 = x with lower 32 bits cleared, x2 = x-x1; else | |
153 | * 2. if x <= 2y use | |
154 | * t1*y1+((x-y)*(x-y)+(t1*y2+t2*y)) | |
155 | * where t1 = 2x with lower 32 bits cleared, t2 = 2x-t1, | |
156 | * y1= y with lower 32 bits chopped, y2 = y-y1. | |
157 | * | |
158 | * NOTE: scaling may be necessary if some argument is too | |
159 | * large or too tiny | |
160 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
161 | j0/y0 | |
162 | ~~~~~ | |
163 | * Method -- j0(x): | |
164 | * 1. For tiny x, we use j0(x) = 1 - x^2/4 + x^4/64 - ... | |
165 | * 2. Reduce x to |x| since j0(x)=j0(-x), and | |
166 | * for x in (0,2) | |
167 | * j0(x) = 1-z/4+ z^2*R0/S0, where z = x*x; | |
168 | * (precision: |j0-1+z/4-z^2R0/S0 |<2**-63.67 ) | |
169 | * for x in (2,inf) | |
170 | * j0(x) = sqrt(2/(pi*x))*(p0(x)*cos(x0)-q0(x)*sin(x0)) | |
171 | * where x0 = x-pi/4. It is better to compute sin(x0),cos(x0) | |
172 | * as follow: | |
173 | * cos(x0) = cos(x)cos(pi/4)+sin(x)sin(pi/4) | |
174 | * = 1/sqrt(2) * (cos(x) + sin(x)) | |
175 | * sin(x0) = sin(x)cos(pi/4)-cos(x)sin(pi/4) | |
176 | * = 1/sqrt(2) * (sin(x) - cos(x)) | |
177 | * (To avoid cancellation, use | |
178 | * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) | |
179 | * to compute the worse one.) | |
180 | * | |
181 | * Method -- y0(x): | |
182 | * 1. For x<2. | |
183 | * Since | |
184 | * y0(x) = 2/pi*(j0(x)*(ln(x/2)+Euler) + x^2/4 - ...) | |
185 | * therefore y0(x)-2/pi*j0(x)*ln(x) is an even function. | |
186 | * We use the following function to approximate y0, | |
187 | * y0(x) = U(z)/V(z) + (2/pi)*(j0(x)*ln(x)), z= x^2 | |
188 | * where | |
189 | * U(z) = u00 + u01*z + ... + u06*z^6 | |
190 | * V(z) = 1 + v01*z + ... + v04*z^4 | |
191 | * with absolute approximation error bounded by 2**-72. | |
192 | * Note: For tiny x, U/V = u0 and j0(x)~1, hence | |
193 | * y0(tiny) = u0 + (2/pi)*ln(tiny), (choose tiny<2**-27) | |
194 | * 2. For x>=2. | |
195 | * y0(x) = sqrt(2/(pi*x))*(p0(x)*cos(x0)+q0(x)*sin(x0)) | |
196 | * where x0 = x-pi/4. It is better to compute sin(x0),cos(x0) | |
197 | * by the method mentioned above. | |
198 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
199 | j1/y1 | |
200 | ~~~~~ | |
201 | * Method -- j1(x): | |
202 | * 1. For tiny x, we use j1(x) = x/2 - x^3/16 + x^5/384 - ... | |
203 | * 2. Reduce x to |x| since j1(x)=-j1(-x), and | |
204 | * for x in (0,2) | |
205 | * j1(x) = x/2 + x*z*R0/S0, where z = x*x; | |
206 | * (precision: |j1/x - 1/2 - R0/S0 |<2**-61.51 ) | |
207 | * for x in (2,inf) | |
208 | * j1(x) = sqrt(2/(pi*x))*(p1(x)*cos(x1)-q1(x)*sin(x1)) | |
209 | * y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x1)+q1(x)*cos(x1)) | |
210 | * where x1 = x-3*pi/4. It is better to compute sin(x1),cos(x1) | |
211 | * as follow: | |
212 | * cos(x1) = cos(x)cos(3pi/4)+sin(x)sin(3pi/4) | |
213 | * = 1/sqrt(2) * (sin(x) - cos(x)) | |
214 | * sin(x1) = sin(x)cos(3pi/4)-cos(x)sin(3pi/4) | |
215 | * = -1/sqrt(2) * (sin(x) + cos(x)) | |
216 | * (To avoid cancellation, use | |
217 | * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) | |
218 | * to compute the worse one.) | |
219 | * | |
220 | * Method -- y1(x): | |
221 | * 1. screen out x<=0 cases: y1(0)=-inf, y1(x<0)=NaN | |
222 | * 2. For x<2. | |
223 | * Since | |
224 | * y1(x) = 2/pi*(j1(x)*(ln(x/2)+Euler)-1/x-x/2+5/64*x^3-...) | |
225 | * therefore y1(x)-2/pi*j1(x)*ln(x)-1/x is an odd function. | |
226 | * We use the following function to approximate y1, | |
227 | * y1(x) = x*U(z)/V(z) + (2/pi)*(j1(x)*ln(x)-1/x), z= x^2 | |
228 | * where for x in [0,2] (abs err less than 2**-65.89) | |
229 | * U(z) = U0[0] + U0[1]*z + ... + U0[4]*z^4 | |
230 | * V(z) = 1 + v0[0]*z + ... + v0[4]*z^5 | |
231 | * Note: For tiny x, 1/x dominate y1 and hence | |
232 | * y1(tiny) = -2/pi/tiny, (choose tiny<2**-54) | |
233 | * 3. For x>=2. | |
234 | * y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x1)+q1(x)*cos(x1)) | |
235 | * where x1 = x-3*pi/4. It is better to compute sin(x1),cos(x1) | |
236 | * by method mentioned above. | |
237 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
238 | jn/yn | |
239 | ~~~~~ | |
240 | * Note 2. About jn(n,x), yn(n,x) | |
241 | * For n=0, j0(x) is called, | |
242 | * for n=1, j1(x) is called, | |
243 | * for n<x, forward recursion us used starting | |
244 | * from values of j0(x) and j1(x). | |
245 | * for n>x, a continued fraction approximation to | |
246 | * j(n,x)/j(n-1,x) is evaluated and then backward | |
247 | * recursion is used starting from a supposed value | |
248 | * for j(n,x). The resulting value of j(0,x) is | |
249 | * compared with the actual value to correct the | |
250 | * supposed value of j(n,x). | |
251 | * | |
252 | * yn(n,x) is similar in all respects, except | |
253 | * that forward recursion is used for all | |
254 | * values of n>1. | |
255 | ||
256 | jn: | |
257 | /* (x >> n**2) | |
258 | * Jn(x) = cos(x-(2n+1)*pi/4)*sqrt(2/x*pi) | |
259 | * Yn(x) = sin(x-(2n+1)*pi/4)*sqrt(2/x*pi) | |
260 | * Let s=sin(x), c=cos(x), | |
261 | * xn=x-(2n+1)*pi/4, sqt2 = sqrt(2),then | |
262 | * | |
263 | * n sin(xn)*sqt2 cos(xn)*sqt2 | |
264 | * ---------------------------------- | |
265 | * 0 s-c c+s | |
266 | * 1 -s-c -c+s | |
267 | * 2 -s+c -c-s | |
268 | * 3 s+c c-s | |
269 | ... | |
270 | /* x is tiny, return the first Taylor expansion of J(n,x) | |
271 | * J(n,x) = 1/n!*(x/2)^n - ... | |
272 | ... | |
273 | /* use backward recurrence */ | |
274 | /* x x^2 x^2 | |
275 | * J(n,x)/J(n-1,x) = ---- ------ ------ ..... | |
276 | * 2n - 2(n+1) - 2(n+2) | |
277 | * | |
278 | * 1 1 1 | |
279 | * (for large x) = ---- ------ ------ ..... | |
280 | * 2n 2(n+1) 2(n+2) | |
281 | * -- - ------ - ------ - | |
282 | * x x x | |
283 | * | |
284 | * Let w = 2n/x and h=2/x, then the above quotient | |
285 | * is equal to the continued fraction: | |
286 | * 1 | |
287 | * = ----------------------- | |
288 | * 1 | |
289 | * w - ----------------- | |
290 | * 1 | |
291 | * w+h - --------- | |
292 | * w+2h - ... | |
293 | * | |
294 | * To determine how many terms needed, let | |
295 | * Q(0) = w, Q(1) = w(w+h) - 1, | |
296 | * Q(k) = (w+k*h)*Q(k-1) - Q(k-2), | |
297 | * When Q(k) > 1e4 good for single | |
298 | * When Q(k) > 1e9 good for double | |
299 | * When Q(k) > 1e17 good for quadruple | |
300 | ||
301 | ... | |
302 | /* estimate log((2/x)^n*n!) = n*log(2/x)+n*ln(n) | |
303 | * Hence, if n*(log(2n/x)) > ... | |
304 | * single 8.8722839355e+01 | |
305 | * double 7.09782712893383973096e+02 | |
306 | * long double 1.1356523406294143949491931077970765006170e+04 | |
307 | * then recurrent value may overflow and the result is | |
308 | * likely underflow to zero | |
309 | ||
310 | yn: | |
311 | /* (x >> n**2) | |
312 | * Jn(x) = cos(x-(2n+1)*pi/4)*sqrt(2/x*pi) | |
313 | * Yn(x) = sin(x-(2n+1)*pi/4)*sqrt(2/x*pi) | |
314 | * Let s=sin(x), c=cos(x), | |
315 | * xn=x-(2n+1)*pi/4, sqt2 = sqrt(2),then | |
316 | * | |
317 | * n sin(xn)*sqt2 cos(xn)*sqt2 | |
318 | * ---------------------------------- | |
319 | * 0 s-c c+s | |
320 | * 1 -s-c -c+s | |
321 | * 2 -s+c -c-s | |
322 | * 3 s+c c-s | |
323 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
324 | lgamma | |
325 | ~~~~~~ | |
326 | * Method: | |
327 | * 1. Argument Reduction for 0 < x <= 8 | |
328 | * Since gamma(1+s)=s*gamma(s), for x in [0,8], we may | |
329 | * reduce x to a number in [1.5,2.5] by | |
330 | * lgamma(1+s) = log(s) + lgamma(s) | |
331 | * for example, | |
332 | * lgamma(7.3) = log(6.3) + lgamma(6.3) | |
333 | * = log(6.3*5.3) + lgamma(5.3) | |
334 | * = log(6.3*5.3*4.3*3.3*2.3) + lgamma(2.3) | |
335 | * 2. Polynomial approximation of lgamma around its | |
336 | * minimun ymin=1.461632144968362245 to maintain monotonicity. | |
337 | * On [ymin-0.23, ymin+0.27] (i.e., [1.23164,1.73163]), use | |
338 | * Let z = x-ymin; | |
339 | * lgamma(x) = -1.214862905358496078218 + z^2*poly(z) | |
340 | * where | |
341 | * poly(z) is a 14 degree polynomial. | |
342 | * 2. Rational approximation in the primary interval [2,3] | |
343 | * We use the following approximation: | |
344 | * s = x-2.0; | |
345 | * lgamma(x) = 0.5*s + s*P(s)/Q(s) | |
346 | * with accuracy | |
347 | * |P/Q - (lgamma(x)-0.5s)| < 2**-61.71 | |
348 | * Our algorithms are based on the following observation | |
349 | * | |
350 | * zeta(2)-1 2 zeta(3)-1 3 | |
351 | * lgamma(2+s) = s*(1-Euler) + --------- * s - --------- * s + ... | |
352 | * 2 3 | |
353 | * | |
354 | * where Euler = 0.5771... is the Euler constant, which is very | |
355 | * close to 0.5. | |
356 | * | |
357 | * 3. For x>=8, we have | |
358 | * lgamma(x)~(x-0.5)log(x)-x+0.5*log(2pi)+1/(12x)-1/(360x**3)+.... | |
359 | * (better formula: | |
360 | * lgamma(x)~(x-0.5)*(log(x)-1)-.5*(log(2pi)-1) + ...) | |
361 | * Let z = 1/x, then we approximation | |
362 | * f(z) = lgamma(x) - (x-0.5)(log(x)-1) | |
363 | * by | |
364 | * 3 5 11 | |
365 | * w = w0 + w1*z + w2*z + w3*z + ... + w6*z | |
366 | * where | |
367 | * |w - f(z)| < 2**-58.74 | |
368 | * | |
369 | * 4. For negative x, since (G is gamma function) | |
370 | * -x*G(-x)*G(x) = pi/sin(pi*x), | |
371 | * we have | |
372 | * G(x) = pi/(sin(pi*x)*(-x)*G(-x)) | |
373 | * since G(-x) is positive, sign(G(x)) = sign(sin(pi*x)) for x<0 | |
374 | * Hence, for x<0, signgam = sign(sin(pi*x)) and | |
375 | * lgamma(x) = log(|Gamma(x)|) | |
376 | * = log(pi/(|x*sin(pi*x)|)) - lgamma(-x); | |
377 | * Note: one should avoid compute pi*(-x) directly in the | |
378 | * computation of sin(pi*(-x)). | |
379 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
380 | log | |
381 | ~~~ | |
382 | * Method : | |
383 | * 1. Argument Reduction: find k and f such that | |
384 | * x = 2^k * (1+f), | |
385 | * where sqrt(2)/2 < 1+f < sqrt(2) . | |
386 | * | |
387 | * 2. Approximation of log(1+f). | |
388 | * Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s) | |
389 | * = 2s + 2/3 s**3 + 2/5 s**5 + ....., | |
390 | * = 2s + s*R | |
391 | * We use a special Reme algorithm on [0,0.1716] to generate | |
392 | * a polynomial of degree 14 to approximate R The maximum error | |
393 | * of this polynomial approximation is bounded by 2**-58.45. In | |
394 | * other words, | |
395 | * 2 4 6 8 10 12 14 | |
396 | * R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s +Lg5*s +Lg6*s +Lg7*s | |
397 | * (the values of Lg1 to Lg7 are listed in the program) | |
398 | * and | |
399 | * | 2 14 | -58.45 | |
400 | * | Lg1*s +...+Lg7*s - R(z) | <= 2 | |
401 | * | | | |
402 | * Note that 2s = f - s*f = f - hfsq + s*hfsq, where hfsq = f*f/2. | |
403 | * In order to guarantee error in log below 1ulp, we compute log | |
404 | * by | |
405 | * log(1+f) = f - s*(f - R) (if f is not too large) | |
406 | * log(1+f) = f - (hfsq - s*(hfsq+R)). (better accuracy) | |
407 | * | |
408 | * 3. Finally, log(x) = k*ln2 + log(1+f). | |
409 | * = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo))) | |
410 | * Here ln2 is split into two floating point number: | |
411 | * ln2_hi + ln2_lo, | |
412 | * where n*ln2_hi is always exact for |n| < 2000. | |
413 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
414 | log10 | |
415 | ~~~~~ | |
416 | * Method : | |
417 | * Let log10_2hi = leading 40 bits of log10(2) and | |
418 | * log10_2lo = log10(2) - log10_2hi, | |
419 | * ivln10 = 1/log(10) rounded. | |
420 | * Then | |
421 | * n = ilogb(x), | |
422 | * if(n<0) n = n+1; | |
423 | * x = scalbn(x,-n); | |
424 | * log10(x) := n*log10_2hi + (n*log10_2lo + ivln10*log(x)) | |
425 | * | |
426 | * Note 1: | |
427 | * To guarantee log10(10**n)=n, where 10**n is normal, the rounding | |
428 | * mode must set to Round-to-Nearest. | |
429 | * Note 2: | |
430 | * [1/log(10)] rounded to 53 bits has error .198 ulps; | |
431 | * log10 is monotonic at all binary break points. | |
432 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
433 | pow | |
434 | ~~~ | |
435 | * Method: Let x = 2 * (1+f) | |
436 | * 1. Compute and return log2(x) in two pieces: | |
437 | * log2(x) = w1 + w2, | |
438 | * where w1 has 53-24 = 29 bit trailing zeros. | |
439 | * 2. Perform y*log2(x) = n+y' by simulating muti-precision | |
440 | * arithmetic, where |y'|<=0.5. | |
441 | * 3. Return x**y = 2**n*exp(y'*log2) | |
442 | * | |
443 | * Special cases: | |
444 | * 1. (anything) ** 0 is 1 | |
445 | * 2. (anything) ** 1 is itself | |
446 | * 3. (anything) ** NAN is NAN | |
447 | * 4. NAN ** (anything except 0) is NAN | |
448 | * 5. +-(|x| > 1) ** +INF is +INF | |
449 | * 6. +-(|x| > 1) ** -INF is +0 | |
450 | * 7. +-(|x| < 1) ** +INF is +0 | |
451 | * 8. +-(|x| < 1) ** -INF is +INF | |
452 | * 9. +-1 ** +-INF is NAN | |
453 | * 10. +0 ** (+anything except 0, NAN) is +0 | |
454 | * 11. -0 ** (+anything except 0, NAN, odd integer) is +0 | |
455 | * 12. +0 ** (-anything except 0, NAN) is +INF | |
456 | * 13. -0 ** (-anything except 0, NAN, odd integer) is +INF | |
457 | * 14. -0 ** (odd integer) = -( +0 ** (odd integer) ) | |
458 | * 15. +INF ** (+anything except 0,NAN) is +INF | |
459 | * 16. +INF ** (-anything except 0,NAN) is +0 | |
460 | * 17. -INF ** (anything) = -0 ** (-anything) | |
461 | * 18. (-anything) ** (integer) is (-1)**(integer)*(+anything**integer) | |
462 | * 19. (-anything except 0 and inf) ** (non-integer) is NAN | |
463 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
464 | rem_pio2 return the remainder of x rem pi/2 in y[0]+y[1] | |
465 | ~~~~~~~~ | |
466 | This is one of the basic functions which is written with highest accuracy | |
467 | in mind. | |
468 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
469 | sinh | |
470 | ~~~~ | |
471 | * Method : | |
472 | * mathematically sinh(x) if defined to be (exp(x)-exp(-x))/2 | |
473 | * 1. Replace x by |x| (sinh(-x) = -sinh(x)). | |
474 | * 2. | |
475 | * E + E/(E+1) | |
476 | * 0 <= x <= 22 : sinh(x) := --------------, E=expm1(x) | |
477 | * 2 | |
478 | * | |
479 | * 22 <= x <= lnovft : sinh(x) := exp(x)/2 | |
480 | * lnovft <= x <= ln2ovft: sinh(x) := exp(x/2)/2 * exp(x/2) | |
481 | * ln2ovft < x : sinh(x) := x*shuge (overflow) | |
482 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
483 | sqrt | |
484 | ~~~~ | |
485 | * Method: | |
486 | * Bit by bit method using integer arithmetic. (Slow, but portable) | |
487 | * 1. Normalization | |
488 | * Scale x to y in [1,4) with even powers of 2: | |
489 | * find an integer k such that 1 <= (y=x*2^(2k)) < 4, then | |
490 | * sqrt(x) = 2^k * sqrt(y) | |
491 | * 2. Bit by bit computation | |
492 | * Let q = sqrt(y) truncated to i bit after binary point (q = 1), | |
493 | * i 0 | |
494 | * i+1 2 | |
495 | * s = 2*q , and y = 2 * ( y - q ). (1) | |
496 | * i i i i | |
497 | * | |
498 | * To compute q from q , one checks whether | |
499 | * i+1 i | |
500 | * | |
501 | * -(i+1) 2 | |
502 | * (q + 2 ) <= y. (2) | |
503 | * i | |
504 | * -(i+1) | |
505 | * If (2) is false, then q = q ; otherwise q = q + 2 . | |
506 | * i+1 i i+1 i | |
507 | * | |
508 | * With some algebric manipulation, it is not difficult to see | |
509 | * that (2) is equivalent to | |
510 | * -(i+1) | |
511 | * s + 2 <= y (3) | |
512 | * i i | |
513 | * | |
514 | * The advantage of (3) is that s and y can be computed by | |
515 | * i i | |
516 | * the following recurrence formula: | |
517 | * if (3) is false | |
518 | * | |
519 | * s = s , y = y ; (4) | |
520 | * i+1 i i+1 i | |
521 | * | |
522 | * otherwise, | |
523 | * -i -(i+1) | |
524 | * s = s + 2 , y = y - s - 2 (5) | |
525 | * i+1 i i+1 i i | |
526 | * | |
527 | * One may easily use induction to prove (4) and (5). | |
528 | * Note. Since the left hand side of (3) contain only i+2 bits, | |
529 | * it does not necessary to do a full (53-bit) comparison | |
530 | * in (3). | |
531 | * 3. Final rounding | |
532 | * After generating the 53 bits result, we compute one more bit. | |
533 | * Together with the remainder, we can decide whether the | |
534 | * result is exact, bigger than 1/2ulp, or less than 1/2ulp | |
535 | * (it will never equal to 1/2ulp). | |
536 | * The rounding mode can be detected by checking whether | |
537 | * huge + tiny is equal to huge, and whether huge - tiny is | |
538 | * equal to huge for some floating point number "huge" and "tiny". | |
539 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
540 | cos | |
541 | ~~~ | |
542 | * kernel cos function on [-pi/4, pi/4], pi/4 ~ 0.785398164 | |
543 | * Input x is assumed to be bounded by ~pi/4 in magnitude. | |
544 | * Input y is the tail of x. | |
545 | * | |
546 | * Algorithm | |
547 | * 1. Since cos(-x) = cos(x), we need only to consider positive x. | |
548 | * 2. if x < 2^-27 (hx<0x3e400000 0), return 1 with inexact if x!=0. | |
549 | * 3. cos(x) is approximated by a polynomial of degree 14 on | |
550 | * [0,pi/4] | |
551 | * 4 14 | |
552 | * cos(x) ~ 1 - x*x/2 + C1*x + ... + C6*x | |
553 | * where the remez error is | |
554 | * | |
555 | * | 2 4 6 8 10 12 14 | -58 | |
556 | * |cos(x)-(1-.5*x +C1*x +C2*x +C3*x +C4*x +C5*x +C6*x )| <= 2 | |
557 | * | | | |
558 | * | |
559 | * 4 6 8 10 12 14 | |
560 | * 4. let r = C1*x +C2*x +C3*x +C4*x +C5*x +C6*x , then | |
561 | * cos(x) = 1 - x*x/2 + r | |
562 | * since cos(x+y) ~ cos(x) - sin(x)*y | |
563 | * ~ cos(x) - x*y, | |
564 | * a correction term is necessary in cos(x) and hence | |
565 | * cos(x+y) = 1 - (x*x/2 - (r - x*y)) | |
566 | * For better accuracy when x > 0.3, let qx = |x|/4 with | |
567 | * the last 32 bits mask off, and if x > 0.78125, let qx = 0.28125. | |
568 | * Then | |
569 | * cos(x+y) = (1-qx) - ((x*x/2-qx) - (r-x*y)). | |
570 | * Note that 1-qx and (x*x/2-qx) is EXACT here, and the | |
571 | * magnitude of the latter is at least a quarter of x*x/2, | |
572 | * thus, reducing the rounding error in the subtraction. | |
573 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
574 | sin | |
575 | ~~~ | |
576 | * kernel sin function on [-pi/4, pi/4], pi/4 ~ 0.7854 | |
577 | * Input x is assumed to be bounded by ~pi/4 in magnitude. | |
578 | * Input y is the tail of x. | |
579 | * Input iy indicates whether y is 0. (if iy=0, y assume to be 0). | |
580 | * | |
581 | * Algorithm | |
582 | * 1. Since sin(-x) = -sin(x), we need only to consider positive x. | |
583 | * 2. if x < 2^-27 (hx<0x3e400000 0), return x with inexact if x!=0. | |
584 | * 3. sin(x) is approximated by a polynomial of degree 13 on | |
585 | * [0,pi/4] | |
586 | * 3 13 | |
587 | * sin(x) ~ x + S1*x + ... + S6*x | |
588 | * where | |
589 | * | |
590 | * |sin(x) 2 4 6 8 10 12 | -58 | |
591 | * |----- - (1+S1*x +S2*x +S3*x +S4*x +S5*x +S6*x )| <= 2 | |
592 | * | x | | |
593 | * | |
594 | * 4. sin(x+y) = sin(x) + sin'(x')*y | |
595 | * ~ sin(x) + (1-x*x/2)*y | |
596 | * For better accuracy, let | |
597 | * 3 2 2 2 2 | |
598 | * r = x *(S2+x *(S3+x *(S4+x *(S5+x *S6)))) | |
599 | * then 3 2 | |
600 | * sin(x) = x + (S1*x + (x *(r-y/2)+y)) | |
601 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
602 | tan | |
603 | ~~~ | |
604 | * kernel tan function on [-pi/4, pi/4], pi/4 ~ 0.7854 | |
605 | * Input x is assumed to be bounded by ~pi/4 in magnitude. | |
606 | * Input y is the tail of x. | |
607 | * Input k indicates whether tan (if k=1) or | |
608 | * -1/tan (if k= -1) is returned. | |
609 | * | |
610 | * Algorithm | |
611 | * 1. Since tan(-x) = -tan(x), we need only to consider positive x. | |
612 | * 2. if x < 2^-28 (hx<0x3e300000 0), return x with inexact if x!=0. | |
613 | * 3. tan(x) is approximated by a odd polynomial of degree 27 on | |
614 | * [0,0.67434] | |
615 | * 3 27 | |
616 | * tan(x) ~ x + T1*x + ... + T13*x | |
617 | * where | |
618 | * | |
619 | * |tan(x) 2 4 26 | -59.2 | |
620 | * |----- - (1+T1*x +T2*x +.... +T13*x )| <= 2 | |
621 | * | x | | |
622 | * | |
623 | * Note: tan(x+y) = tan(x) + tan'(x)*y | |
624 | * ~ tan(x) + (1+x*x)*y | |
625 | * Therefore, for better accuracy in computing tan(x+y), let | |
626 | * 3 2 2 2 2 | |
627 | * r = x *(T2+x *(T3+x *(...+x *(T12+x *T13)))) | |
628 | * then | |
629 | * 3 2 | |
630 | * tan(x+y) = x + (T1*x + (x *(r+y)+y)) | |
631 | * | |
632 | * 4. For x in [0.67434,pi/4], let y = pi/4 - x, then | |
633 | * tan(x) = tan(pi/4-y) = (1-tan(y))/(1+tan(y)) | |
634 | * = 1 - 2*(tan(y) - (tan(y)^2)/(1+tan(y))) | |
635 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
636 | atan | |
637 | ~~~~ | |
638 | * Method | |
639 | * 1. Reduce x to positive by atan(x) = -atan(-x). | |
640 | * 2. According to the integer k=4t+0.25 chopped, t=x, the argument | |
641 | * is further reduced to one of the following intervals and the | |
642 | * arctangent of t is evaluated by the corresponding formula: | |
643 | * | |
644 | * [0,7/16] atan(x) = t-t^3*(a1+t^2*(a2+...(a10+t^2*a11)...) | |
645 | * [7/16,11/16] atan(x) = atan(1/2) + atan( (t-0.5)/(1+t/2) ) | |
646 | * [11/16.19/16] atan(x) = atan( 1 ) + atan( (t-1)/(1+t) ) | |
647 | * [19/16,39/16] atan(x) = atan(3/2) + atan( (t-1.5)/(1+1.5t) ) | |
648 | * [39/16,INF] atan(x) = atan(INF) + atan( -1/t ) | |
649 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
650 | erf | |
651 | ~~~ | |
652 | * x | |
653 | * 2 |\ | |
654 | * erf(x) = --------- | exp(-t*t)dt | |
655 | * sqrt(pi) \| | |
656 | * 0 | |
657 | * | |
658 | * erfc(x) = 1-erf(x) | |
659 | * Note that | |
660 | * erf(-x) = -erf(x) | |
661 | * erfc(-x) = 2 - erfc(x) | |
662 | * | |
663 | * Method: | |
664 | * 1. For |x| in [0, 0.84375] | |
665 | * erf(x) = x + x*R(x^2) | |
666 | * erfc(x) = 1 - erf(x) if x in [-.84375,0.25] | |
667 | * = 0.5 + ((0.5-x)-x*R) if x in [0.25,0.84375] | |
668 | * where R = P/Q where P is an odd poly of degree 8 and | |
669 | * Q is an odd poly of degree 10. | |
670 | * -57.90 | |
671 | * | R - (erf(x)-x)/x | <= 2 | |
672 | * | |
673 | * | |
674 | * Remark. The formula is derived by noting | |
675 | * erf(x) = (2/sqrt(pi))*(x - x^3/3 + x^5/10 - x^7/42 + ....) | |
676 | * and that | |
677 | * 2/sqrt(pi) = 1.128379167095512573896158903121545171688 | |
678 | * is close to one. The interval is chosen because the fix | |
679 | * point of erf(x) is near 0.6174 (i.e., erf(x)=x when x is | |
680 | * near 0.6174), and by some experiment, 0.84375 is chosen to | |
681 | * guarantee the error is less than one ulp for erf. | |
682 | * | |
683 | * 2. For |x| in [0.84375,1.25], let s = |x| - 1, and | |
684 | * c = 0.84506291151 rounded to single (24 bits) | |
685 | * erf(x) = sign(x) * (c + P1(s)/Q1(s)) | |
686 | * erfc(x) = (1-c) - P1(s)/Q1(s) if x > 0 | |
687 | * 1+(c+P1(s)/Q1(s)) if x < 0 | |
688 | * |P1/Q1 - (erf(|x|)-c)| <= 2**-59.06 | |
689 | * Remark: here we use the taylor series expansion at x=1. | |
690 | * erf(1+s) = erf(1) + s*Poly(s) | |
691 | * = 0.845.. + P1(s)/Q1(s) | |
692 | * That is, we use rational approximation to approximate | |
693 | * erf(1+s) - (c = (single)0.84506291151) | |
694 | * Note that |P1/Q1|< 0.078 for x in [0.84375,1.25] | |
695 | * where | |
696 | * P1(s) = degree 6 poly in s | |
697 | * Q1(s) = degree 6 poly in s | |
698 | * | |
699 | * 3. For x in [1.25,1/0.35(~2.857143)], | |
700 | * erfc(x) = (1/x)*exp(-x*x-0.5625+R1/S1) | |
701 | * erf(x) = 1 - erfc(x) | |
702 | * where | |
703 | * R1(z) = degree 7 poly in z, (z=1/x^2) | |
704 | * S1(z) = degree 8 poly in z | |
705 | * | |
706 | * 4. For x in [1/0.35,28] | |
707 | * erfc(x) = (1/x)*exp(-x*x-0.5625+R2/S2) if x > 0 | |
708 | * = 2.0 - (1/x)*exp(-x*x-0.5625+R2/S2) if -6<x<0 | |
709 | * = 2.0 - tiny (if x <= -6) | |
710 | * erf(x) = sign(x)*(1.0 - erfc(x)) if x < 6, else | |
711 | * erf(x) = sign(x)*(1.0 - tiny) | |
712 | * where | |
713 | * R2(z) = degree 6 poly in z, (z=1/x^2) | |
714 | * S2(z) = degree 7 poly in z | |
715 | * | |
716 | * Note1: | |
717 | * To compute exp(-x*x-0.5625+R/S), let s be a single | |
718 | * precision number and s := x; then | |
719 | * -x*x = -s*s + (s-x)*(s+x) | |
720 | * exp(-x*x-0.5626+R/S) = | |
721 | * exp(-s*s-0.5625)*exp((s-x)*(s+x)+R/S); | |
722 | * Note2: | |
723 | * Here 4 and 5 make use of the asymptotic series | |
724 | * exp(-x*x) | |
725 | * erfc(x) ~ ---------- * ( 1 + Poly(1/x^2) ) | |
726 | * x*sqrt(pi) | |
727 | * We use rational approximation to approximate | |
728 | * g(s)=f(1/x^2) = log(erfc(x)*x) - x*x + 0.5625 | |
729 | * Here is the error bound for R1/S1 and R2/S2 | |
730 | * |R1/S1 - f(x)| < 2**(-62.57) | |
731 | * |R2/S2 - f(x)| < 2**(-61.52) | |
732 | * | |
733 | * 5. For inf > x >= 28 | |
734 | * erf(x) = sign(x) *(1 - tiny) (raise inexact) | |
735 | * erfc(x) = tiny*tiny (raise underflow) if x > 0 | |
736 | * = 2 - tiny if x<0 | |
737 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
738 | expm1 Returns exp(x)-1, the exponential of x minus 1 | |
739 | ~~~~~ | |
740 | * Method | |
741 | * 1. Argument reduction: | |
742 | * Given x, find r and integer k such that | |
743 | * | |
744 | * x = k*ln2 + r, |r| <= 0.5*ln2 ~ 0.34658 | |
745 | * | |
746 | * Here a correction term c will be computed to compensate | |
747 | * the error in r when rounded to a floating-point number. | |
748 | * | |
749 | * 2. Approximating expm1(r) by a special rational function on | |
750 | * the interval [0,0.34658]: | |
751 | * Since | |
752 | * r*(exp(r)+1)/(exp(r)-1) = 2+ r^2/6 - r^4/360 + ... | |
753 | * we define R1(r*r) by | |
754 | * r*(exp(r)+1)/(exp(r)-1) = 2+ r^2/6 * R1(r*r) | |
755 | * That is, | |
756 | * R1(r**2) = 6/r *((exp(r)+1)/(exp(r)-1) - 2/r) | |
757 | * = 6/r * ( 1 + 2.0*(1/(exp(r)-1) - 1/r)) | |
758 | * = 1 - r^2/60 + r^4/2520 - r^6/100800 + ... | |
759 | * We use a special Reme algorithm on [0,0.347] to generate | |
760 | * a polynomial of degree 5 in r*r to approximate R1. The | |
761 | * maximum error of this polynomial approximation is bounded | |
762 | * by 2**-61. In other words, | |
763 | * R1(z) ~ 1.0 + Q1*z + Q2*z**2 + Q3*z**3 + Q4*z**4 + Q5*z**5 | |
764 | * where Q1 = -1.6666666666666567384E-2, | |
765 | * Q2 = 3.9682539681370365873E-4, | |
766 | * Q3 = -9.9206344733435987357E-6, | |
767 | * Q4 = 2.5051361420808517002E-7, | |
768 | * Q5 = -6.2843505682382617102E-9; | |
769 | * (where z=r*r, and the values of Q1 to Q5 are listed below) | |
770 | * with error bounded by | |
771 | * | 5 | -61 | |
772 | * | 1.0+Q1*z+...+Q5*z - R1(z) | <= 2 | |
773 | * | | | |
774 | * | |
775 | * expm1(r) = exp(r)-1 is then computed by the following | |
776 | * specific way which minimize the accumulation rounding error: | |
777 | * 2 3 | |
778 | * r r [ 3 - (R1 + R1*r/2) ] | |
779 | * expm1(r) = r + --- + --- * [--------------------] | |
780 | * 2 2 [ 6 - r*(3 - R1*r/2) ] | |
781 | * | |
782 | * To compensate the error in the argument reduction, we use | |
783 | * expm1(r+c) = expm1(r) + c + expm1(r)*c | |
784 | * ~ expm1(r) + c + r*c | |
785 | * Thus c+r*c will be added in as the correction terms for | |
786 | * expm1(r+c). Now rearrange the term to avoid optimization | |
787 | * screw up: | |
788 | * ( 2 2 ) | |
789 | * ({ ( r [ R1 - (3 - R1*r/2) ] ) } r ) | |
790 | * expm1(r+c)~r - ({r*(--- * [--------------------]-c)-c} - --- ) | |
791 | * ({ ( 2 [ 6 - r*(3 - R1*r/2) ] ) } 2 ) | |
792 | * ( ) | |
793 | * | |
794 | * = r - E | |
795 | * 3. Scale back to obtain expm1(x): | |
796 | * From step 1, we have | |
797 | * expm1(x) = either 2^k*[expm1(r)+1] - 1 | |
798 | * = or 2^k*[expm1(r) + (1-2^-k)] | |
799 | * 4. Implementation notes: | |
800 | * (A). To save one multiplication, we scale the coefficient Qi | |
801 | * to Qi*2^i, and replace z by (x^2)/2. | |
802 | * (B). To achieve maximum accuracy, we compute expm1(x) by | |
803 | * (i) if x < -56*ln2, return -1.0, (raise inexact if x!=inf) | |
804 | * (ii) if k=0, return r-E | |
805 | * (iii) if k=-1, return 0.5*(r-E)-0.5 | |
806 | * (iv) if k=1 if r < -0.25, return 2*((r+0.5)- E) | |
807 | * else return 1.0+2.0*(r-E); | |
808 | * (v) if (k<-2||k>56) return 2^k(1-(E-r)) - 1 (or exp(x)-1) | |
809 | * (vi) if k <= 20, return 2^k((1-2^-k)-(E-r)), else | |
810 | * (vii) return 2^k(1-((E+2^-k)-r)) | |
811 | * | |
812 | * Special cases: | |
813 | * expm1(INF) is INF, expm1(NaN) is NaN; | |
814 | * expm1(-INF) is -1, and | |
815 | * for finite argument, only expm1(0)=0 is exact. | |
816 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
817 | log1p | |
818 | ~~~~~ | |
819 | * Method : | |
820 | * 1. Argument Reduction: find k and f such that | |
821 | * 1+x = 2^k * (1+f), | |
822 | * where sqrt(2)/2 < 1+f < sqrt(2) . | |
823 | * | |
824 | * Note. If k=0, then f=x is exact. However, if k!=0, then f | |
825 | * may not be representable exactly. In that case, a correction | |
826 | * term is need. Let u=1+x rounded. Let c = (1+x)-u, then | |
827 | * log(1+x) - log(u) ~ c/u. Thus, we proceed to compute log(u), | |
828 | * and add back the correction term c/u. | |
829 | * (Note: when x > 2**53, one can simply return log(x)) | |
830 | * | |
831 | * 2. Approximation of log1p(f). | |
832 | * Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s) | |
833 | * = 2s + 2/3 s**3 + 2/5 s**5 + ....., | |
834 | * = 2s + s*R | |
835 | * We use a special Reme algorithm on [0,0.1716] to generate | |
836 | * a polynomial of degree 14 to approximate R The maximum error | |
837 | * of this polynomial approximation is bounded by 2**-58.45. In | |
838 | * other words, | |
839 | * 2 4 6 8 10 12 14 | |
840 | * R(z) ~ Lp1*s +Lp2*s +Lp3*s +Lp4*s +Lp5*s +Lp6*s +Lp7*s | |
841 | * (the values of Lp1 to Lp7 are listed in the program) | |
842 | * and | |
843 | * | 2 14 | -58.45 | |
844 | * | Lp1*s +...+Lp7*s - R(z) | <= 2 | |
845 | * | | | |
846 | * Note that 2s = f - s*f = f - hfsq + s*hfsq, where hfsq = f*f/2. | |
847 | * In order to guarantee error in log below 1ulp, we compute log | |
848 | * by | |
849 | * log1p(f) = f - (hfsq - s*(hfsq+R)). | |
850 | * | |
851 | * 3. Finally, log1p(x) = k*ln2 + log1p(f). | |
852 | * = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo))) | |
853 | * Here ln2 is split into two floating point number: | |
854 | * ln2_hi + ln2_lo, | |
855 | * where n*ln2_hi is always exact for |n| < 2000. | |
856 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |