This is the mail archive of the mailing list for the glibc project.

[PATCH] improve cexp performance for imaginary inputs

cexp(x) computes exp(x_r) * (cos(x_i) + i sin(x_i)). When the real part of
the input is zero, exp(x_r) is 1, so the call to exp can be skipped.
Purely imaginary inputs are common enough to be worth optimizing, e.g. twiddle
factors in fast Fourier transforms.

Even though the exp function has a fast path for zero input, the call
still adds about 15% overhead to cexp on amd64.
For 10000 uniformly distributed imaginary inputs from -pi to pi, performance
improved from 350 cycles to 300 cycles per call on an AMD Phenom II X4.

For inputs with nonzero real and imaginary parts, the patch adds just one
branch to the 6 existing branches and the sincos computation, so it has no
measurable negative effect on amd64, and likely none on most other CPUs.

Interestingly, the same change has no significant effect in cexpf and
cexpl. I assume this is due to either a faster exp or a slower sincos in
those variants, so this patch changes only the double version.
 math/s_cexp.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/math/s_cexp.c b/math/s_cexp.c
index 9116e2b..718523f 100644
--- a/math/s_cexp.c
+++ b/math/s_cexp.c
@@ -70,9 +70,13 @@ __cexp (__complex__ double x)
-	      double exp_val = __ieee754_exp (__real__ x);
-	      __real__ retval = exp_val * cosix;
-	      __imag__ retval = exp_val * sinix;
+	      double exp_val = 1.;
+	      if (__real__ x != 0.)
+	        {
+	          exp_val = __ieee754_exp (__real__ x);
+	        }
+	      __real__ retval = exp_val * cosix;
+	      __imag__ retval = exp_val * sinix;
 	  if (fabs (__real__ retval) < DBL_MIN)
