This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
[PATCH] Use __sqr instead of __mul wherever possible
- From: Siddhesh Poyarekar <siddhesh at redhat dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 13 Feb 2013 20:20:46 +0530
- Subject: [PATCH] Use __sqr instead of __mul wherever possible
Hi,
As AJ suggested, here's the patch to make callers use __sqr instead of
__mul wherever a number is multiplied by itself, since the former is
faster. I verified that this does not cause any regressions on x86_64.
OK to commit once the __sqr patch is acked for powerpc?
Siddhesh
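For readers wondering why a dedicated squaring routine can beat a general
multiply: in a schoolbook product of a number with itself, each cross term
a[i]*a[j] appears twice, so it can be computed once and doubled. The sketch
below only illustrates that operation count. It is not glibc's __sqr or
__mul; the base-10000 digit layout and the names sketch_mul, sketch_sqr,
SKETCH_BASE and SKETCH_MAX are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: base-10000 digits, least significant first.
   This is NOT glibc's mp_no format.  */
#define SKETCH_BASE 10000u
#define SKETCH_MAX 16

/* Turn column sums into base-10000 digits.  */
static void
sketch_carry (const uint64_t *col, uint32_t *out, size_t len)
{
  uint64_t carry = 0;
  for (size_t k = 0; k < len; k++)
    {
      uint64_t t = col[k] + carry;
      out[k] = (uint32_t) (t % SKETCH_BASE);
      carry = t / SKETCH_BASE;
    }
}

/* General product: out[0..2n-1] = a * b.
   Returns the number of digit multiplications, n * n.  */
size_t
sketch_mul (const uint32_t *a, const uint32_t *b, uint32_t *out, size_t n)
{
  uint64_t col[2 * SKETCH_MAX] = { 0 };
  size_t muls = 0;
  assert (n <= SKETCH_MAX);
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        col[i + j] += (uint64_t) a[i] * b[j];
        muls++;
      }
  sketch_carry (col, out, 2 * n);
  return muls;
}

/* Square: out[0..2n-1] = a * a.  Each cross term a[i]*a[j] with i < j
   is computed once and doubled.
   Returns the number of digit multiplications, n * (n + 1) / 2.  */
size_t
sketch_sqr (const uint32_t *a, uint32_t *out, size_t n)
{
  uint64_t col[2 * SKETCH_MAX] = { 0 };
  size_t muls = 0;
  assert (n <= SKETCH_MAX);
  for (size_t i = 0; i < n; i++)
    {
      col[2 * i] += (uint64_t) a[i] * a[i];
      muls++;
      for (size_t j = i + 1; j < n; j++)
        {
          col[i + j] += 2 * (uint64_t) a[i] * a[j];
          muls++;
        }
    }
  sketch_carry (col, out, 2 * n);
  return muls;
}
```

For a three-digit input the general routine performs 9 digit
multiplications and the squaring routine 6, with identical results. The
call sites in this patch change in the same way as the sketch's interface:
the duplicated operand is dropped, so __mul(x,x,&y,p) becomes __sqr(x,&y,p).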
* sysdeps/ieee754/dbl-64/mpatan.c (__mpatan): Use __sqr
instead of __mul.
* sysdeps/ieee754/dbl-64/mpsqrt.c (__mpsqrt): Likewise.
* sysdeps/ieee754/dbl-64/sincos32.c (ss32): Likewise.
(cc32): Likewise.
diff --git a/sysdeps/ieee754/dbl-64/mpatan.c b/sysdeps/ieee754/dbl-64/mpatan.c
index db58680..0f5a24a 100644
--- a/sysdeps/ieee754/dbl-64/mpatan.c
+++ b/sysdeps/ieee754/dbl-64/mpatan.c
@@ -66,7 +66,7 @@ __mpatan(mp_no *x, mp_no *y, int p) {
mptwoim1.d[0] = ONE;
/* Reduce x m times */
- __mul(x,x,&mpsm,p);
+ __sqr(x,&mpsm,p);
if (m==0) __cpy(x,&mps,p);
else {
for (i=0; i<m; i++) {
diff --git a/sysdeps/ieee754/dbl-64/mpsqrt.c b/sysdeps/ieee754/dbl-64/mpsqrt.c
index 65df9fd..941a4e9 100644
--- a/sysdeps/ieee754/dbl-64/mpsqrt.c
+++ b/sysdeps/ieee754/dbl-64/mpsqrt.c
@@ -63,7 +63,7 @@ __mpsqrt(mp_no *x, mp_no *y, int p) {
m=__mpsqrt_mp[p];
for (i=0; i<m; i++) {
- __mul(&mpu,&mpu,&mpt1,p);
+ __sqr(&mpu,&mpt1,p);
__mul(&mpt1,&mpz,&mpt2,p);
__sub(&mp3halfs,&mpt2,&mpt1,p);
__mul(&mpu,&mpt1,&mpt2,p);
diff --git a/sysdeps/ieee754/dbl-64/sincos32.c b/sysdeps/ieee754/dbl-64/sincos32.c
index 6c5ffde..5a8f1bd 100644
--- a/sysdeps/ieee754/dbl-64/sincos32.c
+++ b/sysdeps/ieee754/dbl-64/sincos32.c
@@ -67,7 +67,7 @@ ss32(mp_no *x, mp_no *y, int p) {
#endif
for (i=1;i<=p;i++) mpk.d[i]=0;
- __mul(x,x,&x2,p);
+ __sqr(x,&x2,p);
__cpy(&oofac27,&gor,p);
__cpy(&gor,&sum,p);
for (a=27.0;a>1.0;a-=2.0) {
@@ -99,7 +99,7 @@ cc32(mp_no *x, mp_no *y, int p) {
#endif
for (i=1;i<=p;i++) mpk.d[i]=0;
- __mul(x,x,&x2,p);
+ __sqr(x,&x2,p);
mpk.d[1]=27.0;
__mul(&oofac27,&mpk,&gor,p);
__cpy(&gor,&sum,p);
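As a reading aid for the mpsqrt.c hunk above: the four calls in the loop
body look like one Newton step for an inverse square root,
u <- u * (3/2 - z * u^2). The sketch below replays that step in plain
doubles. It rests on assumptions the hunk does not show: that mpz holds
half of the input, that mp3halfs is 1.5, and that the result in mpt2 is
copied back into mpu before the next iteration. The names
sketch_rsqrt_step and sketch_rsqrt are hypothetical.

```c
/* One iteration, mirroring the four calls in the hunk.
   u is the current estimate of 1/sqrt(x); z is assumed to be x/2.  */
double
sketch_rsqrt_step (double u, double z)
{
  double t1 = u * u;    /* __sqr(&mpu,&mpt1,p)  (was __mul(&mpu,&mpu,...))  */
  double t2 = t1 * z;   /* __mul(&mpt1,&mpz,&mpt2,p)  */
  t1 = 1.5 - t2;        /* __sub(&mp3halfs,&mpt2,&mpt1,p)  */
  return u * t1;        /* __mul(&mpu,&mpt1,&mpt2,p)  */
}

/* Run m iterations from the starting estimate u0.  The copy of the
   result back into u is an assumption about code outside the hunk.  */
double
sketch_rsqrt (double x, double u0, int m)
{
  double z = 0.5 * x;   /* assumed meaning of mpz  */
  double u = u0;
  for (int i = 0; i < m; i++)
    u = sketch_rsqrt_step (u, z);
  return u;
}
```

The squaring is the only step in the loop that multiplies a value by
itself, which is why it is the one call the patch changes there.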