This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]


On 12/05/2017 07:29 PM, H.J. Lu wrote:
On Tue, Dec 5, 2017 at 5:32 AM, Joseph Myers<>  wrote:
On Tue, 5 Dec 2017, H.J. Lu wrote:

diff --git a/sysdeps/generic/math_private.h b/sysdeps/generic/math_private.h
index f29898c19c..a2cdce5b6a 100644
--- a/sysdeps/generic/math_private.h
+++ b/sysdeps/generic/math_private.h
@@ -184,6 +184,14 @@ do {                                                             \
  } while (0)

+# define FLOOR_DOUBLE_TO_INT(x) ((int) __floor (x))
+# define FLOOR_INT_TO_DOUBLE_HALF(x) __floor ((x) / 2.0)
These need comments defining their semantics (argument types and ranges,
return values and types).

The existing code is using unsigned int.  You're using int here for the
return type of FLOOR_DOUBLE_TO_INT but still converting to unsigned int in
the caller.  (The semantics in the caller do in fact include that the
argument is positive but in the range of int.)

I'd be surprised if the FLOOR_INT_TO_DOUBLE_HALF definition dividing by
2.0 and calling floor, as opposed to dividing by integer 2 and converting
the result to double (which is just a right shift, given the argument is
unsigned) is optimal anywhere, unless the compiler optimizes it into an
integer shift anyway.  Thus, it would make sense just to use "n / 2"
unconditionally in place of "__floor (n / 2.0)".

Similarly, I'd expect a direct conversion to int (without a call to
__floor) to be at least as good as the present code for
FLOOR_DOUBLE_TO_INT everywhere, given the range constraints that make it
OK to do so.

So I think the natural change would not need to add any new macros, but
would change s_sinf.c unconditionally to do essentially what your patch
does in the x86_64 case.  As the original patch of Rajalakshmi's was
benchmarked on s390, you could ask her to confirm such a change makes
performance no worse there (and possibly improves it).
Here is the patch to replace floor with simple casts.  It improves performance
of max and mean by 30% on x86-64.  Rajalakshmi, can you try it on s390?

Mean value reduced from 22.0 to 20.0 with this patch on s390x.

Rajalakshmi S

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]