Bug 6406 - Improve performance of libm wrapper functions.
Summary: Improve performance of libm wrapper functions.
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: math
Version: unspecified
Importance: P2 enhancement
Target Milestone: ---
Assignee: Ulrich Drepper
Reported: 2008-04-15 01:51 UTC by Steven Munroe
Modified: 2014-07-03 11:38 UTC
CC List: 4 users

Target: powerpc64-*-linux and others
fweimer: security-


Attachments
A proposed method for inlining range checks in libm wrappers (2.85 KB, patch)
2008-04-15 02:00 UTC, Steven Munroe
Example implementation of the x86_64 platform (704 bytes, patch)
2008-04-15 02:27 UTC, Steven Munroe

Description Steven Munroe 2008-04-15 01:51:06 UTC
Many math library functions have a wrapper function in front of the ieee754
implementation. This wrapper checks for specific error conditions and sets
errno as required by C99 or POSIX.

The wrapper can add significant path length. It is composed of often
redundant isnan, isinf, and finite tests, each of which incurs the call
overhead of that function. Also, because the wrapper uses the general-purpose
functions, the same value is transferred and manipulated in the same or
similar ways multiple times.

Performance could be improved by making these tests inline, allowing the
compiler to optimize out data transfers and common subexpressions.
Comment 1 Steven Munroe 2008-04-15 02:00:35 UTC
Created attachment 2691 [details]
A proposed method for inlining range checks in libm wrappers

This method defines macros in math_private.h that declare transfer variables
(unions), transfer FP parameters to a convenient (int) form for manipulation,
and provide inline versions of ISNAN, ISINF, FINITE, and SIGNBIT.
The DCL and XFER macros are separate to allow flexibility in scheduling.
This version includes the default math/math_private.h, example uses in
w_exp.c, w_log.c, and w_pow.c, and a platform-specific override for powerpc.
Comment 2 Steven Munroe 2008-04-15 02:16:19 UTC
The implementation above yields a performance improvement at the
microbenchmark level of 4% (w_exp), 8% (w_log), and 15% (w_pow) on power5 and
power6 systems, based on GCC-4.3 and today's GLIBC cvs.
Comment 3 Steven Munroe 2008-04-15 02:27:17 UTC
Created attachment 2692 [details]
Example implementation of the x86_64 platform

This is an example for x86_64. So far it returns mixed results: disappointing
for w_exp and w_log, but an impressive 43% improvement for w_pow. I am not
sure why the simpler w_exp and w_log cases do not improve. As far as I can
tell the total path is shorter with this test patch.

This is only a test. Without this patch, ./math/math_private.h from the
previous patch maintains the existing implementation.

I am sure someone more skilled in x86_64 can devise a solution that retains the
gains for w_pow and resolves the problems with w_exp and w_log.
Comment 4 Steven Munroe 2008-04-15 16:07:11 UTC
marked the severity as enhancement
Comment 5 Steven Munroe 2008-06-26 14:12:41 UTC
HJ, this is the proposal I mentioned to you at the GCC Summit. It is a way for
each platform to inline the isnan, isinf, finite, etc. tests in the libm
function wrappers (e.g. w_pow.c).

It uses macros defined in math_private.h to inline the various (redundant)
tests required to set errno (if needed). The generic ./math/math_private.h
maintains the status quo, but a platform can override math_private.h to define
macros that work best for it.
Comment 6 H.J. Lu 2008-06-27 15:34:50 UTC
(In reply to comment #5)
I will take a look at x86-64.
Comment 7 Ulrich Drepper 2011-10-08 08:45:27 UTC
This is horrible.

Provide optimized inline versions of the standard functions and then fix the compiler to reuse the result of function calls which convert the floating-point numbers to the bitmasks.  If the inlined conversion function is marked 'const' then the compiler should be able to determine that subsequent calls to


   bits = float2bits(floatvar);

return exactly the same result and should reuse it (given that floatvar hasn't changed).
Comment 8 Jakub Jelinek 2011-10-08 09:21:01 UTC
GCC has builtins for all four of these functions. If the code GCC generates when using those builtins could be improved, a bug with specific test cases should be filed against GCC.
Comment 9 Ulrich Drepper 2011-10-15 14:13:59 UTC
I'm closing this bug.  I optimized the wrappers *correctly*.  Anything else is up to the compiler and those people who care about the niche architectures.  They have to provide additions to their respective math_private.h.