This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [PATCH] Inline C99 math functions


> Joseph Myers wrote:
> On Mon, 15 Jun 2015, Wilco Dijkstra wrote:
> 
> > Add inlining of the C99 math functions
> > isinf/isnan/signbit/isfinite/isnormal/fpclassify using GCC built-ins
> > when available. Since going through the PLT is expensive for these small
> > functions, inlining results in major speedups (about 7x on Cortex-A57
> > for isinf). The GCC built-ins are not correct if signalling NaN support
> 
> Where are the benchmarks for this?  Please put them in benchtests so
> actual reproducible figures can be given.  That's the standard practice
> for any change being justified on the basis of performance.

I'll add a benchmark in another patch - it's not trivial as benchtest is not
suitable to accurately time very simple functions, especially when inlined...

> What are the effects on code size (especially for fpclassify)?  If code
> becomes larger, conditioning on !defined __OPTIMIZE_SIZE__ should be
> considered.

Codesize of what? Few applications use these functions... GLIBC mathlib is 
the main example, but it's not typical as it has extremely high usage of these
functions and you'll unlikely want to avoid inlining as the performance gain
and avoidance of explicit calls are significant benefits in the math functions 
(no need to use callee-saves).

I suppose adding a size check for fpclassify would be reasonable as the 
expansion ends up quite large (but that should really be fixed). 
Note fpclassify is used rather inappropriately in most cases in GLIBC, so
the best approach would be rewrite those to use faster isnan/isinf checks.

> > As a result of this many target overrides and the various
> > __isnan/__finite inlines in math_private.h are no longer required. If
> > agreed we could remove all this code and only keep the generic
> > definition of isinf/etc which will use the builtin.
> 
> How do those inlines compare with the code GCC generates?  As I understand
> it, before your patch libm-internal calls to these macros would have
> called the inline versions of functions such as __finite (no function
> call, PLT or otherwise, and using integer arithmetic), and after the patch
> libm-internal calls would have used the GCC inlines (with floating-point
> arithmetic) - is the new state for libm-internal calls better than the old
> state or not?

__isinf(f/l) were never inlined, but are now (__isinf_ns was used in some cases,
but not nearly all). Fpclassify*, isnormal* were never inlined, but now are. These
new inlines provide the gains.

__isnan(f) and __finite(f) used GLIBC internal inlines, and now use GCC built-ins 
instead. We could measure whether this specific change causes any difference, but 
I bet it will be very uarch specific and minor compared to the total time math
functions take.

Also I don't think having special inlines that are only used inside GLIBC is a 
good approach - if the GCC built-ins are not fast enough then we should fix them.

Wilco



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]