This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


RE: [PATCH] Inline C99 math functions

From: "Wilco Dijkstra" <wdijkstr at arm dot com>
To: "'Joseph Myers'" <joseph at codesourcery dot com>
Cc: "GNU C Library" <libc-alpha at sourceware dot org>
Date: Wed, 17 Jun 2015 18:03:08 +0100
Subject: RE: [PATCH] Inline C99 math functions
Authentication-results: sourceware.org; auth=none
References: <001201d0a75b$921d9860$b658c920$ at com> <alpine dot DEB dot 2 dot 10 dot 1506151431490 dot 26683 at digraph dot polyomino dot org dot uk> <001701d0a789$f2ab86f0$d80294d0$ at com> <alpine dot DEB dot 2 dot 10 dot 1506151654100 dot 26683 at digraph dot polyomino dot org dot uk> <001801d0a84c$8c5cd7a0$a51686e0$ at com> <alpine dot DEB dot 2 dot 10 dot 1506161606550 dot 16478 at digraph dot polyomino dot org dot uk>

> Joseph Myers wrote:
> On Tue, 16 Jun 2015, Wilco Dijkstra wrote:
> 
> > > Well, the benchmark should come first....
> >
> > I added a new math-inlines benchmark based on the string benchmark
> > infrastructure.
> 
> Thanks.  I await the patch submission.

See https://sourceware.org/ml/libc-alpha/2015-06/msg00569.html

> > So this clearly shows the GCC built-ins win by a huge margin, including the
> > inline versions. It also shows that multiple isinf/isnan calls would be faster
> 
> That's interesting information - suggesting that changes in GCC to use
> integer arithmetic should be conditional on -fsignaling-nans, if doing the
> operations by integer arithmetic is slower (at least on this processor).
> 
> (It also suggests it's safe to remove the existing glibc-internal inlines
> as part of moving to using the built-in functions when possible.)

Indeed. To check which sequence is better we'd need to write a better benchmark,
perhaps one based on a GLIBC function that uses these functions in its hot path.
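
As a rough sketch of the two kinds of sequence being compared here (this is not
code from the patch; the helper names are hypothetical and IEEE 754 binary64
doubles are assumed), an integer-arithmetic test and a GCC built-in test could
look like this:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy out the bit pattern of a double
   (assumes IEEE 754 binary64 and a 64-bit double). */
static inline uint64_t double_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Integer-arithmetic variants: no floating-point comparison is performed,
   so no exception is raised even for a signaling NaN. */
static inline int isinf_int(double x)
{
    /* Clear the sign bit, then compare against the bit pattern of +Inf. */
    return (double_bits(x) & UINT64_C(0x7fffffffffffffff))
           == UINT64_C(0x7ff0000000000000);
}

static inline int isnan_int(double x)
{
    /* Any magnitude above the +Inf bit pattern is a NaN. */
    return (double_bits(x) & UINT64_C(0x7fffffffffffffff))
           > UINT64_C(0x7ff0000000000000);
}

/* Built-in variants: the compiler chooses the instruction sequence. */
static inline int isinf_fp(double x) { return __builtin_isinf(x) != 0; }
static inline int isnan_fp(double x) { return __builtin_isnan(x) != 0; }

/* Hypothetical self-check: both variants must agree on a given input. */
static void check_agreement(double x)
{
    assert(isinf_int(x) == isinf_fp(x));
    assert(isnan_int(x) == isnan_fp(x));
}
```

Which variant is faster depends on the target and on how GCC expands the
built-ins, which is what a hot-path benchmark would have to measure.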

> > > > Codesize of what? Few applications use these functions... GLIBC mathlib is
> > >
> > > Size of any code calling these macros (for nonconstant arguments).
> >
> > Well the size of the __isinf_t function is 160 bytes vs isinf_t 84 bytes
> > due to the callee-save overhead of the function call. The builtin isinf uses
> > 3 instructions inside the loop plus 3 lifted before it, while the call to
> > __isinf needs 3 plus a lot of code to save/restore the callee-saves.
> 
> One might suppose that most functions using these macros contain other
> function calls as well, and so that the callee-save overhead should not be
> included in the comparison.

That may be true in some cases, but if you can tailcall (which might be possible
in several math veneers) then the callee-save savings would apply.

> When you exclude callee-save overhead, how do things compare for
> fpclassify (the main case where inlining may be questionable when
> optimizing for size)?

Well, in the worst-case scenario where you need all 5 tests of fpclassify, it
effectively changes a single-instruction call into 16 instructions plus 2 double
immediates. So it is best to use OPTIMIZE_SIZE for fpclassify for now and revisit
when the GCC implementation has been improved. I also wonder what the difference
would be once I've optimized the __fpclassify implementation - I can do it in
about 8-9 instructions.
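
To illustrate the trade-off, here is a minimal sketch (not the patch itself, and
not the optimized __fpclassify mentioned above). It assumes GCC's predefined
__OPTIMIZE_SIZE__ macro as the size switch and IEEE 754 binary64 doubles; the
names fpclassify_int and classify_double are hypothetical:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical out-of-line classifier using integer arithmetic on the
   exponent and mantissa fields (assumes IEEE 754 binary64). */
__attribute__((noinline))
static int fpclassify_int(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    uint64_t exponent = (u >> 52) & 0x7ff;
    uint64_t mantissa = u & UINT64_C(0x000fffffffffffff);

    if (exponent == 0)
        return mantissa == 0 ? FP_ZERO : FP_SUBNORMAL;
    if (exponent == 0x7ff)
        return mantissa == 0 ? FP_INFINITE : FP_NAN;
    return FP_NORMAL;
}

/* When optimizing for size, keep the single call; otherwise let the
   compiler expand the built-in inline. */
#ifdef __OPTIMIZE_SIZE__
# define classify_double(x) fpclassify_int(x)
#else
# define classify_double(x) \
    __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, \
                         FP_SUBNORMAL, FP_ZERO, (x))
#endif

/* Hypothetical self-check: the integer version must match the built-in. */
static void check_matches_builtin(double x)
{
    assert(fpclassify_int(x)
           == __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL,
                                   FP_SUBNORMAL, FP_ZERO, x));
}
```

A caller would write classify_double(x) and get either the call or the inline
expansion depending on the optimization level.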

Wilco
