This is the mail archive of the
mailing list for the glibc project.
Re: RFC: Creating a more efficient sincos interface
- From: Florian Weimer <fweimer at redhat dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Cc: nd <nd at arm dot com>
- Date: Thu, 13 Sep 2018 15:52:21 +0200
- Subject: Re: RFC: Creating a more efficient sincos interface
- References: <HE1PR08MB1035741787AE270399E22EA7831B0@HE1PR08MB1035.eurprd08.prod.outlook.com>
On 09/13/2018 03:27 PM, Wilco Dijkstra wrote:
> The existing sincos functions use 2 pointers to return the sine and cosine result. In
> most cases 4 memory accesses are necessary per call. This is inefficient and often
> significantly slower than returning values in registers. I ran a few experiments on the
> new optimized sincosf implementation in GLIBC using the following interface:
>
>   __complex__ float sincosf2 (float);
>
> This has 50% higher throughput and a 25% reduction in latency on Cortex-A72 for
> random inputs in the range +-PI/4. Larger inputs take longer and thus have lower
> gains, but there is still a 5% gain on the (rarely used) path with full range reduction.
> Given sincos is used in various HPC applications, this can give a worthwhile speedup.
I think this is totally fine if you call it expif or something like that
(and put the sine in the imaginary part, of course).
In general, I would object to using complex numbers for arbitrary pairs,
but this doesn't apply to this case.