This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: PPC64 libmvec sincos/sincosf ABI
- From: Bill Schmidt <wschmidt at linux dot ibm dot com>
- To: GT <tnggil at protonmail dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Fri, 20 Sep 2019 15:25:52 -0500
- Subject: Re: PPC64 libmvec sincos/sincosf ABI
- References: <VI1PR0801MB2127D8F615416EA63BBEF4E583D50@VI1PR0801MB2127.eurprd08.prod.outlook.com> <email@example.com> <bPU5suQJKGq4tSJT5Ql-a4CHhOfAzI6bEPBnVxzjR5_MRWpTITv2LueySiGKZjGzI2lnmxgmk9bn6oXcfKUp6JXbsGkVpm5k0kuFUq2Mgzofirstname.lastname@example.org>
- Reply-to: wschmidt at linux dot ibm dot com
On 9/20/19 2:25 PM, GT wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, August 8, 2019 11:25 AM, Bill Schmidt wrote:
Let me jump in here to answer a general question that I think Bert has
had for a while.
For the PPC64LE ABI, we should be returning everything through registers
wherever possible. The ABI supports multiple return values of the same
type (up to 8 vector return values, for example), using the same
registers used for passing parameters. For simplicity in this example,
I'll use the AltiVec-style types (vector double), but this works
identically if you use more generically defined vector types.
#include <altivec.h>
struct sincosret {
  vector double sinvals;
  vector double cosvals;
};
struct sincosret mysincos (vector double a)
{
  struct sincosret scr;
  scr.sinvals = a+a; // May be slightly incorrect
  scr.cosvals = a*a; // Ditto
  return scr;
}
This will result in the values being returned in VR2 and VR3.
This is preferable to returning values indirectly through memory, which
on older POWER processors can result in stalls from the store and load
being too close together and possibly executed out of order. The cost
is pretty much negligible compared to the cost of computing sin/cos, but
we might as well do it the best way that the ABI provides.
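As a rough illustration of the remark that this works the same way with more generically defined vector types, here is a minimal sketch using the GCC/Clang vector_size extension instead of the AltiVec-style types. The typedef name v2df and the helper mysincos_selfcheck are made up for this sketch and are not part of any proposal. The arithmetic is the same placeholder as in the example above, not a real sin/cos. On targets other than PPC64LE the same source may well be returned through memory rather than registers.

```c
#include <assert.h>

/* Generic GCC/Clang vector type: two doubles in 16 bytes.  Assumed here as a
   stand-in for "vector double" so the sketch builds on any target.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Homogeneous aggregate: every member has the same vector type.  */
struct sincosret
{
  v2df sinvals;
  v2df cosvals;
};

/* Placeholder arithmetic as in the example above, not real sin/cos.  */
struct sincosret
mysincos (v2df a)
{
  struct sincosret scr;
  scr.sinvals = a + a;
  scr.cosvals = a * a;
  return scr;
}

/* Hypothetical self-check: both results come back by value, with no
   pointer arguments for the outputs.  */
void
mysincos_selfcheck (void)
{
  v2df a = { 1.5, -2.0 };
  struct sincosret r = mysincos (a);
  assert (r.sinvals[0] == 3.0 && r.sinvals[1] == -4.0);
  assert (r.cosvals[0] == 2.25 && r.cosvals[1] == 4.0);
}
```

The point of the struct is only that the caller receives both vectors as a return value; how they are actually passed back is decided by the target ABI.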
I believe we can now address the issues that Joseph raised earlier in this thread.
Those issues are listed here: https://sourceware.org/ml/libc-alpha/2019-08/msg00022.html
The PowerPC64 double-precision vector sincos will have this as its prototype:
struct sincosret _ZGVbN2v_sincos (vector double);
The corresponding single-precision vector sincosf will have a prototype:
struct sincosretf _ZGVbN4v_sincosf (vector float);
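To make the intended use a bit more concrete, the following is a sketch of roughly what a vectorized scalar sincos loop could look like once the vector routine returns a struct. Everything here is an assumption for illustration: vec_sincos_standin is a hypothetical stand-in that only has the same shape as the _ZGVbN2v_sincos prototype, it uses placeholder arithmetic rather than real sin/cos, and v2df is a generic vector typedef used so that the sketch builds on any target. The sketch also assumes an even trip count and has no scalar tail loop.

```c
#include <assert.h>
#include <stddef.h>

/* Generic two-lane double vector, standing in for "vector double".  */
typedef double v2df __attribute__ ((vector_size (16)));

struct sincosret
{
  v2df sinvals;
  v2df cosvals;
};

/* Hypothetical stand-in with the same shape as the proposed
   _ZGVbN2v_sincos prototype.  Placeholder arithmetic, not real sin/cos.  */
static struct sincosret
vec_sincos_standin (v2df x)
{
  struct sincosret r;
  r.sinvals = x + x;
  r.cosvals = x * x;
  return r;
}

/* Roughly what a vectorizer might produce from
     for (i = 0; i < n; i++) sincos (x[i], &s[i], &c[i]);
   Two lanes per iteration; n must be even in this sketch.  */
void
sincos_loop_sketch (const double *x, double *s, double *c, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n; i += 2)
    {
      v2df in = { x[i], x[i + 1] };
      struct sincosret r = vec_sincos_standin (in);
      /* The stores to s[] and c[] happen in the caller, after the
         by-value return, instead of inside the vector routine.  */
      s[i] = r.sinvals[0];
      s[i + 1] = r.sinvals[1];
      c[i] = r.cosvals[0];
      c[i + 1] = r.cosvals[1];
    }
}
```

Compared with the scalar sincos interface, which writes its results through two pointer arguments, the vector routine here never touches the output arrays itself.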
We also need a new attribute that will indicate when scalar sincos[f] in a loop can be vectorized using the newly redefined PowerPC64 vector sincos[f] functions. None of the existing attributes can be used, since the technique used to return multiple values in registers is new, AFAIU.
So, Bill, are you the designer who can attest that what is agreed to here for the sincos API and ABI will be faithfully reflected in the ABI document?
Sure, I can work together with you on this. I agree that a new
attribute is needed. The term we use for this in our existing ELFv2 ABI
document is "homogeneous aggregates," so it would be good if the name of
the attribute could reflect that the interface returns a homogeneous
aggregate. This is a bit of a mouthful, so it may require some shortening.
As far as the new ABI document goes, I think we are looking to you to
complete the proposal of interfaces, attributes, and so forth so that
the document can be written. I am the right person to work with on this.