This is the mail archive of the libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PPC64 libmvec sincos/sincosf ABI
- From: Bill Schmidt <wschmidt at linux dot ibm dot com>
- To: GT <tnggil at protonmail dot com>, "libc-alpha@sourceware.org" <libc-alpha at sourceware dot org>
- Date: Tue, 24 Sep 2019 11:43:24 -0500
- Subject: Re: PPC64 libmvec sincos/sincosf ABI
- References: <VI1PR0801MB2127D8F615416EA63BBEF4E583D50@VI1PR0801MB2127.eurprd08.prod.outlook.com> <firstname.lastname@example.org> <bPU5suQJKGq4tSJT5Ql-a4CHhOfAzI6bEPBnVxzjR5_MRWpTITv2LueySiGKZjGzI2lnmxgmk9bn6oXcfKUp6JXbsGkVpm5k0kuFUq2Mgzoemail@example.com> <firstname.lastname@example.org> <o5VYmNZRp51Qgi9JBVzmHUWv6RPBR-rrIBwpD1p2Du8byYqle3gjhKfjlcY6oKmE4JnEeYdhnf__-i1dkHOBwvEdZ_dU5LtKCgPANKpzGJ8email@example.com>
- Reply-to: wschmidt at linux dot ibm dot com
Hi! Please CC me directly as I don't read libc-alpha religiously every day.
On 9/23/19 1:02 PM, GT wrote:
Sure, I can work together with you on this. I agree that a new
attribute is needed. The term we use for this in our existing ELFv2 ABI
document is "homogeneous aggregates," so it would be good if the name of
the attribute could reflect that the interface returns a homogeneous
aggregate. This is a bit of a mouthful, so it may require some shortening.
How about this for the attribute specification:
It's rather long, but there already exist attribute names of similar length, like
I like it. Good choice.
As far as the new ABI document goes, I think we are looking to you to
complete the proposal of interfaces, attributes, and so forth so that
the document can be written. I am the right person to work with on this.
I plan on reusing and adapting GCC's implementation of the cos function as much as
possible. Nothing special about cos. Could just as well say reuse/adapt from
Sincos differs from cos in that the scalar function has 2 extra arguments:
pointers to the locations in which to store the sine and cosine results. So:
1. Prior to GCC making the vectorized cos call, arguments from multiple scalar
   cosine calls are assembled into a single input vector argument to the vector cos
   function. I think this part of the code can be used almost verbatim for sincos. The
   reason is that the first argument to sincos is passed by value and is in fact the
   exact same value that would be passed to scalar sin and cos separately.
2. On return from the vector cos call, GCC extracts scalar results from the returned
   vector output and assigns each to its respective scalar variable. Much of the code
   here can be reused as long as a few changes are made:
   i. When assembling the vector sincos call, each scalar call's 2nd and 3rd arguments
      must be saved so that results will later be written to those locations.
   ii. On return from the vector sincos, the code needs to account for the fact that scalar
       results go to locations given by pointers rather than to named variables, as they
       do for cos.
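The steps above can be sketched in plain C as a gather, one vector call, and a
scatter through the saved pointers. This is only an illustration of the data
movement, not GCC's vectorizer code. The names vec2d, vec2d_pair,
vec_sincos_stub and sincos_pair are hypothetical, and the lane arithmetic in
the stub is a placeholder (x + 1 and x - 1), not real sine and cosine, so that
the sketch does not depend on a math library.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 2 };

/* Hypothetical two-lane vector of doubles, standing in for one vector register. */
typedef struct { double lane[LANES]; } vec2d;

/* Homogeneous aggregate: every member has the same vector type. */
typedef struct { vec2d sin_v; vec2d cos_v; } vec2d_pair;

/* Hypothetical stand-in for a vector sincos entry point.
   ASSUMPTION: the arithmetic is a placeholder, NOT real sine/cosine. */
static vec2d_pair vec_sincos_stub(vec2d x)
{
    vec2d_pair r;
    for (size_t i = 0; i < LANES; i++) {
        r.sin_v.lane[i] = x.lane[i] + 1.0;  /* placeholder for sin(x) */
        r.cos_v.lane[i] = x.lane[i] - 1.0;  /* placeholder for cos(x) */
    }
    return r;
}

/* Models replacing LANES scalar calls sincos(x[i], sin_out[i], cos_out[i])
   with a single vector call. */
static void sincos_pair(const double x[LANES],
                        double *const sin_out[LANES],
                        double *const cos_out[LANES])
{
    vec2d in;

    /* Step 1: gather the by-value first arguments into one input vector. */
    for (size_t i = 0; i < LANES; i++)
        in.lane[i] = x[i];

    /* Step 2.i: the 2nd and 3rd scalar arguments are kept in sin_out/cos_out
       until the vector call has returned. */
    vec2d_pair r = vec_sincos_stub(in);

    /* Step 2.ii: scatter each lane through its saved pointer instead of
       assigning to a named scalar variable. */
    for (size_t i = 0; i < LANES; i++) {
        assert(sin_out[i] != NULL && cos_out[i] != NULL);
        *sin_out[i] = r.sin_v.lane[i];
        *cos_out[i] = r.cos_v.lane[i];
    }
}
```

A caller would pass the inputs of the two scalar calls in x and the addresses
taken from their 2nd and 3rd arguments in sin_out and cos_out.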
Have I overlooked any significant issue?
I think that's fine from an ABI perspective. Implementation-wise, in the
most common case we would expect the combined scalar calls' pointer
arguments to be consecutive, in which case we can optimize to do vector
stores from the aggregate return registers (v2, v3). But we have to be
prepared to distribute the scalars independently if necessary.
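A rough model of that choice, assuming a two-lane vector: if the saved
destination pointers are consecutive, the lanes can be written with one
block store; otherwise each scalar is distributed on its own. The helper names
is_consecutive and store_lanes are hypothetical, the check is done at run time
here purely for illustration (a compiler would reason about the address
expressions instead), and aliasing questions are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { NLANES = 2 };

/* True when dest[1] directly follows dest[0] in memory, i.e. the scalar
   destinations form one contiguous run of NLANES doubles. */
static bool is_consecutive(double *const dest[NLANES])
{
    for (size_t i = 1; i < NLANES; i++)
        if (dest[i] != dest[0] + i)
            return false;
    return true;
}

/* Writes one result vector to the scalar destinations.
   Returns true if the single block store path was taken. */
static bool store_lanes(const double result[NLANES],
                        double *const dest[NLANES])
{
    assert(dest[0] != NULL);

    if (is_consecutive(dest)) {
        /* Models a single vector store of the whole result. */
        memcpy(dest[0], result, sizeof(double) * NLANES);
        return true;
    }

    /* Fallback: distribute the scalars independently. */
    for (size_t i = 0; i < NLANES; i++) {
        assert(dest[i] != NULL);
        *dest[i] = result[i];
    }
    return false;
}
```

For sincos this would be applied twice, once for the sine destinations and
once for the cosine destinations, since either set may or may not be
consecutive.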