This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/3] Update s_sincosf.c and x86-64 s_sincosf-fma.c
- From: Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: nd <nd at arm dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 3 Dec 2018 10:47:27 +0000
- Subject: Re: [PATCH 1/3] Update s_sincosf.c and x86-64 s_sincosf-fma.c
- References: <20181121181102.27119-1-hjl.tools@gmail.com> <20181121181102.27119-2-hjl.tools@gmail.com> <ff5a08ee-7081-e6a0-f759-14c62c6dd77b@linaro.org> <CAMe9rOpAWASuW1_b-gTEfHz1QGFef5yb6OP_mgZTbc-+h1i85w@mail.gmail.com>
On 30/11/18 21:31, H.J. Lu wrote:
> On Fri, Nov 30, 2018 at 12:40 PM Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> On 21/11/2018 16:11, H.J. Lu wrote:
>>> Include <s_sincosf.h> in s_sincosf.c, instead of "s_sincosf.h", to allow
>>> x86-64 s_sincosf.h with vectorized sincosf_poly. Update __sincosf_table
...
>>> --- a/sysdeps/ieee754/flt-32/s_sincosf.h
>>> +++ b/sysdeps/ieee754/flt-32/s_sincosf.h
>>> @@ -31,8 +31,24 @@ typedef struct
>>> double sign[4]; /* Sign of sine in quadrants 0..3. */
>>> double hpi_inv; /* 2 / PI ( * 2^24 if !TOINT_INTRINSICS). */
>>> double hpi; /* PI / 2. */
>>> - double c0, c1, c2, c3, c4; /* Cosine polynomial. */
>>> - double s1, s2, s3; /* Sine polynomial. */
>>> + /* Cosine polynomial: c0, c1, c2, c3, c4.
>>> + Sine polynomial: s1, s2, s3. */
>>> + double c0, c1;
>>> + struct
>>> + {
>>> + double s1;
>>> + double c2;
>>> + } s1c2;
>>> + struct
>>> + {
>>> + double s2;
>>> + double c3;
>>> + } s2c3;
>>> + struct
>>> + {
>>> + double s3;
>>> + double c4;
>>> + } s3c4;
>>> } sincos_t;
>>>
>>
>> I don't think this should be a problem for other architectures, do you see
>> any possible issue about changing the layout?
>
> Not I can think of.
if you change the layout it's likely a measurable
regression on aarch64 targets because it was optimized
such that ldp (load pair) can be used when loading
coeffs (although this should not have a huge effect).
i think it should be possible to override the header
and data files with target specific definitions (or
use ifdefs based on some target macros) so we can
have both layouts.