This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [Aarch64] libmvec development status
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "Andrew dot pinski at cavium dot com" <Andrew dot pinski at cavium dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, "Ashwin dot Sekhar at cavium dot com" <Ashwin dot Sekhar at cavium dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Thu, 16 Mar 2017 19:49:20 +0000
- Subject: Re: [Aarch64] libmvec development status
Andrew Pinski wrote:
> The main justification is that Ashwin is working on the libmvec too.
> He has proposed the ABI:
> https://gcc.gnu.org/ml/gcc/2017-03/msg00077.html
>
> Basically I would like this collaboration to happen upstream rather than
> in private, off the mailing list. Also, delaying upstreaming the
> base support means there will be two versions out there in the wild
> starting soon. This is not a good thing.
Agreed, there is no point in having 2 ABIs for the same feature.
> I think he means core-specific versions. For example, it might make
> sense to have a different version that is specific to ThunderX2
> CN99xx. Some specific instruction sequences are faster on
> CN99xx than on other cores.
My general feeling is that the scope for microarchitecture-specific tuning is
very limited. Most of the gains are due to (a) having a vector math function in
the first place, and (b) a good algorithm and polynomial. After that you're
typically limited by FMUL/FMA latency, with little potential for improvement
(e.g. using FMA may be essential to achieve the ULP goal, and the polynomial
might have been designed for FMA, so changing it would increase the worst-case
error, potentially significantly so).
Wilco