This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- From: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: "Zamyatin, Igor" <igor dot zamyatin at intel dot com>, Jakub Jelinek <jakub at redhat dot com>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 27 Nov 2014 19:12:42 +0300
- Subject: Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
2014-11-18 2:55 GMT+03:00 Joseph Myers <joseph@codesourcery.com>:
> On Mon, 17 Nov 2014, Zamyatin, Igor wrote:
>
>> > An alternative to having a processor clause now would be having an ABI/API
>> > document for OpenMP on x86_64 - agreed between implementations - that
>> > specifies what vector versions of a function the standard pragma means are
>> > available, and specifies that implementations must not generate calls to
>> > versions not listed unless some non-standard pragma is used to declare
>> > those other versions to be available (which would put off defining such a
>> > non-standard pragma until there is a desire to have vector versions for
>> > newer ISAs).
>>
>> We can prepare a document that describes what compiler (gcc 4.9 and
>> gcc5) can generate (and of course make sure that we have all those
>> versions in glibc) for x86_64 and put it somewhere on gcc.gnu.org (e.g.
>> Release notes?) and, say, on glibc wiki. Will it be enough for now?
>
> I'm thinking of a document that multiple implementations have accepted as
> describing the intended semantics of the pragma as regard what function
> versions may be assumed to be present, so that we can expect glibc using
> that pragma in installed headers to work with future versions of multiple
> compilers, rather than something GCC-specific.
Joseph,
here is a draft version of such a document; could you please review it?
GLIBC 2.21 VECTOR MATH FUNCTIONS X86_64 ABI/API
This document describes the x86_64 API of the vector math functions
added in Glibc 2.21. It contains the following parts:
1. Vector math functions
2. Auto vectorization and usage model with GCC
3. Variants of available vector math functions names
4. List of vector functions and their ISA specific names
1. Vector math functions
Vector math functions are vector variants of the corresponding scalar
math operations, currently implemented using the SIMD ISA extensions
SSE4, AVX and AVX2 (the AVX version is for now implemented as a wrapper
that calls the SSE4 version twice). They take packed vector arguments,
perform the operation on each element of the packed vector argument,
and return a packed vector result.
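The wrapper arrangement mentioned above (a 4-lane entry point built from
two calls to a 2-lane routine) can be sketched roughly as follows. This
is only an illustration, not the Glibc implementation: kernel2 is a
hypothetical stand-in for a 2-lane routine, its arithmetic is a
placeholder rather than a cosine, and the v2df type relies on the
GCC/Clang vector_size extension.

```c
#include <assert.h>
#include <string.h>

/* 2 x double packed vector (GCC/Clang vector extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical stand-in for a 2-lane kernel; it computes x*x + 1 per
   lane so that the result is easy to check. Not a real math routine. */
static v2df kernel2(v2df x)
{
    v2df one = {1.0, 1.0};
    return x * x + one;
}

/* 4-lane wrapper: split the input into two halves, call the 2-lane
   kernel on each half, and store both results back. */
static void wrapper4(const double in[4], double out[4])
{
    v2df lo, hi;

    assert(in != NULL && out != NULL);
    memcpy(&lo, in, sizeof lo);       /* lanes 0..1 */
    memcpy(&hi, in + 2, sizeof hi);   /* lanes 2..3 */
    lo = kernel2(lo);
    hi = kernel2(hi);
    memcpy(out, &lo, sizeof lo);
    memcpy(out + 2, &hi, sizeof hi);
}
```

The wrapper adds no computation of its own, so its accuracy and
special-value behavior are exactly those of the 2-lane routine.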
Vector math functions are expected to be faster than repeatedly called
scalar equivalents in most cases. However, these vector versions
differ from the scalar analogues in accuracy and behavior on special
values. Functions are optimized for performance on their respective
domains as long as processing doesn't involve special values such as
denormal values, overflow, underflow, and out-of-range arguments.
Special values are processed in a scalar fashion via calls to the
respective scalar routines. Additionally, some functions (the
trigonometric ones, for example) may resort to scalar processing of
huge (or other) arguments that do not necessarily cause special
values, but rather require different and less SIMD-friendly handling.
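The fast-path / scalar-fallback split described above might look roughly
like this. It is a sketch under assumptions: fast_lane and
scalar_routine are hypothetical stand-ins (a real implementation would
use a SIMD kernel and the libm routine), HUGE_ARG is an invented
threshold, and the counter exists only to make the dispatch visible.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define HUGE_ARG 1.0e6       /* assumed limit of the fast path */

static int fallback_calls;   /* lanes routed to the scalar path */

/* Hypothetical stand-ins; both return 2*x as a placeholder result. */
static double fast_lane(double x) { return 2.0 * x; }
static double scalar_routine(double x) { fallback_calls++; return 2.0 * x; }

/* Nonzero if the lane needs the scalar routine. */
static int needs_scalar(double x)
{
    double ax = x < 0 ? -x : x;

    if (!(ax < HUGE_ARG))
        return 1;            /* huge, infinite, or NaN */
    if (ax != 0.0 && ax < DBL_MIN)
        return 1;            /* denormal */
    return 0;
}

/* Process two lanes, falling back per lane where required. */
static void vec2_with_fallback(const double in[2], double out[2])
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < 2; i++)
        out[i] = needs_scalar(in[i]) ? scalar_routine(in[i])
                                     : fast_lane(in[i]);
}
```

A real kernel would typically compute all lanes with SIMD instructions
first and patch only the flagged lanes afterwards; the per-lane loop
here just keeps the control flow easy to read.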
These functions are tested to pass a 4-ulp maximum relative error
criterion on their domains in round-to-nearest computation mode.
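As an illustration of what a 4-ulp check can look like, here is a
minimal sketch that counts how many representable doubles lie between
a computed result and a reference value. This is a simplification and
an assumption on my side: the helper names are hypothetical, the
reference is taken to be the correctly rounded result, and the actual
test harness may measure ulps differently.

```c
#include <assert.h>
#include <float.h>    /* DBL_EPSILON, handy when trying the helpers */
#include <stdint.h>
#include <string.h>

/* Map the IEEE 754 bit pattern to an integer scale that is monotonic
   in the value of x (both zeros map to 0). */
static int64_t ordered_bits(double x)
{
    int64_t i;

    memcpy(&i, &x, sizeof i);
    return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles between a and b. */
static uint64_t ulp_distance(double a, double b)
{
    assert(a == a && b == b);   /* NaN has no place on this scale */
    int64_t ia = ordered_bits(a), ib = ordered_bits(b);

    return ia > ib ? (uint64_t)ia - (uint64_t)ib
                   : (uint64_t)ib - (uint64_t)ia;
}

/* Nonzero if result is within max_ulps of reference. */
static int within_ulps(double result, double reference, uint64_t max_ulps)
{
    return ulp_distance(result, reference) <= max_ulps;
}
```

For example, within_ulps(1.0 + 4 * DBL_EPSILON, 1.0, 4) holds, while
the same call with 5 * DBL_EPSILON does not.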
C99 compliance in terms of special values, errno:
a) Functions may not raise exceptions as required by the C language
standard. Functions may raise spurious exceptions. This is considered
an artifact of SIMD processing and may be fixed in the future on a
case-by-case basis.
b) Functions may not change errno in some of the required cases, e.g.
if the SIMD-friendly algorithm is done branch-free without a libm call
for that value. This is done for performance reasons.
c) As the implementation is dependent on libm, some accuracy and
special case problems may be inherent to this fact.
d) Functions do not guarantee fully correct results in rounding modes
other than round-to-nearest.
2. Auto vectorization and usage model with GCC
Vector math functions were added to Glibc with the goal of utilizing
the SIMD constructs of OpenMP 4.0 (section 2.8 in
http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf). Support for Cilk
Plus SIMD constructs will also be added later.
The standard header math.h was changed by adding the OpenMP declare
simd directive to functions which have vector versions.
This directive has clauses for specifying additional properties of the
vector implementations. For instance, for the vector function "cos"
implemented in the SSE4 ISA,
#pragma omp declare simd notinbranch simdlen(2)
was added to its declaration in math.h.
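To show the directive outside of math.h, here is a small sketch that
applies it to a user function. The function scale_shift and the driver
apply are hypothetical names chosen for illustration. The pragma is put
on the definition so that a compiler run with -fopenmp can generate the
vector variants itself; without -fopenmp the pragmas are ignored and
the code runs as plain scalar C.

```c
#include <assert.h>

/* Hypothetical scalar function standing in for a math routine. */
#pragma omp declare simd notinbranch simdlen(2)
double scale_shift(double x)
{
    return 0.5 * x + 1.0;
}

/* Loop that the compiler may vectorize by calling a vector variant
   of scale_shift instead of the scalar function. */
void apply(const double *in, double *out, int n)
{
    int i;

    assert(n >= 0);
#pragma omp simd
    for (i = 0; i < n; i++)
        out[i] = scale_shift(in[i]);
}
```

In math.h only the declaration carries the pragma, so the compiler has
to assume that the library provides the matching vector variants.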
Starting from version 4.9, GCC requires a command like
gcc test.c -I/PATH_TO_GLIBC_2.21/include/ -L/PATH_TO_GLIBC_2.21/lib/
-fopenmp -ffast-math -lm -O1
(with architecture selection via -mavx, -mavx2 or the default -msse4)
for auto vectorization of the following code in test.c:
#include <math.h>
int N = 3200;
double b[3200];
double a[3200];
int main (void)
{
  int i;
  #pragma omp simd
  for (i=0; i<N; i+=1)
  {
    b[i]=cos (a[i]);
  }
  return (0);
}
The exact names of the functions to which the compiler can generate
calls are described in the next part.
3. Variants of available vector math functions names
The name of a vector function created by GCC is based on the Intel
Vector Function ABI (http://www.cilkplus.org/sites/default/files/open_specifications/Intel-ABI-Vector-Function-2012-v0.9.5.pdf)
with a small difference in the part of the name specifying the ISA,
namely the letters b, c, d instead of x, y, Y.
For compatibility with GCC, the corresponding names were adopted for
the vector math functions in Glibc.
#pragma omp declare simd notinbranch simdlen(2) for some function
"func" means that the name of the vector version is:
_ZGVbN2v_func (it is the SSE4 implementation).
#pragma omp declare simd notinbranch simdlen(4) for some function
"func" means that the following names are available:
_ZGVcN4v_func (it is the AVX implementation)
and
_ZGVdN4v_func (it is the AVX2 implementation).
Every vector function should be provided by Glibc for each supported
ISA (currently SSE4, AVX and AVX2).
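The naming scheme above can be sketched as a small helper that
assembles the name from its parts. The helper vector_name is
hypothetical and covers only what this document uses: the ISA letters
b, c, d, the unmasked ('N', i.e. notinbranch) or masked ('M') variant,
the vector length, and one 'v' per vector parameter. Other parameter
kinds defined by the Vector Function ABI are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "_ZGV<isa><mask><simdlen><v...>_<scalar_name>" in buf.
   Returns the length of the name, or -1 if buf is too small. */
static int vector_name(char *buf, size_t size, char isa, int masked,
                       int simdlen, int nargs, const char *scalar_name)
{
    assert(buf != NULL && scalar_name != NULL);

    int n = snprintf(buf, size, "_ZGV%c%c%d",
                     isa, masked ? 'M' : 'N', simdlen);
    if (n < 0 || (size_t)n >= size)
        return -1;

    /* One 'v' per parameter passed as a vector. */
    for (int i = 0; i < nargs; i++) {
        if ((size_t)n + 1 >= size)
            return -1;
        buf[n++] = 'v';
    }

    /* Separator, scalar name, and terminating NUL. */
    size_t len = strlen(scalar_name);
    if ((size_t)n + 1 + len >= size)
        return -1;
    buf[n++] = '_';
    memcpy(buf + n, scalar_name, len + 1);
    return n + (int)len;
}
```

With isa 'b', masked 0, simdlen 2, nargs 1 and scalar_name "cos", the
helper produces _ZGVbN2v_cos, which matches the SSE4 name given above.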
4. List of vector functions and their ISA specific names
Glibc 2.21 contains the following vector version names of math functions:
a) cos: _ZGVbN2v_cos, _ZGVcN4v_cos, _ZGVdN4v_cos
--
WBR,
Andrew