Libmvec
Libmvec is a vector math library added in Glibc 2.22.
The library was added to support the SIMD constructs of OpenMP 4.0 (section 2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf) by providing vector implementations of math functions.
Vector math functions are vector variants of corresponding scalar math operations implemented using SIMD ISA extensions (e.g. SSE or AVX for x86_64). They take packed vector arguments, perform the operation on each element of the packed vector argument, and return a packed vector result. Using vector math functions is faster than repeatedly calling the scalar math routines.
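The lane-wise behavior described above can be illustrated with a scalar emulation. This is only a sketch: the type vec4d and the functions apply_lanewise and scalar_square are hypothetical names introduced here and are not part of Libmvec, which uses real SIMD registers and instructions instead of a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4-lane packed vector of doubles
   (a stand-in for a 256-bit SIMD register). */
typedef struct { double lane[4]; } vec4d;

/* Stand-in scalar routine, used only for illustration;
   a real use case would be cos, sin, and so on. */
double scalar_square (double x)
{
  return x * x;
}

/* Scalar emulation of what a vector math function computes:
   the operation is applied independently to every lane and
   a packed result is returned. */
vec4d apply_lanewise (double (*f) (double), vec4d x)
{
  vec4d r;
  size_t i;

  assert (f != NULL);
  for (i = 0; i < 4; i++)
    r.lane[i] = f (x.lane[i]);
  return r;
}
```

A real vector variant produces the same shape of result with a single call, which is where the speedup over four scalar calls comes from.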
The library can be disabled with the --disable-mathvec configure option.
The library libmvec.so is linked in as needed when using -lm; there is no need to specify -lmvec explicitly.
For static builds with gcc, both options must be given in the following order: -lmvec -lm.
For static builds with g++, only -lmvec is needed (-lm is passed to the linker by the driver).
Libmvec on x86_64
Building and testing are enabled by default.
Vector versions differ from the scalar analogues in accuracy and behavior on special values.
Functions are optimized for performance in their respective domains when processing does not incur special values such as denormal values, overflow, underflow, and out-of-range arguments. Special values are processed in a scalar fashion via calls to the respective scalar routines. Additionally, some functions, such as the trigonometric ones, may resort to scalar processing of huge (or other) arguments that do not necessarily cause special values but require different and less SIMD-friendly handling.
These functions were tested (via reasonable random sampling) to pass a 4-ulp maximum relative error criterion on their domains in round-to-nearest computation mode.
Known limitations:
C99 compliance in terms of special values, errno:
 Functions may not raise exceptions as required by the C language standard. Functions may raise spurious exceptions. This is considered an artifact of SIMD processing and may be fixed in the future on a case-by-case basis.
 Functions may not change errno in some of the required cases, e.g. if the SIMD-friendly algorithm is done branch-free without a libm call for that value. This is done for performance reasons.
 As the implementation is dependent on libm, some accuracy and special case problems may be inherent to this fact.
 Functions do not guarantee fully correct results in computation modes other than round-to-nearest.
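To make the 4-ulp criterion mentioned in this section concrete, the following sketch counts how many representable doubles lie between a computed value and a reference value. It is an approximation of an ulp-based check, not the method used by the Glibc test suite; the names ordered_bits, ulp_distance and within_ulps are hypothetical, and NaN inputs are deliberately out of scope.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map the bit pattern of a double to an integer whose ordering
   matches the ordering of the doubles themselves.  +0.0 and -0.0
   both map to 0.  */
int64_t ordered_bits (double x)
{
  int64_t i;

  memcpy (&i, &x, sizeof i);
  return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles between got and ref.
   Assumption: neither argument is NaN.  */
uint64_t ulp_distance (double got, double ref)
{
  int64_t a, b;

  assert (got == got && ref == ref);   /* reject NaN */
  a = ordered_bits (got);
  b = ordered_bits (ref);
  /* Subtract in unsigned arithmetic to avoid signed overflow.  */
  return a >= b ? (uint64_t) a - (uint64_t) b
                : (uint64_t) b - (uint64_t) a;
}

/* Nonzero if got is within max_ulps representable steps of ref.  */
int within_ulps (double got, double ref, uint64_t max_ulps)
{
  return ulp_distance (got, ref) <= max_ulps;
}
```

For example, within_ulps (vector_result, scalar_result, 4) could be used to compare one lane of a vector function result against the scalar routine, bearing in mind that the scalar result is itself only an approximation of the exact value.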
Vector ABI
For x86_64, vector function names are created based on section 2.6, Vector Function Name Mangling, of the Vector ABI (VectorABI.txt), which has been discussed on the x86-64 System V Application Binary Interface mailing list.
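As an illustration of the mangling scheme, the sketch below assembles a name of the form _ZGV<isa><mask><vlen><parameters>_<name>. The helper mangle_vector_name and the ISA_* constants are hypothetical names introduced here; the ISA letters (b for SSE, c for AVX, d for AVX2, e for AVX-512), the mask letters (N for unmasked, M for masked) and the parameter letter v for a vector argument are taken from the x86_64 Vector ABI as understood here, so VectorABI.txt remains the authoritative source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ISA letters of the x86_64 Vector ABI name mangling.  */
enum { ISA_SSE = 'b', ISA_AVX = 'c', ISA_AVX2 = 'd', ISA_AVX512 = 'e' };

/* Hypothetical helper: write the mangled vector function name
   into buf and return the snprintf result.
   masked: 0 gives 'N' (unmasked), nonzero gives 'M' (masked).
   vlen:   number of lanes.
   params: one letter per argument, e.g. "v" for one vector argument.  */
int mangle_vector_name (char *buf, size_t size, char isa, int masked,
                        unsigned vlen, const char *params,
                        const char *scalar_name)
{
  assert (buf != NULL && params != NULL && scalar_name != NULL);
  return snprintf (buf, size, "_ZGV%c%c%u%s_%s",
                   isa, masked ? 'M' : 'N', vlen, params, scalar_name);
}
```

Calling mangle_vector_name (buf, sizeof buf, ISA_AVX2, 0, 4, "v", "cos") yields _ZGVdN4v_cos, the AVX2 variant of cos that processes four doubles per call.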
Usage model and examples for x86_64
Example 1
Use of the vector math functions can be enabled with -fopenmp -ffast-math starting from optimization level -O1 (for GCC starting from version 4.9.0) and appropriate usage of OpenMP SIMD constructs.
The following code in file cos.c:
{{{
#include <math.h>

int N = 3200;
double b[3200];
double a[3200];

int main (void)
{
  int i;

  /* Ask the compiler to vectorize this loop. */
  #pragma omp simd
  for (i = 0; i < N; i += 1)
  {
    b[i] = cos (a[i]);
  }

  return (0);
}
}}}
when built by GCC 4.9.0 with the following command:
gcc ./cos.c -O1 -fopenmp -ffast-math -lm -mavx2
produces a binary with a call to the AVX2 version of vectorized cos (whose name is _ZGVdN4v_cos) if Glibc is installed system-wide with Libmvec not disabled. For a non-system-wide installation, add the options -I/PATH_TO_GLIBC_INSTALL/include/ -L/PATH_TO_GLIBC_INSTALL/lib/.
Example 2
Starting from Glibc 2.23, vector math functions can also be used without OpenMP with compilers supporting __attribute__ ((__simd__)), for instance with GCC 6.* (and later) with the options -ftree-loop-vectorize -ffast-math starting from optimization level -O1.
The following code in file sin.c:
{{{
#include <math.h>

int N = 3200;
double b[3200];
double a[3200];

int main (void)
{
  int i;

  /* No OpenMP pragma here: the loop vectorizer handles this loop. */
  for (i = 0; i < N; i += 1)
  {
    b[i] = sin (a[i]);
  }

  return (0);
}
}}}
when built by GCC 6.* with the following command:
gcc ./sin.c -O1 -ftree-loop-vectorize -ffast-math -lm -mavx
produces a binary with a call to the AVX version of vectorized sin (whose name is _ZGVcN4v_sin) if Glibc is installed system-wide with Libmvec not disabled. For a non-system-wide installation, add the options -I/PATH_TO_GLIBC_INSTALL/include/ -L/PATH_TO_GLIBC_INSTALL/lib/.
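The attribute that Example 2 depends on can also be attached to a user function. The following is a minimal sketch, assuming GCC 6 or later; MY_DECL_SIMD, scale_and_shift and map_scale_and_shift are hypothetical names, and whether the compiler actually emits and calls vector variants depends on the target and on the options used, which is not verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Use the attribute only where it is assumed to be supported
   (GCC 6 or later); expand to nothing elsewhere.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 6
# define MY_DECL_SIMD __attribute__ ((__simd__ ("notinbranch")))
#else
# define MY_DECL_SIMD
#endif

/* Hypothetical user function declared as having SIMD variants.  */
MY_DECL_SIMD double scale_and_shift (double x);

double scale_and_shift (double x)
{
  return 2.0 * x + 1.0;
}

/* A plain loop over the function; with vectorization enabled the
   compiler may replace the scalar calls with a vector variant.  */
void map_scale_and_shift (double *dst, const double *src, size_t n)
{
  size_t i;

  assert (n == 0 || (dst != NULL && src != NULL));
  for (i = 0; i < n; i++)
    dst[i] = scale_and_shift (src[i]);
}
```

The results are the same whether or not the loop is vectorized; the attribute only gives the compiler permission to use a vector variant.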