Libmvec
Libmvec is a vector math library added in Glibc 2.22.
The library was added to support the SIMD constructs of OpenMP 4.0 (section 2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf) by providing vector implementations of math functions.
Vector math functions are vector variants of corresponding scalar math operations implemented using SIMD ISA extensions (e.g. SSE or AVX for x86_64). They take packed vector arguments, perform the operation on each element of the packed vector argument, and return a packed vector result. Using vector math functions is faster than repeatedly calling the scalar math routines.
The library can be disabled with the --disable-mathvec configure option.
The library libmvec.so is linked in as needed when using -lm; there is no need to specify -lmvec explicitly.
For static builds with gcc, both options must be given, in this order: -lmvec -lm.
For static builds with g++, only -lmvec needs to be added (the driver passes -lm to the linker).
Libmvec on x86_64
Building and testing are enabled by default.
The vector versions differ from their scalar analogues in accuracy and in behavior on special values.
The functions are optimized for performance in their respective domains as long as processing does not encounter special values such as denormals, overflow, underflow, or out-of-range arguments. Special values are processed in scalar fashion by calling the corresponding scalar routines. In addition, some functions (the trigonometric ones, for example) may fall back to scalar processing for huge (or other) arguments that do not necessarily produce special values but require different, less SIMD-friendly handling.
These functions were tested (via reasonable random sampling) to pass a 4-ulp maximum relative error criterion on their domains in round-to-nearest computation mode.
Known limitations:
C99 compliance in terms of special values, errno:
 Functions may not raise exceptions as required by the C language standard, and they may raise spurious exceptions. This is considered an artifact of SIMD processing and may be fixed in the future on a case-by-case basis.
 Functions may not change errno in some of the required cases, e.g. when the SIMD-friendly algorithm is branch-free and makes no libm call for that value. This is done for performance reasons.
 As the implementation depends on libm, some accuracy and special-case problems may be inherited from it.
 Functions do not guarantee fully correct results in computation modes other than round-to-nearest.
Vector ABI
For x86_64, vector function names are created according to section 2.6, "Vector Function Name Mangling", of the Vector ABI (VectorABI.txt), which was discussed on the x86-64 System V Application Binary Interface mailing list (thread: https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4).
Usage model and examples for x86_64
Example 1
Use of the vector math functions can be enabled with -fopenmp -ffast-math starting from optimization level -O1 (for GCC, starting from version 4.9.0), together with appropriate use of OpenMP SIMD constructs.
The following code in file cos.c:
#include <math.h>

int N = 3200;
double b[3200];
double a[3200];

int main (void)
{
  int i;

  #pragma omp simd
  for (i = 0; i < N; i += 1)
    {
      b[i] = cos (a[i]);
    }

  return (0);
}
when built by GCC 4.9.0 with the following command:
gcc ./cos.c -O1 -fopenmp -ffast-math -lm -mavx2
produces a binary with a call to the AVX2 version of vectorized cos (named _ZGVdN4v_cos), provided Glibc is installed system-wide and Libmvec has not been disabled. If Glibc is not installed system-wide, add the options -I/PATH_TO_GLIBC_INSTALL/include/ -L/PATH_TO_GLIBC_INSTALL/lib/.
Example 2
Starting from Glibc 2.23, vector math functions can also be used without OpenMP by compilers that support __attribute__ ((__simd__)), for instance GCC 6.* (and later) with the options -ftree-loop-vectorize -ffast-math starting from optimization level -O1.
The following code in file sin.c:
#include <math.h>

int N = 3200;
double b[3200];
double a[3200];

int main (void)
{
  int i;

  for (i = 0; i < N; i += 1)
    {
      b[i] = sin (a[i]);
    }

  return (0);
}
when built by GCC 6.* with the following command:
gcc ./sin.c -O1 -ftree-loop-vectorize -ffast-math -lm -mavx
produces a binary with a call to the AVX version of vectorized sin (named _ZGVcN4v_sin), provided Glibc is installed system-wide and Libmvec has not been disabled. If Glibc is not installed system-wide, the additional options -I/PATH_TO_GLIBC_INSTALL/include/ -L/PATH_TO_GLIBC_INSTALL/lib/ are needed.