Libmvec
Libmvec is a vector math library added in Glibc 2.22.
The vector math library was added to support the SIMD constructs of OpenMP 4.0 (section 2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf) by providing vector implementations of math functions.
Vector math functions are vector variants of corresponding scalar math operations implemented using SIMD ISA extensions (e.g. SSE or AVX for x86_64). They take packed vector arguments, perform the operation on each element of the packed vector argument, and return a packed vector result. Using vector math functions is faster than repeatedly calling the scalar math routines.
The library can be disabled with the --disable-mathvec configure option. The library is linked in as needed when using -lm (there is no need to specify -lmvec explicitly).
Libmvec on x86_64
Build and testing are enabled by default.
Vector versions differ from the scalar analogues in accuracy and behavior on special values.
Functions are optimized for performance in their respective domains if processing doesn’t incur special values such as denormal values, overflow and underflow, and out-of-range arguments. Special values are processed in a scalar fashion via calls to the respective scalar routines. Additionally, functions such as the trigonometric ones may resort to scalar processing of huge (or other) arguments that do not necessarily cause special values but require different, less SIMD-friendly handling.
These functions were tested (via reasonable random sampling) to pass a 4 ulp maximum relative error criterion on their domains in round-to-nearest computation mode.
Known limitations:
C99 compliance in terms of special values, errno:
 Functions may not raise exceptions as required by the C language standard. Functions may raise spurious exceptions. This is considered an artifact of SIMD processing and may be fixed in the future on a case-by-case basis.
 Functions may not change errno in some of the required cases, e.g. if the SIMD-friendly algorithm is done branch-free without a libm call for that value. This is done for performance reasons.
 As the implementation is dependent on libm, some accuracy and special-case problems may be inherent to this fact.
 Functions do not guarantee fully correct results in computation modes other than round-to-nearest.
Vector ABI
For x86_64, vector function names are created according to section 2.6, "Vector Function Name Mangling", of the Vector ABI (attached), which has been discussed on the x86-64 System V Application Binary Interface mailing list.
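As a rough illustration of the mangling scheme, the sketch below assembles a name of the form `_ZGV<isa><mask><vlen><parameters>_<name>`. The helper `mangle_vector_name` is hypothetical and exists only for this example. The only name confirmed on this page is `_ZGVdN4v_cos`; the other ISA letters listed in the comment are an assumption that should be checked against the Vector ABI document.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of Glibc or GCC.
   isa:      ISA letter; 'd' is AVX2 per the example on this page.
             Assumed from the Vector ABI: 'b' SSE, 'c' AVX, 'e' AVX-512.
   masked:   nonzero selects 'M' (masked), zero selects 'N' (unmasked).
   vlen:     number of elements per vector.
   nvecargs: number of vector parameters, each encoded as 'v'.
   Returns 0 on success, -1 if the buffer is too small. */
int mangle_vector_name(char *buf, size_t size, char isa, int masked,
                       int vlen, int nvecargs, const char *scalar_name)
{
    int n = snprintf(buf, size, "_ZGV%c%c%d", isa, masked ? 'M' : 'N', vlen);
    if (n < 0 || (size_t)n >= size)
        return -1;

    size_t used = (size_t)n;
    for (int i = 0; i < nvecargs; i++) {
        if (used + 1 >= size)   /* keep room for the terminating NUL */
            return -1;
        buf[used++] = 'v';
    }

    n = snprintf(buf + used, size - used, "_%s", scalar_name);
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return 0;
}
```

For example, `mangle_vector_name(buf, sizeof buf, 'd', 0, 4, 1, "cos")` fills `buf` with `_ZGVdN4v_cos`, the AVX2 variant of cos operating on four doubles.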
Usage model and example for x86_64
Use of the vector math functions can be enabled with -fopenmp -ffast-math starting from optimization level -O1 (for GCC, starting from version 4.9.0), together with appropriate use of OpenMP SIMD constructs. The library is linked in as needed when using -lm (there is no need to specify -lmvec explicitly).
The following code in file cos.c:

    #include <math.h>

    int N = 3200;
    double b[3200];
    double a[3200];

    int main (void)
    {
      int i;

    #pragma omp simd
      for (i = 0; i < N; i += 1)
        {
          b[i] = cos (a[i]);
        }

      return (0);
    }

when built by GCC 4.9.0 with the following command:

    gcc ./cos.c -O1 -fopenmp -ffast-math -lm -mavx2

produces a binary with a call to the AVX2 version of vectorized cos (whose name is _ZGVdN4v_cos), provided Glibc is installed system-wide and Libmvec has not been disabled. For a non-system-wide installation, add the options -I/PATH_TO_GLIBC_INSTALL/include/ -L/PATH_TO_GLIBC_INSTALL/lib/.