Vector Function Application Binary Interface Specification for POWER Architecture

1. Vector Function ABI Overview

This document defines the ABI for vector functions generated by GCC compilers that support the SIMD constructs of OpenMP 4.0 [1] and above. These SIMD constructs are also available without OpenMP in GCC compilers that implement __attribute__ ((simd)) for function declarations and definitions.

The ABI described here applies only to C/C++ functions.

Use of a SIMD construct for a function declaration or definition enables the creation of vector versions of the function from the scalar version of the function. The vector variants can process multiple instances of the scalar computation concurrently in a single invocation in a vector context, most typically when the compiler vectorizes loops during the optimization phase of compilation.

For a function definition, use of #pragma omp declare simd or __attribute__ ((simd)) enables creation of vector versions by the compiler.

For a function declaration, use of #pragma omp declare simd or __attribute__ ((simd)) tells the compiler the exact list of available vector function implementations provided by a library. The library's vector functions will use the OpenMP pragma or GCC attribute SIMD constructs in their prototypes.
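
As a minimal sketch of the function definition case, the following C fragment marks one scalar function with the OpenMP pragma and another with the GCC attribute. The names scale_add, square, apply_all, and SIMD_FN are hypothetical and chosen only for illustration. Whether vector variants are actually generated or called depends on the compiler, target, and options (GCC acts on the pragma only with -fopenmp or -fopenmp-simd); the scalar results are the same either way.

```c
#include <assert.h>
#include <stddef.h>

/* Use the GCC attribute only where the compiler reports support for it;
   SIMD_FN is a hypothetical helper macro, not part of any ABI. */
#if defined(__has_attribute)
#  if __has_attribute(simd)
#    define SIMD_FN __attribute__ ((simd))
#  endif
#endif
#ifndef SIMD_FN
#  define SIMD_FN
#endif

/* OpenMP form: ignored unless OpenMP SIMD support is enabled. */
#pragma omp declare simd
float scale_add(float x, float y)
{
    return 2.0f * x + y;
}

/* GCC attribute form: does not require OpenMP to be enabled. */
SIMD_FN float square(float x)
{
    return x * x;
}

/* A loop like this is a typical vector context in which the compiler
   may call a vector variant instead of the scalar function. */
void apply_all(float *out, const float *x, const float *y, size_t n)
{
    assert(out != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = scale_add(x[i], y[i]) + square(x[i]);
}
```

The two constructs are placed on different functions here because this sketch is meant to show each form on its own, not a combination of the two.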

The Vector Function ABI defines a set of rules that caller and callee functions must obey. The rules consist of:

Calling convention (how arguments are passed to the vector function and how values are returned from the vector function)
Vector length (the number of concurrent scalar invocations to be processed per invocation of the vector function)
Mapping from element data types to vector data types
Ordering of vector arguments
Vector function masking
Vector function name mangling
Compiler generated vector function variants

This specification shall be considered an extension to the OpenPOWER 64-bit ELF V2 ABI Specification for Power Architecture [2]. As such, it applies to Power ISA 2.07 and above, requiring the VSX vector extensions described in that ABI and ISA.

2. Vector Function ABI

2.1 Calling convention

The vector functions should use the calling convention described in Section 2.2, Function Calling Sequence, of the OpenPOWER 64-bit ELF V2 ABI Specification for Power Architecture [2].

2.2 Vector Length

Every vector variant of a SIMD-enabled function has a vector length (VLEN). If the OpenMP clause "simdlen" is used, the VLEN is the value of the argument of that clause. The VLEN value must be a power of 2. Otherwise (the GCC simd attribute is used, or the OpenMP pragma is used without simdlen), the function's "characteristic data type" (CDT) is used to compute the vector length. The CDT is determined by applying the following rules in order:

a) For a non-void function, the CDT is the return type.
b) If the function has any non-uniform, non-linear parameters, then the CDT is the type of the first such parameter.
c) If the CDT determined by a) or b) above is a homogeneous aggregate (see "Parameter Passing in Registers" in [2]), the CDT is the entire homogeneous aggregate. For example, a parameter "double x[2]" has a CDT of type double[2] and size 16 bytes. The same applies to the complex double type.
d) If the CDT determined by a) or b) above is a nonhomogeneous struct, union, or class type (see "Parameter Passing in Registers" in [2]) which is pass-by-value, the CDT is int.
e) If none of the above cases is applicable, the CDT is int.
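
The selection order above can be sketched as a small C model. Everything in it is hypothetical: the type_class, type_info, param_kind, and param_info types and the characteristic_type function exist only for this illustration and are not part of any compiler. The sketch also assumes that the first-parameter rule is consulted only when the function returns void, which is one reading of "in the following order".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type classes, just enough to model the CDT rules. */
enum type_class {
    TC_VOID,
    TC_SCALAR,
    TC_HOMOGENEOUS_AGGREGATE,
    TC_OTHER_AGGREGATE      /* nonhomogeneous struct/union/class by value */
};

struct type_info {
    enum type_class cls;
    size_t size;            /* size in bytes; 0 for void */
};

enum param_kind { PK_VECTOR, PK_UNIFORM, PK_LINEAR };

struct param_info {
    struct type_info type;
    enum param_kind kind;
};

struct type_info characteristic_type(struct type_info ret,
                                     const struct param_info *params,
                                     size_t nparams)
{
    const struct type_info int_type = { TC_SCALAR, sizeof(int) };
    const struct type_info *candidate = NULL;

    assert(params != NULL || nparams == 0);

    if (ret.cls != TC_VOID) {
        /* Non-void function: start from the return type. */
        candidate = &ret;
    } else {
        /* Otherwise: first non-uniform, non-linear parameter, if any. */
        for (size_t i = 0; i < nparams; i++) {
            if (params[i].kind == PK_VECTOR) {
                candidate = &params[i].type;
                break;
            }
        }
    }

    /* Nothing applicable, or a nonhomogeneous aggregate: the CDT is int. */
    if (candidate == NULL || candidate->cls == TC_OTHER_AGGREGATE)
        return int_type;

    /* Scalars and homogeneous aggregates are used as a whole. */
    return *candidate;
}
```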

The VLEN is then determined based on the CDT and the size of the vector register for the ISA. VLEN is computed using the formula below:
VLEN = sizeof(vector_register) / sizeof(CDT)
For VSX, sizeof(vector_register) = 16.
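
A worked instance of the formula, as a short C sketch. The function name vsx_vlen and the macro VSX_VECTOR_REGISTER_BYTES are hypothetical; the value 16 is the VSX register size stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a VSX vector register in bytes. */
#define VSX_VECTOR_REGISTER_BYTES ((size_t)16)

/* VLEN = sizeof(vector_register) / sizeof(CDT), specialized to VSX. */
size_t vsx_vlen(size_t cdt_size)
{
    assert(cdt_size > 0 && cdt_size <= VSX_VECTOR_REGISTER_BYTES);
    return VSX_VECTOR_REGISTER_BYTES / cdt_size;
}
```

With the usual Power data sizes, a CDT of int or float gives VLEN = 4, a CDT of double gives VLEN = 2, and the 16-byte homogeneous aggregate double[2] gives VLEN = 1.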

2.3 Mapping from element data type to vector data type

The vector data types for parameters are selected depending on ISA, vector length, data type of original parameter, and parameter specification. For uniform and linear parameters (detailed descriptions are found in [1]), the original data type is preserved. For vector parameters, vector data types are selected by the compiler. The mapping from element data type to vector data type is described below.

The bit size of the vector data type of a parameter is computed as:
size_of_vector_data_type = VLEN * sizeof(original_parameter_data_type) * 8
For instance, for a VSX vector function with VLEN = 4 and a parameter of data type "int":
size_of_vector_data_type = 4 * 4 * 8 = 128 bits, which means one argument of type vector signed int.
If size_of_vector_data_type is greater than the width of the vector register, multiple vector registers are used to pass the vector parameter. For instance, for a VSX vector function with VLEN = 4 and a parameter of data type "double":
size_of_vector_data_type = 4 * 8 * 8 = 256 bits, so the vector data type is an array of two vector double values (vector double[2]), which means 2 arguments of type vector double are to be passed.
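
The arithmetic in the two examples above can be checked with a short C sketch. The names vector_data_type_bits and vector_args_needed are hypothetical. The text does not say what happens when the computed size is not a whole multiple of the 128-bit register, so this sketch returns 0 for that case instead of guessing.

```c
#include <assert.h>
#include <stddef.h>

/* Width of a VSX vector register in bits. */
#define VSX_VECTOR_REGISTER_BITS ((size_t)128)

/* size_of_vector_data_type = VLEN * sizeof(original_parameter_data_type) * 8 */
size_t vector_data_type_bits(size_t vlen, size_t elem_size)
{
    assert(vlen > 0 && elem_size > 0);
    return vlen * elem_size * 8;
}

/* Number of vector arguments used to pass one vector parameter.
   Returns 0 when the size is not a whole number of registers,
   a case this sketch deliberately leaves undefined. */
size_t vector_args_needed(size_t vlen, size_t elem_size)
{
    size_t bits = vector_data_type_bits(vlen, elem_size);
    if (bits % VSX_VECTOR_REGISTER_BITS != 0)
        return 0;
    return bits / VSX_VECTOR_REGISTER_BITS;
}
```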

2.4 Ordering of Vector Arguments

When a parameter of the original function results in one argument in the vector function, the ordering rule is a simple one-to-one match with the original argument order.
For example, when the original argument list is (int a, float b, int c), VLEN is 4, and all of a, b, and c are classified as vector parameters, the vector function argument list becomes (vector int vec_a, vector float vec_b, vector int vec_c).
There are cases where a single parameter of the original function results in multiple arguments in the vector function. The second and subsequent arguments are inserted in the argument list right after the corresponding first argument, not appended to the end of the argument list of the vector function. For example, if the original argument list is (int a, double b, int c), VLEN is 4, and all of a, b, and c are classified as vector parameters, the vector function argument list becomes (vector int vec_a, vector double vec_b1, vector double vec_b2, vector int vec_c).
For an example involving homogeneous aggregates, if the original argument list is (int a, double b[2], int c), VLEN is 4, and all a, b, and c are classified as vector parameters, the vector function argument list becomes (vector int vec_a, vector double vec_b0_0, vector double vec_b0_1, vector double vec_b1_0, vector double vec_b1_1, vector int vec_c).
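
The (int a, double b, int c) case can be illustrated with a hand-written C sketch. It uses GCC generic vector types (vector_size) as portable stand-ins for the Power "vector int" and "vector double" types, and the names f, f_vec4, v4si, and v2df are hypothetical. The sketch assumes that vec_b1 carries lanes 0 and 1 and vec_b2 carries lanes 2 and 3, and it is not the code or the symbol name a compiler would generate.

```c
#include <assert.h>

/* 128-bit stand-ins for "vector int" and "vector double". */
typedef int    v4si __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Scalar function with the original argument list. */
int f(int a, double b, int c)
{
    return a + (int)b + c;
}

/* Vector variant for VLEN = 4: the two registers holding b sit
   directly after vec_a, and vec_c keeps its relative position. */
v4si f_vec4(v4si vec_a, v2df vec_b1, v2df vec_b2, v4si vec_c)
{
    v4si result = {0, 0, 0, 0};

    assert(sizeof(v4si) == 16 && sizeof(v2df) == 16);

    for (int lane = 0; lane < 4; lane++) {
        /* Assumed lane layout: b lanes 0-1 in vec_b1, 2-3 in vec_b2. */
        double b = (lane < 2) ? vec_b1[lane] : vec_b2[lane - 2];
        result[lane] = f(vec_a[lane], b, vec_c[lane]);
    }
    return result;
}
```

Each lane of the result equals the scalar function applied to the corresponding lanes of the inputs, which is the behavior a vector variant has to preserve regardless of how the arguments are split across registers.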

2.5 Masking of Vector Functions

Masking of vector functions is not currently supported by the Power ISA. Compilers should not generate code for masked variants of vector functions until such time (if ever) as masked vector instructions are supported.