This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- Cc: libc-alpha <libc-alpha at sourceware dot org>, <igor dot zamyatin at intel dot com>, "Melik-Adamyan, Areg" <areg dot melik-adamyan at intel dot com>, <jakub at redhat dot com>
- Date: Wed, 10 Sep 2014 16:07:53 +0000
- Subject: Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- Authentication-results: sourceware.org; auth=none
- References: <CAMXFM3t=ppndDUBzHzSus7xyuF5hTaLFZ5b273jD39NtddSvsw at mail dot gmail dot com>
On Wed, 10 Sep 2014, Andrew Senkevich wrote:
> Hi all,
> this is the first patch in the series of patches which will add Intel
> vectorized math functions to Glibc.
Please start by raising general design questions on libc-alpha before
sending any patches; only send patches once there is consensus on the
general questions and that consensus has been written up on a wiki page.
* Should functions go in libm or a separate libmvec library?
* What requirements on the compiler / assembler versions used are imposed
by the requirement that the ABI provided by glibc's shared libraries must
not depend on the tools used to build glibc, and what such requirements is
it OK to impose (it may be OK to move to GCC 4.6 as minimum compiler at
present, but requiring a more recent version would be a problem; we'd need
to consider what binutils version we can require)? If a separate libmvec
is used, is it OK simply not to build it if those requirements aren't met?
(It's definitely not OK for the ABI of a library to vary incompatibly, but
it might be OK for the presence of a library to be conditional.)
* Should it be declared that these vectorized functions do not set errno?
(If so, then any header code that enables them to be used must of course
avoid enabling them in the default -fmath-errno case.) Similarly, do they
follow the other goals documented in the glibc manual for accuracy of
results and exceptions (for all input values, including e.g. range
reduction)? If not, further conditionals such as -ffast-math may be
needed.
* How do we handle different glibc versions having vectorized functions
for different vector ISA extensions? You're using a single __DECL_SIMD,
and having such a function only for AVX2. But one glibc version could
have a function vectorized for ISA extensions A and B, with another
version adding it vectorized for C. The compiler the user uses with the
installed glibc headers must be able to tell from those headers which
functions have what vectorized versions. That is, if a glibc version is
released where _Pragma ("omp declare simd") is used with a function that
only has an AVX2 vectorized version, no past or future GCC version can
interpret that pragma as meaning that any version other than AVX2 is
available (it must be possible to use any installed glibc headers with
both past and future compilers).
* Similarly, we need to handle different architectures having different
sets of functions vectorized and possibly not having the same set of
vector ISAs for each function. Maybe this suggests having an
architecture-specific bits/ header that, for each function that might be
vectorized, defines a macro to tell the compiler what vector versions are
available, for example:
#define __DECL_SIMD_COS_FLOAT /* empty */
#define __DECL_SIMD_COS_DOUBLE __DECL_SIMD_AVX2
#define __DECL_SIMD_COS_LONG_DOUBLE /* empty */
where the declaration of cos automatically uses __DECL_SIMD_COS_DOUBLE,
and __DECL_SIMD_AVX2 expands to a directive whose semantics are agreed
with compilers to mean that an AVX2 vectorized version of the function is
available (but other vectorized versions may not be).
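A minimal sketch of how the per-function macro scheme and the errno
question could fit together in one architecture-specific bits/ header.
Several things here are assumptions for illustration only and are not
part of the patch or of any agreed design: __DECL_SIMD_AVX2 expands to
nothing because the directive it should carry has yet to be agreed with
compilers, GCC's predefined __FAST_MATH__ is used as the gating
condition, and the sketch_cos* names and SKETCH_COS_DOUBLE_VECTORIZED
are hypothetical.

```c
/* Sketch of an architecture-specific bits/ header (hypothetical).  */

/* Placeholder: stands in for a directive, still to be agreed with
   compilers, meaning "an AVX2 vector version exists, others may not".
   It expands to nothing here so that the sketch compiles anywhere.  */
#define __DECL_SIMD_AVX2 /* agreed AVX2-only directive would go here */

/* Assumption: advertise vector versions only under -ffast-math, which
   implies -fno-math-errno; GCC predefines __FAST_MATH__ in that mode.  */
#if defined __x86_64__ && defined __FAST_MATH__
# define __DECL_SIMD_COS_DOUBLE __DECL_SIMD_AVX2
# define SKETCH_COS_DOUBLE_VECTORIZED 1
#else
# define __DECL_SIMD_COS_DOUBLE /* empty */
# define SKETCH_COS_DOUBLE_VECTORIZED 0
#endif
#define __DECL_SIMD_COS_FLOAT /* empty */
#define __DECL_SIMD_COS_LONG_DOUBLE /* empty */

/* What <math.h> would then do: each declaration picks up its own macro,
   so a compiler can tell per function and per type what is available.  */
__DECL_SIMD_COS_FLOAT extern float sketch_cosf (float x);
__DECL_SIMD_COS_DOUBLE extern double sketch_cos (double x);
__DECL_SIMD_COS_LONG_DOUBLE extern long double sketch_cosl (long double x);
```

With this layout, another architecture or a later glibc version only
has to change the macro definitions in its own bits/ header; the
declarations themselves stay the same.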
Obviously new functions go at new symbol versions (so GLIBC_2.21 at
present, not GLIBC_2.2.5 with a completely inappropriate Versions comment
in your patch). I'd expect you to need appropriate section directives for
the data table you add to ensure it goes in read-only data, not writable.
And you shouldn't be adding a local PLT reference to cos; call an internal
symbol instead.
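The last two points can be illustrated in C rather than in the assembly
the patch uses. This is only a sketch under stated assumptions: the
table contents and the names sketch_table, __sketch_scale and
sketch_scale are hypothetical, and the visibility attribute is the
GCC/Clang extension, taken here as one way of keeping a library's
internal calls off the PLT.

```c
/* Read-only lookup table: `const` lets the compiler place it in a
   read-only section rather than in writable data.  The values are
   illustrative only.  */
static const double sketch_table[4] = { 1.0, 0.5, 0.25, 0.125 };

/* Internal entry point with hidden visibility: calls to it from inside
   the same shared library bind locally and need no PLT entry.  */
__attribute__ ((visibility ("hidden")))
double
__sketch_scale (double x, unsigned int i)
{
  return x * sketch_table[i & 3];
}

/* Public symbol, a thin wrapper around the internal one.  */
double
sketch_scale (double x, unsigned int i)
{
  return __sketch_scale (x, i);
}
```

In hand-written assembly the equivalent steps would be an explicit
read-only section directive for the table and a call to the internal
name rather than to the exported one.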
Joseph S. Myers