This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] How to add vector math functions to Glibc
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- Cc: libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 23 Oct 2014 21:37:40 +0000
- Subject: Re: [RFC] How to add vector math functions to Glibc
- Authentication-results: sourceware.org; auth=none
- References: <CAMXFM3tjquzniXP1weqxSVFJyhXqsf2PHuyrrrmqp7K0ZzORqA at mail dot gmail dot com> <54257507 dot 9070508 at redhat dot com> <CAMXFM3vOLspQtHxgJfD_Emht480w2RMbiwnEH6A_LhoS-JZFag at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1409301620020 dot 15186 at digraph dot polyomino dot org dot uk> <542AF92E dot 8090708 at lip6 dot fr> <Pine dot LNX dot 4 dot 64 dot 1409302003410 dot 12188 at digraph dot polyomino dot org dot uk> <CAMXFM3tuM_p6Acp4hzoQ2xzR=4BZqtw8NbezqY6h8V4Xx=5hUA at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1410021411420 dot 24886 at digraph dot polyomino dot org dot uk> <CAMXFM3uPiuJvSpgmt+8d0B1qh3QSA=TVx0ZExfojDVHzrscL8A at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1410091724031 dot 23641 at digraph dot polyomino dot org dot uk> <CAMXFM3uGOKqEAvGYew+9K7bmhObVmnP2u4kUOSh8_Cpwyk8s5g at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1410162141290 dot 32736 at digraph dot polyomino dot org dot uk> <CAMXFM3tf3EE-WenvnVohkFCkHR9Dy4QfaSiSHcw6Q-1o9QGW_Q at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1410211517340 dot 31277 at digraph dot polyomino dot org dot uk> <CAMXFM3t8xjQ7A6TF1SpY0GwzKpTQ9hZej4vKvSYHnnsxGkaONA at mail dot gmail dot com>
On Thu, 23 Oct 2014, Andrew Senkevich wrote:
> Let me know if such changes ok in general.
I'm not clear we yet reached consensus on whether glibc is the right place
for this; I think that discussion tailed off without a clear conclusion,
and someone needs to reread it, post a careful analysis of the discussion
so far and try to help the community reach consensus.
Regarding the specific patch:
> + [Enable building and installing mathvec @<:@default=yes on x86_64 build, else default=no@:>@])],
I don't think the help text in an architecture-independent file should
refer to specific architectures like this; just say "default depends on
architecture" or similar.
> +ifeq ($(build-mathvect),yes)
> +# We need to install libm.so as linker script
> +# for more comfortable use of vector math library.
> +subdir_install: $(inst_libdir)/libm.so.tmp
> +$(inst_libdir)/libm.so.tmp: $(common-objpfx)format.lds \
> + $(common-objpfx)math/libm.so$(libm.so-version) \
> + $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
> + $(+force)
> + (echo '/* GNU ld script */';\
> + cat $<; \
> + echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
> + 'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
> + ) > $@
> + mv -f $@ $(inst_libdir)/libm.so # TODO do it somehow after all other
> +endif
Clearly it's necessary to resolve how to disable the normal installation
rule for libm.so so it can be cleanly replaced by this new one.
> diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
> index 8a94a7e..2d31a11 100644
> --- a/math/bits/mathcalls.h
> +++ b/math/bits/mathcalls.h
> @@ -60,6 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x));
> __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
>
> /* Cosine of X. */
> +#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos
> +__DECL_SIMD_cos
> +#endif
> +#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf
> +__DECL_SIMD_cosf
> +#endif
> +#if defined _Mlong_double_ && defined __DECL_SIMD_cosl
> +__DECL_SIMD_cosl
> +#endif
> __MATHCALL (cos,, (_Mdouble_ __x));
As previously noted, I think it would be much better if the definition of
__MATHCALL can include all the conditional bits (possibly through a
generated header that defines __DECL_SIMD_cos etc. to empty if not defined
by bits/math-vector.h).
> diff --git a/math/have_vector.h b/math/have_vector.h
> new file mode 100644
> index 0000000..94aacf0
> --- /dev/null
> +++ b/math/have_vector.h
> @@ -0,0 +1,2574 @@
> +/*
> +Definitions below are generated with the following bash script:
> +for func in $(grep ALL_RM_TEST math/libm-test.inc | awk {'print $2'} | sed -e "s/(//" -e "s/,//"); do
Rather than having such a file checked in, makefile rules / scripts to
generate it at test time should be checked in.
> +static int avx2_usable; /* Set to 1 if AVX2 supported */
Given that we expect multiple architectures to have vector functions,
this belongs in some architecture-specific file that libm-test.inc
includes, rather than directly in libm-test.inc (which shouldn't refer
directly to AVX at all).
> -#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \
> - EXCEPTIONS) \
> - do \
> - if (enable_test (EXCEPTIONS)) \
> - { \
> - COMMON_TEST_SETUP (ARG_STR); \
> - check_float (test_name, FUNC (FUNC_NAME) (ARG), EXPECTED, \
> - EXCEPTIONS); \
> - COMMON_TEST_CLEANUP; \
> - } \
> +#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \
> + EXCEPTIONS) \
> + do \
> + if (enable_test (EXCEPTIONS)) \
> + { \
> + COMMON_TEST_SETUP (ARG_STR); \
> + check_float (test_name, \
> + CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), \
> + FUNC (FUNC_NAME)) (ARG), \
> + EXPECTED, \
> + EXCEPTIONS); \
> + COMMON_TEST_CLEANUP; \
> + } \
I think it would be better for FUNC to be defined, in the test file that
includes libm-test.inc, in a way that avoids the need for the CONCAT*
calls here. (To avoid warnings / errors about undeclared functions, I
suppose the generated header might then need to redefine e.g. vec_sin to
sin if there isn't a vector version of sin.)
> +#if defined __x86_64__ && defined __FAST_MATH__
> +# if defined _OPENMP && _OPENMP >= 201307
> +/* OpenMP case. */
> +# define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
> +# define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")
Of course we still need the API/ABI documentation providing the stable
guarantee about exactly what this pragma means regarding the function
versions it is saying are available in glibc.
> +# define __DECL_SIMD_cos __DECL_SIMD_AVX2
> +# define __DECL_SIMD_cosf __DECL_SIMD_SSE4
> +# elif defined _CILKPLUS && _CILKPLUS >= 0
> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */
> +# define __DECL_SIMD_AVX2 __attribute__((__vector__(nomask)))
> +# define __DECL_SIMD_SSE4 __attribute__((__vector__(processor(core_i7_sse4_2),\
> + nomask)))
And as previously noted, this needs to be fixed to be namespace-clean -
using __nomask__, __processor__, __core_i7_sse4_2__.
> +#if defined TEST_MATHVEC
No, you can't have such conditionals on a macro in the user's namespace in
an installed header.
> diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac
> index c9f9a51..91c4cdf 100644
> --- a/sysdeps/x86_64/configure.ac
> +++ b/sysdeps/x86_64/configure.ac
> @@ -5,6 +5,24 @@ AC_CHECK_HEADER([cpuid.h], ,
> [AC_MSG_ERROR([gcc must provide the <cpuid.h> header])],
> [/* No default includes. */])
>
> +dnl Check if compiler target is x86_64.
Not needed. preconfigure fragments in sysdeps directories need to check
the architecture, but configure ones don't (they'll only be run for the
relevant architecture, unless one fragment explicitly sources another).
> diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile
> new file mode 100644
> index 0000000..d585fa0
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/Makefile
> @@ -0,0 +1,33 @@
> +ifeq ($(subdir),mathvec)
> +libmvec-support += svml_d_cos4_core svml_d_cos_data
> +endif
> +
> +# Rules for libmvec tests
> +ifeq ($(subdir),math)
> +ifneq ($(PERL),no)
> +ifeq ($(build-mathvec),yes)
> +libm-tests += test-double-vlen4 test-float-vlen8
> +
> +CFLAGS-test-double-vlen4-wrapper.c = -fno-inline -ffloat-store -fno-builtin \
> + -frounding-math -mavx2
> +CFLAGS-test-float-vlen8-wrapper.c = -fno-inline -ffloat-store -fno-builtin \
> + -frounding-math -mavx2
I think the sysdeps makefile should actually just define that double-vlen4
and float-vlen8 are the vector lengths for which testing should take
place, with all the other testing rules being arranged in an
architecture-independent way.
> +/* General constants:
> + * lAbsMask
> + */
> + .long 0xffffffff
> + .long 0x7fffffff
My previous point from
<https://sourceware.org/ml/libc-alpha/2014-10/msg00324.html> still applies
about how to make these tables more readable (one line per "double"
constant, more explicitly say what the constants are) and ensure that the
offsets in the tables are directly linked to the offsets used in the
function implementation.
> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c
> new file mode 100644
> index 0000000..0778e23
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c
This file may well need to be architecture-specific, at least as written,
but ...
> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c
> new file mode 100644
> index 0000000..4d3d9a3
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/test-double-vlen4.c
... it's not at all clear that this one should need to be. At present it
has some architecture-specific bits
> +#define CHECK_ARCH_EXT if (!avx2_usable) return;
> +
> +extern FLOAT WRAPPER_NAME (cos) (FLOAT);
but I'd think those are all that needs to go somewhere
architecture-specific and the rest is pretty generic to any architecture
supporting vector functions for vectors of 4 doubles.
--
Joseph S. Myers
joseph@codesourcery.com