This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] How to add vector math functions to Glibc


On Thu, 23 Oct 2014, Andrew Senkevich wrote:

> Let me know if such changes ok in general.

I'm not clear we yet reached consensus on whether glibc is the right place 
for this; I think that discussion tailed off without a clear conclusion, 
and someone needs to reread it, post a careful analysis of the discussion 
so far and try to help the community reach consensus.

Regarding the specific patch:

> +	      [Enable building and installing mathvec @<:@default=yes on x86_64 build, else default=no@:>@])],

I don't think the help text in an architecture-independent file should 
refer to specific architectures like this; just say "default depends on 
architecture" or similar.

> +ifeq ($(build-mathvect),yes)
> +# We need to install libm.so as linker script
> +# for more comfortable use of vector math library.
> +subdir_install: $(inst_libdir)/libm.so.tmp
> +$(inst_libdir)/libm.so.tmp: $(common-objpfx)format.lds \
> +	$(common-objpfx)math/libm.so$(libm.so-version) \
> +	$(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
> +	$(+force)
> +	(echo '/* GNU ld script */';\
> +	cat $<; \
> +	echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
> +	'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
> +	) > $@
> +	mv -f $@ $(inst_libdir)/libm.so # TODO do it somehow after all other
> +endif

Clearly it's necessary to resolve how to disable the normal installation 
rule for libm.so so it can be cleanly replaced by this new one.

> diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
> index 8a94a7e..2d31a11 100644
> --- a/math/bits/mathcalls.h
> +++ b/math/bits/mathcalls.h
> @@ -60,6 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x));
>  __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
>  
>  /* Cosine of X.  */
> +#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos
> +__DECL_SIMD_cos
> +#endif
> +#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf
> +__DECL_SIMD_cosf
> +#endif
> +#if defined _Mlong_double_ && defined __DECL_SIMD_cosl
> +__DECL_SIMD_cosl
> +#endif
>  __MATHCALL (cos,, (_Mdouble_ __x));

As previously noted, I think it would be much better if the definition of 
__MATHCALL can include all the conditional bits (possibly through a 
generated header that defines __DECL_SIMD_cos etc. to empty if not defined 
by bits/math-vector.h).

> diff --git a/math/have_vector.h b/math/have_vector.h
> new file mode 100644
> index 0000000..94aacf0
> --- /dev/null
> +++ b/math/have_vector.h
> @@ -0,0 +1,2574 @@
> +/* 
> +Definitions below are generated with the following bash script:
> +for func in $(grep ALL_RM_TEST math/libm-test.inc | awk {'print $2'} | sed -e "s/(//" -e "s/,//"); do 

Rather than having such a file checked in, makefile rules / scripts to 
generate it at test time should be checked in.

> +static int avx2_usable;		/* Set to 1 if AVX2 supported */

Given that we expect multiple architectures to have vector functions, 
this belongs in some architecture-specific file that libm-test.inc 
includes, rather than directly in libm-test.inc (which shouldn't refer 
directly to AVX at all).

> -#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED,			\
> -		     EXCEPTIONS)					\
> -  do									\
> -    if (enable_test (EXCEPTIONS))					\
> -      {									\
> -	COMMON_TEST_SETUP (ARG_STR);					\
> -	check_float (test_name, FUNC (FUNC_NAME) (ARG), EXPECTED,	\
> -		     EXCEPTIONS);					\
> -	COMMON_TEST_CLEANUP;						\
> -      }									\
> +#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED,				\
> +		     EXCEPTIONS)						\
> +  do										\
> +    if (enable_test (EXCEPTIONS))						\
> +      {										\
> +	COMMON_TEST_SETUP (ARG_STR);						\
> +	check_float (test_name,							\
> +		     CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )),	\
> +			     FUNC (FUNC_NAME)) (ARG),				\
> +			     EXPECTED,						\
> +		     EXCEPTIONS);						\
> +	COMMON_TEST_CLEANUP;							\
> +      }										\

I think it would be better for FUNC to be defined, in the test file that 
includes libm-test.inc, in a way that avoids the need for the CONCAT* 
calls here.  (To avoid warnings / errors about undeclared functions, I 
suppose the generated header might then need to redefine e.g. vec_sin to 
sin if there isn't a vector version of sin.)

> +#if defined __x86_64__ && defined __FAST_MATH__
> +# if defined _OPENMP && _OPENMP >= 201307
> +/* OpenMP case. */
> +#  define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
> +#  define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")

Of course we still need the API/ABI documentation providing the stable 
guarantee about exactly what this pragma means regarding the function 
versions it is saying are available in glibc.

> +#  define __DECL_SIMD_cos  __DECL_SIMD_AVX2
> +#  define __DECL_SIMD_cosf __DECL_SIMD_SSE4
> +# elif defined _CILKPLUS && _CILKPLUS >= 0
> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */
> +#  define __DECL_SIMD_AVX2 __attribute__((__vector__(nomask)))
> +#  define __DECL_SIMD_SSE4 __attribute__((__vector__(processor(core_i7_sse4_2),\
> +						     nomask)))

And as previously noted, this needs to be fixed to be namespace-clean - 
using __nomask__, __processor__, __core_i7_sse4_2__.

> +#if defined TEST_MATHVEC

No, you can't have such conditionals on a macro in the user's namespace in 
an installed header.

> diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac
> index c9f9a51..91c4cdf 100644
> --- a/sysdeps/x86_64/configure.ac
> +++ b/sysdeps/x86_64/configure.ac
> @@ -5,6 +5,24 @@ AC_CHECK_HEADER([cpuid.h], ,
>    [AC_MSG_ERROR([gcc must provide the <cpuid.h> header])],
>    [/* No default includes.  */])
>  
> +dnl Check if compiler target is x86_64.

Not needed.  preconfigure fragments in sysdeps directories need to check 
the architecture, but configure ones don't (they'll only be run for the 
relevant architecture, unless one fragment explicitly sources another).

> diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile
> new file mode 100644
> index 0000000..d585fa0
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/Makefile
> @@ -0,0 +1,33 @@
> +ifeq ($(subdir),mathvec)
> +libmvec-support += svml_d_cos4_core svml_d_cos_data
> +endif
> +
> +# Rules for libmvec tests
> +ifeq ($(subdir),math)
> +ifneq ($(PERL),no)
> +ifeq ($(build-mathvec),yes)
> +libm-tests += test-double-vlen4 test-float-vlen8
> +
> +CFLAGS-test-double-vlen4-wrapper.c = -fno-inline -ffloat-store -fno-builtin \
> +				     -frounding-math -mavx2
> +CFLAGS-test-float-vlen8-wrapper.c = -fno-inline -ffloat-store -fno-builtin \
> +				    -frounding-math -mavx2

I think the sysdeps makefile should actually just define that double-vlen4 
and float-vlen8 are the vector lengths for which testing should take 
place, with all the other testing rules being arranged in an 
architecture-independent way.

> +/* General constants:
> + * lAbsMask
> + */
> +	.long	0xffffffff
> +	.long	0x7fffffff

My previous point from 
<https://sourceware.org/ml/libc-alpha/2014-10/msg00324.html> still applies 
about how to make these tables more readable (one line per "double" 
constant, more explicitly say what the constants are) and ensure that the 
offsets in the tables are directly linked to the offsets used in the 
function implementation.

> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c
> new file mode 100644
> index 0000000..0778e23
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c

This file may well need to be architecture-specific, at least as written, 
but ...

> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c
> new file mode 100644
> index 0000000..4d3d9a3
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/test-double-vlen4.c

 ... it's not at all clear that this one should need to be.  At present it 
has some architecture-specific bits

> +#define CHECK_ARCH_EXT if (!avx2_usable) return;
> +
> +extern FLOAT WRAPPER_NAME (cos) (FLOAT);

but I'd think those are all that needs to go somewhere 
architecture-specific and the rest is pretty generic to any architecture 
supporting vector functions for vectors of 4 doubles.

-- 
Joseph S. Myers
joseph@codesourcery.com


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]