This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc


On Wed, Sep 17, 2014 at 01:56:06PM +0400, Andrew Senkevich wrote:
> > The wiki says:
> >
> > 3.1. Goal
> >
> > Main goal is to improve vectorization of GCC with OpenMP4.0 SIMD
> > constructs (#2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf
> > and Cilk Plus constructs (6-7 in
> > http://www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm)
> > on x86_64 by adding SSE4, AVX and AVX2 vector implementations of
> > several vector math functions (float and double versions). AVX-512
> > versions are planned to be added later. These functions can be also
> > used manually (with intrincics) by developers to obtain speedup.
> >
> > It is the opposite of
> >
> > https://sourceware.org/ml/libc-alpha/2014-09/msg00277.html
> >
> > which is for programmers to use them directly in their
> > applications, mostly independent of compilers.
> >
> > We need to come to an agreement on what goal is first.
> >
> > --
> > H.J.
> 
> Hi H.J.,
> 
> of course the first goal is to improve vectorization. Usage with
> intrinsics is additional goal and is not very significant.
> 
> Attached first patch corrected according last comments in
> https://sourceware.org/ml/libc-alpha/2014-09/msg00182.html.

--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -46,6 +46,17 @@
 # error "Never include <bits/mathcalls.h> directly; include <math.h> instead."
 #endif
 
+#undef __DECL_SIMD
+
+/* For now we have vectorized version only for _Mdouble_ case */
+#if !defined _Mfloat_ && !defined _Mlong_double_
+# if defined _OPENMP && _OPENMP >= 201307
+#  define __DECL_SIMD _Pragma ("omp declare simd")

As the function is provided only on x86_64, it needs to be guarded
by defined __x86_64__ too (or have some way how arch specific
headers can tell what function are elemental).

Also, only the N (notinbranch) version is provided, so you'd
need to use "omp declare simd notinbranch", and furthermore only
the AVX2 version is provided (that is not possible for gcc,
you need all of SSE2, AVX and AVX2 versions, the other two can be
thunked (extract arguments and call cos in a loop or similarly, then
pass result in vector reg again).

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]