This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- Cc: "H.J. Lu" <hjl dot tools at gmail dot com>, "Carlos O'Donell" <carlos at redhat dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, libc-alpha <libc-alpha at sourceware dot org>, "Zamyatin, Igor" <igor dot zamyatin at intel dot com>, "Melik-Adamyan, Areg" <areg dot melik-adamyan at intel dot com>
- Date: Wed, 17 Sep 2014 12:08:49 +0200
- Subject: Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Sep 17, 2014 at 01:56:06PM +0400, Andrew Senkevich wrote:
> > The wiki says:
> >
> > 3.1. Goal
> >
> > Main goal is to improve vectorization of GCC with OpenMP4.0 SIMD
> > constructs (#2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf)
> > and Cilk Plus constructs (6-7 in
> > http://www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm)
> > on x86_64 by adding SSE4, AVX and AVX2 vector implementations of
> > several vector math functions (float and double versions). AVX-512
> > versions are planned to be added later. These functions can also be
> > used manually (with intrinsics) by developers to obtain speedup.
> >
> > It is the opposite of
> >
> > https://sourceware.org/ml/libc-alpha/2014-09/msg00277.html
> >
> > which is for programmers to use them directly in their
> > applications, mostly independent of compilers.
> >
> > We need to come to an agreement on what the goal is first.
> >
> > --
> > H.J.
>
> Hi H.J.,
>
> of course the first goal is to improve vectorization. Usage with
> intrinsics is an additional goal and is not very significant.
>
> Attached is the first patch, corrected according to the last comments in
> https://sourceware.org/ml/libc-alpha/2014-09/msg00182.html.
--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -46,6 +46,17 @@
# error "Never include <bits/mathcalls.h> directly; include <math.h> instead."
#endif
+#undef __DECL_SIMD
+
+/* For now we have vectorized version only for _Mdouble_ case */
+#if !defined _Mfloat_ && !defined _Mlong_double_
+# if defined _OPENMP && _OPENMP >= 201307
+# define __DECL_SIMD _Pragma ("omp declare simd")
As the function is provided only on x86_64, this needs to be guarded
by defined __x86_64__ too (or there needs to be some way for arch-specific
headers to say which functions are elemental).
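For illustration only, a guarded definition along these lines might look
like the sketch below. DEMO_DECL_SIMD and demo_elemental are hypothetical
names, not what the patch uses, and the clause shown is the notinbranch
form raised in the next remark. Without -fopenmp, or on another
architecture, the macro expands to nothing and the declaration is plain.

```c
/* Sketch only: DEMO_DECL_SIMD and demo_elemental are hypothetical names
   used for illustration; they are not part of the patch or of glibc.  */
#if defined __x86_64__ && defined _OPENMP && _OPENMP >= 201307
/* Only advertise SIMD variants where they are actually provided.  */
# define DEMO_DECL_SIMD _Pragma ("omp declare simd notinbranch")
#else
/* Other targets, or no OpenMP 4.0: an ordinary scalar declaration.  */
# define DEMO_DECL_SIMD
#endif

DEMO_DECL_SIMD extern double demo_elemental (double x);

/* Trivial scalar body so the sketch is self-contained.  */
double
demo_elemental (double x)
{
  return x + x;
}
```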
Also, only the N (notinbranch) version is provided, so you'd
need to use "omp declare simd notinbranch". Furthermore, only
the AVX2 version is provided, which is not enough for gcc:
you need all of the SSE2, AVX and AVX2 versions. The other two can be
thunks (extract the arguments and call cos in a loop or similar, then
pass the result back in a vector register).
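A rough sketch of what such a thunk could look like, using the GCC vector
extension for a two-lane double vector. All names here (v2df, demo_scalar,
demo_vec2_thunk) are hypothetical. A real variant would carry the vector
ABI mangled name and call cos; demo_scalar is only a stand-in so the
sketch does not depend on libm.

```c
/* Sketch only: v2df, demo_scalar and demo_vec2_thunk are hypothetical
   names for illustration, not glibc symbols.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Stand-in for the scalar routine (cos in the discussion above).  */
static double
demo_scalar (double x)
{
  return x * 0.5 + 1.0;
}

/* Thunk: extract each lane, call the scalar routine on it, and put the
   result back into a vector that is returned in a vector register.  */
v2df
demo_vec2_thunk (v2df x)
{
  v2df r = { 0.0, 0.0 };
  for (int i = 0; i < 2; i++)
    r[i] = demo_scalar (x[i]);
  return r;
}
```

A thunk like this gives no speedup by itself; it only makes sure the symbol
exists for every vector variant the compiler may emit a call to.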
Jakub