This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- From: Andi Kleen <andi at firstfloor dot org>
- To: Rich Felker <dalias at libc dot org>
- Cc: Matthew Fortune <Matthew dot Fortune at imgtec dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>, libc-alpha <libc-alpha at sourceware dot org>, "igor dot zamyatin\ at intel dot com" <igor dot zamyatin at intel dot com>, "Melik-Adamyan\, Areg" <areg dot melik-adamyan at intel dot com>, "jakub\ at redhat dot com" <jakub at redhat dot com>
- Date: Thu, 11 Sep 2014 22:33:41 -0700
- Subject: Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc
- References: <CAMXFM3t=ppndDUBzHzSus7xyuF5hTaLFZ5b273jD39NtddSvsw at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1409101549490 dot 12853 at digraph dot polyomino dot org dot uk> <6D39441BF12EF246A7ABCE6654B0235320F09D65 at LEMAIL01 dot le dot imgtec dot org> <20140911210246 dot GN23797 at brightrain dot aerifal dot cx>
Rich Felker <dalias@libc.org> writes:
>
> This really seems like something the compiler should be doing --
> translating parallelizable calls to the standard math functions into
> calls to special simd versions (
Of course gcc already supports that, even in two different flavours
(the SVML ABI and the ACML ABI). I'm not sure why the patch doesn't
implement one of those ABIs, though. From the GCC manual:
-mveclibabi=type
    Specifies the ABI type to use for vectorizing intrinsics using an
    external library. Supported values for type are svml for the Intel
    short vector math library and acml for the AMD math core library.
    To use this option, both -ftree-vectorize and
    -funsafe-math-optimizations have to be enabled, and an SVML or ACML
    ABI-compatible library must be specified at link time.

    GCC currently emits calls to "vmldExp2", "vmldLn2", "vmldLog102",
    "vmldPow2", "vmldTanh2", "vmldTan2", "vmldAtan2", "vmldAtanh2",
    "vmldCbrt2", "vmldSinh2", "vmldSin2", "vmldAsinh2", "vmldAsin2",
    "vmldCosh2", "vmldCos2", "vmldAcosh2", "vmldAcos2", "vmlsExp4",
    "vmlsLn4", "vmlsLog104", "vmlsPow4", "vmlsTanh4", "vmlsTan4",
    "vmlsAtan4", "vmlsAtanh4", "vmlsCbrt4", "vmlsSinh4", "vmlsSin4",
    "vmlsAsinh4", "vmlsAsin4", "vmlsCosh4", "vmlsCos4", "vmlsAcosh4"
    and "vmlsAcos4" for the corresponding function type when
    -mveclibabi=svml is used, and "__vrd2_sin", "__vrd2_cos",
    "__vrd2_exp", "__vrd2_log", "__vrd2_log2", "__vrd2_log10",
    "__vrs4_sinf", "__vrs4_cosf", "__vrs4_expf", "__vrs4_logf",
    "__vrs4_log2f", "__vrs4_log10f" and "__vrs4_powf" for the
    corresponding function type when -mveclibabi=acml is used.
-Andi
--
ak@linux.intel.com -- Speaking for myself only