This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/2] PPC64: Add libmvec SIMD single-precision power function.
- From: Tulio Magno Quites Machado Filho <tuliom at ascii dot art dot br>
- To: Shawn Landden <shawn at git dot icu>, libc-alpha at sourceware dot org
- Cc: Joseph Myers <joseph at codesourcery dot com>
- Cc:
- Date: Wed, 03 Jul 2019 16:08:50 -0300
- Subject: Re: [PATCH 1/2] PPC64: Add libmvec SIMD single-precision power function.
- References: <20190625175302.26676-1-shawn@git.icu>
Shawn Landden <shawn@git.icu> writes:
> 2019-05-11 Shawn Landden <shawn@git.icu>
>
> [BZ #24210]
> * NEWS: Noted the addition of PPC64 vector powf function.
> * sysdeps/powerpc/bits/math-vector.h: Added entry for vector powf.
> * sysdeps/powerpc/powerpc64/fpu/Versions: Added vector powf entry.
> * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
> (libmvec-sysdep_routines, CFLAGS-vec_s_powf4_vsx.c):
> (CFLAGS-s_powf_log2_data.c): Added build of VSX SIMD powf
> function and tests.
> * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c: expanded
> to include all floating point exceptions.
> * sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c:
> Added entry for vector powf.
> * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_powf4_vsx.c: New file.
> * sysdeps/powerpc/powerpc64/fpu/multiarch/s_powf_log2_data.c: Likewise.
> * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: SIMD powf
> added.
Replace 8 spaces with tabs
> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
> index dd982bb068..25d29b9a4a 100644
> --- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
> @@ -49,6 +49,7 @@ libmvec-sysdep_routines += vec_d_cos2_vsx vec_s_cosf4_vsx \
> vec_d_log2_vsx vec_d_log_data \
> vec_s_logf4_vsx vec_s_logf_data \
> vec_s_expf4_vsx vec_s_exp2f_data \
> + vec_s_powf4_vsx s_powf_log2_data \
Wrong indentation. Need to replace 8 spaces with a tab.
I also suggest to replace s_powf_log2_data with e_powf_log2_data.
More information about this later.
> vec_math_errf vec_math_err \
> vec_d_exp2_vsx vec_d_exp_data \
> vec_d_sincos2_vsx vec_s_sincosf4_vsx
> @@ -67,6 +68,7 @@ CFLAGS-vec_s_expf4_vsx.c += -mabi=altivec -maltivec -mvsx
> CFLAGS-vec_s_exp2f_data.c += -mabi=altivec -maltivec -mvsx
> CFLAGS-vec_d_exp2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
> CFLAGS-vec_d_exp_data.c += -mabi=altivec -maltivec -mvsx
> +CFLAGS-vec_s_powf4_vsx.c += -mabi=altivec -maltivec -mvsx
This file uses 64-bit vector built-ins that are only available on POWER8.
So, it also requires -mpower8-vector.
> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_powf_log2_data.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/s_powf_log2_data.c
> new file mode 100644
> index 0000000000..542cde0fb1
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/s_powf_log2_data.c
> @@ -0,0 +1,45 @@
> +/* Data definition for powf.
> + Copyright (C) 2017-2019 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +/* This file is identical to sysdeps/ieee754/flt-32/e_powf_log2_data.c */
If both files are identical, can't we reuse e_powf_log2_data.c without
duplicating the file?
That's why I suggested to use it in the Makefile earlier in this message.
> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_powf4_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_powf4_vsx.c
> new file mode 100644
> index 0000000000..1ed9e1b450
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_powf4_vsx.c
> @@ -0,0 +1,311 @@
> ...
> +/* Subnormal input is normalized so ix has negative biased exponent.
> + Output is multiplied by N (POWF_SCALE) if TOINT_INTRINICS is set. */
> +struct two_v_doubles {
> + vector double l;
> + vector double r;
> +};
> +
> +static inline struct two_v_doubles log2_inline(vector unsigned ix)
Split this in 2 lines, i.e.:
static inline struct two_v_doubles
log2_inline (vector unsigned ix)
This needs to be fixed for other functions in this file too.
> +/* The output of log2 and thus the input of exp2 is either scaled by N
> + (in case of fast toint intrinsics) or not. The unscaled xd must be
> + in [-1021,1023], sign_bias sets the sign of the result. */
> +static inline vector float exp2_inline (vector double xdl, vector double xdr, vector unsigned sign_bias)
Likewise.
> + // TODO check if these need to be reversed on big-endian
> + sign_biasl = (v64u)vec_mergeh (sign_bias, zero);
> + sign_biasr = (v64u)vec_mergel (sign_bias, zero);
This could be causing the issue I'm seeing on big endian:
testing float (vector length 4)
Failure: Test: pow_vlen4 (-0x1p+0, -0xf.fffffp+20)
Result:
is: 1.00000000e+00 0x1.000000p+0
should be: -1.00000000e+00 -0x1.000000p+0
difference: 2.00000000e+00 0x1.000000p+1
ulp : 16777216.0000
max.ulp : 0.0000
Failure: Test: pow_vlen4 (-0x1p+0, 0xf.fffffp+20)
Result:
is: 1.00000000e+00 0x1.000000p+0
should be: -1.00000000e+00 -0x1.000000p+0
difference: 2.00000000e+00 0x1.000000p+1
ulp : 16777216.0000
max.ulp : 0.0000
Failure: Test: pow_vlen4 (-0x2.f58f8p+4, 0x1.7p+4)
Result:
is: 3.40282347e+38 0x1.fffffep+127
should be: -3.40282347e+38 -0x1.fffffep+127
difference: inf inf
ulp : inf
max.ulp : 0.0000
...
Test suite completed:
894 test cases plus 0 tests for exception flags and
0 tests for errno executed.
8 errors occurred.
> +/* Returns 0 if not int, 1 if odd int, 2 if even int. The argument is
> + the bit representation of a non-zero finite floating-point value. */
> +static inline vector unsigned checkint(vector unsigned iy)
Likewise.
> + not_matched &= ~is_second;
> + res = vec_sel (res, zero + 2, is_second);
> + vector unsigned is_third = (vector unsigned)vec_cmpne (iy & ((1 << (0x7f + 23 - e)) - 1), zero);
vec_cmpne is only available on GCC >= 7.
Don't worry, though. I already have a fix that I plan to merge soon.
--
Tulio Magno