This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH] aarch64: Implement math acceleration via builtins

From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
To: libc-alpha at sourceware dot org
Date: Fri, 13 Oct 2017 19:36:54 -0300
Subject: Re: [PATCH] aarch64: Implement math acceleration via builtins
Authentication-results: sourceware.org; auth=none
References: <HE1PR0802MB2377FB89896B3EEAD493799595480@HE1PR0802MB2377.eurprd08.prod.outlook.com>

On 13/10/2017 18:26, Michael Collison wrote:
> This patch converts asm statements into builtin and neon intrinsics for AArch64.
> As an example for the file sysdeps/aarch64/fpu/_s_ceil.c, we convert the function from
> 
> double
> __ceil (double x)
> {
>   double result;
>   asm ("frintp\t%d0, %d1" :
>        "=w" (result) : "w" (x) );
>   return result;
> }
> 
> into
> 
> double
> __ceil (double x)
> {
>   return __builtin_ceil (x);
> }
> 
> Tested on aarch64-linux-gnu with gcc-4.9.4 and gcc-6. Okay for trunk?
> 
> 2017-10-13  Michael Collison  <michael.collison@arm.com>
> 
> 	* sysdeps/aarch64/fpu/e_sqrt.c (ieee754_sqrt): Replace
> 	asm statements with __builtin_sqrt.
> 	* sysdeps/aarch64/fpu/e_sqrtf.c (ieee754_sqrtf): Replace
> 	asm statements with __builtin_sqrtf.
> 	* sysdeps/aarch64/fpu/s_ceil.c (__ceil): Replace
> 	asm statements with __builtin_ceil.
> 	* sysdeps/aarch64/fpu/s_ceilf.c (__ceilf): Replace
> 	asm statements with __builtin_ceilf.
> 	* sysdeps/aarch64/fpu/s_floor.c (__floor): Replace
> 	asm statements with __builtin_floor.
> 	* sysdeps/aarch64/fpu/s_floorf.c (__floorf): Replace
> 	asm statements with __builtin_floorf.
> 	* sysdeps/aarch64/fpu/s_fma.c (__fma): Replace
> 	asm statements with __builtin_fma.
> 	* sysdeps/aarch64/fpu/s_fmaf.c (__fmaf): Replace
> 	asm statements with __builtin_fmaf.
> 	* sysdeps/aarch64/fpu/s_fmax.c (__fmax): Replace
> 	asm statements with __builtin_fmax.
> 	* sysdeps/aarch64/fpu/s_fmaxf.c (__fmaxf): Replace
> 	asm statements with __builtin_fmaxf.
> 	* sysdeps/aarch64/fpu/s_fmin.c (__fmin): Replace
> 	asm statements with __builtin_fmin.
> 	* sysdeps/aarch64/fpu/s_fminf.c (__fminf): Replace
> 	asm statements with __builtin_fminf.
> 	* sysdeps/aarch64/fpu/s_frint.c: Delete file.
> 	* sysdeps/aarch64/fpu/s_frintf.c: Delete file.
> 	* sysdeps/aarch64/fpu/s_llrint.c (__llrint): Replace
> 	asm statements with builtin and neon intrinsic.
> 	* sysdeps/aarch64/fpu/s_llrintf.c (__llrintf): Ditto.
> 	* sysdeps/aarch64/fpu/s_llround.c (__llround): Replace
> 	asm statements with neon intrinsic.
> 	* sysdeps/aarch64/fpu/s_llroundf.c (__llroundf): Ditto.
> 	* sysdeps/aarch64/fpu/s_lrint.c (__lrint): Replace
> 	asm statements with builtin and neon intrinsic.
> 	* sysdeps/aarch64/fpu/s_lrintf.c (__lrintf): Ditto.
> 	* sysdeps/aarch64/fpu/s_lround.c (__lround): Replace
> 	asm statements with neon intrinsic vcvtad_s64_f64.
> 	* sysdeps/aarch64/fpu/s_lroundf.c (__lroundf): Replace
> 	asm statements with neon intrinsic vcvtas_s32_f32.
> 	* sysdeps/aarch64/fpu/s_nearbyint.c (__nearbyint): Replace
> 	asm statements with __builtin_nearbyint.
> 	* sysdeps/aarch64/fpu/s_nearbyintf.c (__nearbyintf): Replace
> 	asm statements with __builtin_nearbyintf.
> 	* sysdeps/aarch64/fpu/s_rint.c (__rint): Replace
> 	asm statements with __builtin_rint.
> 	* sysdeps/aarch64/fpu/s_rintf.c (__rintf): Replace
> 	asm statements with __builtin_rintf.
> 	* sysdeps/aarch64/fpu/s_round.c (__round): Replace
> 	asm statements with __builtin_round.
> 	* sysdeps/aarch64/fpu/s_roundf.c (__roundf): Replace
> 	asm statements with __builtin_roundf.
> 	* sysdeps/aarch64/fpu/s_trunc.c (__trunc): Replace
> 	asm statements with __builtin_trunc.
> 	* sysdeps/aarch64/fpu/s_truncf.c (__truncf): Replace
> 	asm statements with __builtin_truncf.
> 	* sysdeps/aarch64/fpu/Makefile: Build e_sqrt[f].c with
> 	-fno-math-errno.

Looks good in general with some comments below.  Is there any advantage
of the refactor besides avoid inline assembly?

Since you refactoring it, I would also suggest to use builtin for the 
aarch64 match_private.h sqrt and get/set fpcr at fpu_control.h.

> 
> diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
> new file mode 100644
> index 0000000..4bd4e69
> --- /dev/null
> +++ b/sysdeps/aarch64/fpu/Makefile
> @@ -0,0 +1,4 @@
> +ifeq ($(subdir),math)
> +CFLAGS-e_sqrtf.c += -fno-math-errno
> +CFLAGS-e_sqrt.c += -fno-math-errno
> +endif
> diff --git a/sysdeps/aarch64/fpu/e_sqrt.c b/sysdeps/aarch64/fpu/e_sqrt.c
> index f984d87..b80ac27 100644
> --- a/sysdeps/aarch64/fpu/e_sqrt.c
> +++ b/sysdeps/aarch64/fpu/e_sqrt.c
> @@ -21,8 +21,6 @@
>  double
>  __ieee754_sqrt (double d)
>  {
> -  double res;
> -  asm ("fsqrt   %d0, %d1" : "=w" (res) : "w" (d));
> -  return res;
> +  return __builtin_sqrt (d);
>  }
>  strong_alias (__ieee754_sqrt, __sqrt_finite)
> diff --git a/sysdeps/aarch64/fpu/e_sqrtf.c b/sysdeps/aarch64/fpu/e_sqrtf.c
> index 67707ef..7380454 100644
> --- a/sysdeps/aarch64/fpu/e_sqrtf.c
> +++ b/sysdeps/aarch64/fpu/e_sqrtf.c
> @@ -21,8 +21,6 @@
>  float
>  __ieee754_sqrtf (float s)
>  {
> -  float res;
> -  asm ("fsqrt   %s0, %s1" : "=w" (res) : "w" (s));
> -  return res;
> +  return __builtin_sqrtf (s);
>  }
>  strong_alias (__ieee754_sqrtf, __sqrtf_finite)
> diff --git a/sysdeps/aarch64/fpu/s_ceil.c b/sysdeps/aarch64/fpu/s_ceil.c
> index d0a8bd8..bc90ab9 100644
> --- a/sysdeps/aarch64/fpu/s_ceil.c
> +++ b/sysdeps/aarch64/fpu/s_ceil.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define	FUNC ceil
> -#define INSN "frintp"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__ceil (double x)
> +{
> +  return __builtin_ceil (x);
> +}
> +
> +weak_alias (__ceil, ceil)
> diff --git a/sysdeps/aarch64/fpu/s_ceilf.c b/sysdeps/aarch64/fpu/s_ceilf.c
> index b9c2e7c..d5c4383 100644
> --- a/sysdeps/aarch64/fpu/s_ceilf.c
> +++ b/sysdeps/aarch64/fpu/s_ceilf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define	FUNC ceilf
> -#define INSN "frintp"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__ceilf (float x)
> +{
> +  return __builtin_ceilf (x);
> +}
> +
> +weak_alias (__ceilf, ceilf)
> diff --git a/sysdeps/aarch64/fpu/s_floor.c b/sysdeps/aarch64/fpu/s_floor.c
> index f7f8731..049535c 100644
> --- a/sysdeps/aarch64/fpu/s_floor.c
> +++ b/sysdeps/aarch64/fpu/s_floor.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC floor
> -#define INSN "frintm"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__floor (double x)
> +{
> +  return __builtin_floor (x);
> +}
> +
> +weak_alias (__floor, floor)
> diff --git a/sysdeps/aarch64/fpu/s_floorf.c b/sysdeps/aarch64/fpu/s_floorf.c
> index 7be63b5..fa6fa17 100644
> --- a/sysdeps/aarch64/fpu/s_floorf.c
> +++ b/sysdeps/aarch64/fpu/s_floorf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC floorf
> -#define INSN "frintm"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__floorf (float x)
> +{
> +  return __builtin_floorf (x);
> +}
> +
> +weak_alias (__floorf, floorf)
> diff --git a/sysdeps/aarch64/fpu/s_fma.c b/sysdeps/aarch64/fpu/s_fma.c
> index 6f62ce2..e496ec6 100644
> --- a/sysdeps/aarch64/fpu/s_fma.c
> +++ b/sysdeps/aarch64/fpu/s_fma.c
> @@ -18,28 +18,10 @@
>  
>  #include <math.h>
>  
> -#ifndef FUNC
> -# define FUNC fma
> -#endif
> -
> -#ifndef TYPE
> -# define TYPE double
> -# define REGS "d"
> -#else
> -# ifndef REGS
> -#  error REGS not defined
> -# endif
> -#endif
> -
> -#define __CONCATX(a,b) __CONCAT(a,b)
> -
> -TYPE
> -__CONCATX(__,FUNC) (TYPE x, TYPE y, TYPE z)
> +double
> +__fma (double x, double y, double z)
>  {
> -  TYPE result;
> -  asm ( "fmadd" "\t%" REGS "0, %" REGS "1, %" REGS "2, %" REGS "3"
> -        : "=w" (result) : "w" (x), "w" (y), "w" (z) );
> -  return result;
> +  return __builtin_fma (x, y, z);
>  }
>  
> -weak_alias (__CONCATX(__,FUNC), FUNC)
> +weak_alias (__fma, fma)
> diff --git a/sysdeps/aarch64/fpu/s_fmaf.c b/sysdeps/aarch64/fpu/s_fmaf.c
> index 880a22d..ff1abbf 100644
> --- a/sysdeps/aarch64/fpu/s_fmaf.c
> +++ b/sysdeps/aarch64/fpu/s_fmaf.c
> @@ -16,7 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC fmaf
> -#define TYPE float
> -#define REGS "s"
> -#include <s_fma.c>
> +#include <math.h>
> +
> +float
> +__fmaf (float x, float y, float z)
> +{
> +  return __builtin_fmaf (x, y, z);
> +}
> +
> +weak_alias (__fmaf, fmaf)
> diff --git a/sysdeps/aarch64/fpu/s_fmax.c b/sysdeps/aarch64/fpu/s_fmax.c
> index 395a9ba..d7a82f8 100644
> --- a/sysdeps/aarch64/fpu/s_fmax.c
> +++ b/sysdeps/aarch64/fpu/s_fmax.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC fmax
> -#define INSN "fmaxnm"
> -#include <fpu/s_fmin.c>
> +#include <math.h>
> +
> +double
> +__fmax (double x, double y)
> +{
> +  return __builtin_fmax (x, y);
> +}
> +
> +weak_alias (__fmax, fmax)
> diff --git a/sysdeps/aarch64/fpu/s_fmaxf.c b/sysdeps/aarch64/fpu/s_fmaxf.c
> index f450d9f..ec4dcdd 100644
> --- a/sysdeps/aarch64/fpu/s_fmaxf.c
> +++ b/sysdeps/aarch64/fpu/s_fmaxf.c
> @@ -16,8 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC fmaxf
> -#define INSN "fmaxnm"
> -#define TYPE float
> -#define REGS "s"
> -#include <fpu/s_fmin.c>
> +#include <math.h>
> +
> +float
> +__fmaxf (float x, float y)
> +{
> +  return __builtin_fmaxf (x, y);
> +}
> +
> +weak_alias (__fmaxf, fmaxf)
> diff --git a/sysdeps/aarch64/fpu/s_fmin.c b/sysdeps/aarch64/fpu/s_fmin.c
> index b6d32d5..bba894e 100644
> --- a/sysdeps/aarch64/fpu/s_fmin.c
> +++ b/sysdeps/aarch64/fpu/s_fmin.c
> @@ -18,32 +18,10 @@
>  
>  #include <math.h>
>  
> -#ifndef FUNC
> -# define FUNC fmin
> -#endif
> -
> -#ifndef INSN
> -# define INSN "fminnm"
> -#endif
> -
> -#ifndef TYPE
> -# define TYPE double
> -# define REGS "d"
> -#else
> -# ifndef REGS
> -#  error REGS not defined
> -# endif
> -#endif
> -
> -#define __CONCATX(a,b) __CONCAT(a,b)
> -
> -TYPE
> -__CONCATX(__,FUNC) (TYPE x, TYPE y)
> +double
> +__fmin (double x, double y)
>  {
> -  TYPE result;
> -  asm ( INSN "\t%" REGS "0, %" REGS "1, %" REGS "2"
> -        : "=w" (result) : "w" (x), "w" (y) );
> -  return result;
> +  return __builtin_fmin (x, y);
>  }
>  
> -weak_alias (__CONCATX(__,FUNC), FUNC)
> +weak_alias (__fmin, fmin)
> diff --git a/sysdeps/aarch64/fpu/s_fminf.c b/sysdeps/aarch64/fpu/s_fminf.c
> index 032262d..7d3a3a3 100644
> --- a/sysdeps/aarch64/fpu/s_fminf.c
> +++ b/sysdeps/aarch64/fpu/s_fminf.c
> @@ -16,7 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC fminf
> -#define TYPE float
> -#define REGS "s"
> -#include <fpu/s_fmin.c>
> +#include <math.h>
> +
> +float
> +__fminf (float x, float y)
> +{
> +  return __builtin_fminf (x, y);
> +}
> +
> +weak_alias (__fminf, fminf)
> diff --git a/sysdeps/aarch64/fpu/s_frint.c b/sysdeps/aarch64/fpu/s_frint.c
> deleted file mode 100644
> index 48881f5..0000000
> --- a/sysdeps/aarch64/fpu/s_frint.c
> +++ /dev/null
> @@ -1,49 +0,0 @@
> -/* Copyright (C) 1996-2017 Free Software Foundation, Inc.
> -
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <math.h>
> -
> -#ifndef FUNC
> -# error FUNC not defined
> -#endif
> -
> -#ifndef TYPE
> -# define TYPE double
> -# define REGS "d"
> -#else
> -# ifndef REGS
> -#  error REGS not defined
> -# endif
> -#endif
> -
> -#ifndef INSN
> -# error INSN not defined
> -#endif
> -
> -#define __CONCATX(a,b) __CONCAT(a,b)
> -
> -TYPE
> -__CONCATX(__,FUNC) (TYPE x)
> -{
> -  TYPE result;
> -  asm ( INSN "\t%" REGS "0, %" REGS "1" :
> -	"=w" (result) : "w" (x) );
> -  return result;
> -}
> -
> -weak_alias (__CONCATX(__,FUNC), FUNC)
> diff --git a/sysdeps/aarch64/fpu/s_frintf.c b/sysdeps/aarch64/fpu/s_frintf.c
> deleted file mode 100644
> index dae99d7..0000000
> --- a/sysdeps/aarch64/fpu/s_frintf.c
> +++ /dev/null
> @@ -1,24 +0,0 @@
> -/* Copyright (C) 2011-2017 Free Software Foundation, Inc.
> -
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public License as
> -   published by the Free Software Foundation; either version 2.1 of the
> -   License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#ifndef FUNC
> -# error FUNC not defined
> -#endif
> -#define TYPE float
> -#define REGS "s"
> -#include <s_frint.c>
> diff --git a/sysdeps/aarch64/fpu/s_llrint.c b/sysdeps/aarch64/fpu/s_llrint.c
> index 57821c0..3e41546 100644
> --- a/sysdeps/aarch64/fpu/s_llrint.c
> +++ b/sysdeps/aarch64/fpu/s_llrint.c
> @@ -16,7 +16,19 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC llrint
> -#define OTYPE long long int
> -#define OREG_SIZE 64
> -#include <s_lrint.c>
> +#include <math.h>
> +#include <math_private.h>
> +
> +long long int
> +__llrint (double x)
> +{
> +  double r = __builtin_rint (x);
> +
> +  // Prevent gcc from calling lrint directly when compiled with -fno-math-errno
> +  // by inserting a barrier
> +
> +  math_opt_barrier (r);

I am not seeing any difference on code generation with or without the
math barrier when using GCC 4.9, GCC7 or GCC mainline with or without
-fno-math-errno (I am seeing just a combination of frintx plus fcvtzs).
Do we really need this barrier here?

Also I think the comment line is too long and I am not sure if it is
usual to use '//' comments (usually we use default '/*' '*/' pairs).

> +  return r;
> +}
> +
> +weak_alias (__llrint, llrint)
> diff --git a/sysdeps/aarch64/fpu/s_llrintf.c b/sysdeps/aarch64/fpu/s_llrintf.c
> index 98ed4f8..baf9027 100644
> --- a/sysdeps/aarch64/fpu/s_llrintf.c
> +++ b/sysdeps/aarch64/fpu/s_llrintf.c
> @@ -16,9 +16,20 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC llrintf
> -#define ITYPE float
> -#define IREG_SIZE 32
> -#define OTYPE long long int
> -#define OREG_SIZE 64
> -#include <s_lrint.c>
> +#include <math.h>
> +#include <math_private.h>
> +
> +long long int
> +__llrintf (float x)
> +{
> +  float r = __builtin_rintf (x);
> +
> +  // Prevent gcc from calling lrintf directly when compiled with -fno-math-errno
> +  // by inserting a barrier
> +
> +
> +  math_opt_barrier (r);
> +  return r;
> +}
> +
> +weak_alias (__llrintf, llrintf)
> diff --git a/sysdeps/aarch64/fpu/s_llround.c b/sysdeps/aarch64/fpu/s_llround.c
> index ef7aedf..2902946 100644
> --- a/sysdeps/aarch64/fpu/s_llround.c
> +++ b/sysdeps/aarch64/fpu/s_llround.c
> @@ -16,7 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC llround
> -#define OTYPE long long int
> -#define OREG_SIZE 64
> -#include <s_lround.c>
> +#include <math.h>
> +
> +long long int
> +__llround (double x)
> +{
> +  return __builtin_llround (x);
> +}
> +
> +weak_alias (__llround, llround)
> diff --git a/sysdeps/aarch64/fpu/s_llroundf.c b/sysdeps/aarch64/fpu/s_llroundf.c
> index 294f0f4..0ca390b 100644
> --- a/sysdeps/aarch64/fpu/s_llroundf.c
> +++ b/sysdeps/aarch64/fpu/s_llroundf.c
> @@ -16,9 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC llroundf
> -#define ITYPE float
> -#define IREG_SIZE 32
> -#define OTYPE long long int
> -#define OREG_SIZE 64
> -#include <s_lround.c>
> +#include <math.h>
> +
> +long long int
> +__llroundf (float x)
> +{
> +  return __builtin_llroundf (x);
> +}
> +
> +weak_alias (__llroundf, llroundf)
> diff --git a/sysdeps/aarch64/fpu/s_lrint.c b/sysdeps/aarch64/fpu/s_lrint.c
> index 6ef64e2..ba4d7cc 100644
> --- a/sysdeps/aarch64/fpu/s_lrint.c
> +++ b/sysdeps/aarch64/fpu/s_lrint.c
> @@ -19,38 +19,17 @@
>  #include <math.h>
>  #include <get-rounding-mode.h>
>  #include <stdint.h>
> +#include <math_private.h>
>  
> -#ifndef FUNC
> -# define FUNC lrint
> -#endif
> -
> -#ifndef ITYPE
> -# define ITYPE double
>  # define IREG_SIZE 64
> -#else
> -# ifndef IREG_SIZE
> -#  error IREG_SIZE not defined
> -# endif
> -#endif
>  
> -#ifndef OTYPE
> -# define OTYPE long int
>  # ifdef __ILP32__
>  #  define OREG_SIZE 32
>  # else
>  #  define OREG_SIZE 64
>  # endif
> -#else
> -# ifndef OREG_SIZE
> -#  error OREG_SIZE not defined
> -# endif
> -#endif
>  
> -#if IREG_SIZE == 32
> -# define IREGS "s"
> -#else
>  # define IREGS "d"
> -#endif
>  
>  #if OREG_SIZE == 32
>  # define OREGS "w"
> @@ -58,15 +37,14 @@
>  # define OREGS "x"
>  #endif
>  
> -#define __CONCATX(a,b) __CONCAT(a,b)
>  
> -OTYPE
> -__CONCATX(__,FUNC) (ITYPE x)
> +long int
> +__lrint (double x)
>  {
> -  OTYPE result;
> -  ITYPE temp;
>  
>  #if IREG_SIZE == 64 && OREG_SIZE == 32
> +  long int result;
> +
>    if (__builtin_fabs (x) > INT32_MAX)
>      {
>        /* Converting large values to a 32 bit int may cause the frintx/fcvtza
> @@ -96,10 +74,14 @@ __CONCATX(__,FUNC) (ITYPE x)
>        return result;
>      }
>  #endif
> -  asm ( "frintx" "\t%" IREGS "1, %" IREGS "2\n\t"
> -        "fcvtzs" "\t%" OREGS "0, %" IREGS "1"
> -        : "=r" (result), "=w" (temp) : "w" (x) );
> -  return result;
> +
> +  double r =  __builtin_rint (x);
> +
> +  // Prevent gcc from calling lrintf directly when compiled with -fno-math-errno
> +  // by inserting a barrier
> +
> +  math_opt_barrier (r);
> +  return r;
>  }

Same comment as before about math_opt_barrier and the comments seems off (it is
referring to lrintf instead of lrint).

>  
> -weak_alias (__CONCATX(__,FUNC), FUNC)
> +weak_alias (__lrint, lrint)
> diff --git a/sysdeps/aarch64/fpu/s_lrintf.c b/sysdeps/aarch64/fpu/s_lrintf.c
> index 2e73271..8d4d4d0 100644
> --- a/sysdeps/aarch64/fpu/s_lrintf.c
> +++ b/sysdeps/aarch64/fpu/s_lrintf.c
> @@ -16,7 +16,19 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC lrintf
> -#define ITYPE float
> -#define IREG_SIZE 32
> -#include <s_lrint.c>
> +#include <math.h>
> +#include <math_private.h>
> +
> +long int
> +__lrintf (float x)
> +{
> +  float r = __builtin_rintf (x);
> +
> +  // Prevent gcc from calling lrintf directly when compiled with -fno-math-errno
> +  // by inserting a barrier
> +
> +  math_opt_barrier (r);
> +  return r;
> +}
> +
> +weak_alias (__lrintf, lrintf)
> diff --git a/sysdeps/aarch64/fpu/s_lround.c b/sysdeps/aarch64/fpu/s_lround.c
> index 1f77d82..90c3163 100644
> --- a/sysdeps/aarch64/fpu/s_lround.c
> +++ b/sysdeps/aarch64/fpu/s_lround.c
> @@ -18,53 +18,10 @@
>  
>  #include <math.h>
>  
> -#ifndef FUNC
> -# define FUNC lround
> -#endif
> +long int
> +__lround (double x)
> + {
> +  return __builtin_lround (x);
> + }
>  
> -#ifndef ITYPE
> -# define ITYPE double
> -# define IREG_SIZE 64
> -#else
> -# ifndef IREG_SIZE
> -#  error IREG_SIZE not defined
> -# endif
> -#endif
> -
> -#ifndef OTYPE
> -# define OTYPE long int
> -# ifdef __ILP32__
> -#  define OREG_SIZE 32
> -# else
> -#  define OREG_SIZE 64
> -# endif
> -#else
> -# ifndef OREG_SIZE
> -#  error OREG_SIZE not defined
> -# endif
> -#endif
> -
> -#if IREG_SIZE == 32
> -# define IREGS "s"
> -#else
> -# define IREGS "d"
> -#endif
> -
> -#if OREG_SIZE == 32
> -# define OREGS "w"
> -#else
> -# define OREGS "x"
> -#endif
> -
> -#define __CONCATX(a,b) __CONCAT(a,b)
> -
> -OTYPE
> -__CONCATX(__,FUNC) (ITYPE x)
> -{
> -  OTYPE result;
> -  asm ( "fcvtas" "\t%" OREGS "0, %" IREGS "1"
> -        : "=r" (result) : "w" (x) );
> -  return result;
> -}
> -
> -weak_alias (__CONCATX(__,FUNC), FUNC)
> +weak_alias (__lround, lround)
> diff --git a/sysdeps/aarch64/fpu/s_lroundf.c b/sysdeps/aarch64/fpu/s_lroundf.c
> index b30ddb6..baf0693 100644
> --- a/sysdeps/aarch64/fpu/s_lroundf.c
> +++ b/sysdeps/aarch64/fpu/s_lroundf.c
> @@ -16,7 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC lroundf
> -#define ITYPE float
> -#define IREG_SIZE 32
> -#include <s_lround.c>
> +#include <math.h>
> +
> +long int
> +__lroundf (float x)
> +{
> +  return __builtin_lroundf (x);
> +}
> +
> +weak_alias (__lroundf, lroundf)
> diff --git a/sysdeps/aarch64/fpu/s_nearbyint.c b/sysdeps/aarch64/fpu/s_nearbyint.c
> index 51067f2..6ba5de1 100644
> --- a/sysdeps/aarch64/fpu/s_nearbyint.c
> +++ b/sysdeps/aarch64/fpu/s_nearbyint.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define	FUNC nearbyint
> -#define INSN "frinti"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__nearbyint (double x)
> +{
> +  return __builtin_nearbyint (x);
> +}
> +
> +weak_alias (__nearbyint, nearbyint)
> diff --git a/sysdeps/aarch64/fpu/s_nearbyintf.c b/sysdeps/aarch64/fpu/s_nearbyintf.c
> index 8125646..de69fd9 100644
> --- a/sysdeps/aarch64/fpu/s_nearbyintf.c
> +++ b/sysdeps/aarch64/fpu/s_nearbyintf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC nearbyintf
> -#define INSN "frinti"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__nearbyintf (float x)
> +{
> +  return __builtin_nearbyintf (x);
> +}
> +
> +weak_alias (__nearbyintf, nearbyintf)
> diff --git a/sysdeps/aarch64/fpu/s_rint.c b/sysdeps/aarch64/fpu/s_rint.c
> index 73b4e26..b4ac349 100644
> --- a/sysdeps/aarch64/fpu/s_rint.c
> +++ b/sysdeps/aarch64/fpu/s_rint.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC rint
> -#define INSN "frintx"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__rint (double x)
> +{
> +  return __builtin_rint (x);
> +}
> +
> +weak_alias (__rint, rint)
> diff --git a/sysdeps/aarch64/fpu/s_rintf.c b/sysdeps/aarch64/fpu/s_rintf.c
> index 3560dc2..d0f70ce 100644
> --- a/sysdeps/aarch64/fpu/s_rintf.c
> +++ b/sysdeps/aarch64/fpu/s_rintf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC rintf
> -#define INSN "frintx"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__rintf (float x)
> +{
> +  return __builtin_rintf (x);
> +}
> +
> +weak_alias (__rintf, rintf)
> diff --git a/sysdeps/aarch64/fpu/s_round.c b/sysdeps/aarch64/fpu/s_round.c
> index 6781748..a34fca1 100644
> --- a/sysdeps/aarch64/fpu/s_round.c
> +++ b/sysdeps/aarch64/fpu/s_round.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC round
> -#define INSN "frinta"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__round (double x)
> +{
> +  return __builtin_round (x);
> +}
> +
> +weak_alias (__round, round)
> diff --git a/sysdeps/aarch64/fpu/s_roundf.c b/sysdeps/aarch64/fpu/s_roundf.c
> index ef6f672..66c8ee6 100644
> --- a/sysdeps/aarch64/fpu/s_roundf.c
> +++ b/sysdeps/aarch64/fpu/s_roundf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC roundf
> -#define INSN "frinta"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__roundf (float x)
> +{
> +  return __builtin_roundf (x);
> +}
> +
> +weak_alias (__roundf, roundf)
> diff --git a/sysdeps/aarch64/fpu/s_trunc.c b/sysdeps/aarch64/fpu/s_trunc.c
> index 2bf5474..6550dfc 100644
> --- a/sysdeps/aarch64/fpu/s_trunc.c
> +++ b/sysdeps/aarch64/fpu/s_trunc.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC trunc
> -#define INSN "frintz"
> -#include <s_frint.c>
> +#include <math.h>
> +
> +double
> +__trunc (double x)
> +{
> +  return __builtin_trunc (x);
> +}
> +
> +weak_alias (__trunc, trunc)
> diff --git a/sysdeps/aarch64/fpu/s_truncf.c b/sysdeps/aarch64/fpu/s_truncf.c
> index 94865a4..b7890a2 100644
> --- a/sysdeps/aarch64/fpu/s_truncf.c
> +++ b/sysdeps/aarch64/fpu/s_truncf.c
> @@ -16,6 +16,12 @@
>     License along with the GNU C Library; if not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -#define FUNC truncf
> -#define INSN "frintz"
> -#include <s_frintf.c>
> +#include <math.h>
> +
> +float
> +__truncf (float x)
> +{
> +  return __builtin_truncf (x);
> +}
> +
> +weak_alias (__truncf, truncf)
> -- 1.9.1
>
Follow-Ups:
- Re: [PATCH] aarch64: Implement math acceleration via builtins
  - From: Szabolcs Nagy
- Re: [PATCH] aarch64: Implement math acceleration via builtins
  - From: Szabolcs Nagy
References:
- [PATCH] aarch64: Implement math acceleration via builtins
  - From: Michael Collison
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]