This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] aarch64: Implement math acceleration via builtins


On 13/10/17 23:36, Adhemerval Zanella wrote:
> On 13/10/2017 18:26, Michael Collison wrote:
>> This patch converts asm statements into builtin and neon intrinsics for AArch64.
>> As an example for the file sysdeps/aarch64/fpu/_s_ceil.c, we convert the function from
>>
>> double
>> __ceil (double x)
>> {
>>   double result;
>>   asm ("frintp\t%d0, %d1" :
>>        "=w" (result) : "w" (x) );
>>   return result;
>> }
>>
>> into
>>
>> double
>> __ceil (double x)
>> {
>>   return __builtin_ceil (x);
>> }
>>
...
> Looks good in general with some comments below.  Is there any advantage
> of the refactor besides avoid inline assembly?

i think this is a code clean up patch (the generated code
should not change other than temp register numbers), it
avoids the current preprocessor hackery and inline asm.

> Since you refactoring it, I would also suggest to use builtin for the 
> aarch64 match_private.h sqrt and get/set fpcr at fpu_control.h.
> 

note that's no longer a code clean up patch:

e.g. gcc does not know the cost of inline asm so it cannot
schedule it optimally, but it has detailed model of the
builtins so using them in headers should be an improvement.

Wilco has some sqrt patches, so sqrt can be left alone
(if those patches don't progress then it can be fixed later)
https://sourceware.org/ml/libc-alpha/2017-09/msg00958.html
https://sourceware.org/ml/libc-alpha/2017-09/msg00959.html
https://sourceware.org/ml/libc-alpha/2017-09/msg00960.html

i think using builtins for fpcr/fpsr is a good idea.

__builtin_aarch64_get_fpcr
__builtin_aarch64_set_fpcr
__builtin_aarch64_get_fpsr
__builtin_aarch64_set_fpsr

please check if earliest supported gcc (4.9) have them,
i think fixing this can be a separate patch.

>> +long long int
>> +__llrint (double x)
>> +{
>> +  double r = __builtin_rint (x);
>> +
>> +  // Prevent gcc from calling lrint directly when compiled with -fno-math-errno
>> +  // by inserting a barrier
>> +
>> +  math_opt_barrier (r);
...
> 
> Also I think the comment line is too long and I am not sure if it is
> usual to use '//' comments (usually we use default '/*' '*/' pairs).
> 

please fix the comment style.

>> +long int
>> +__lrint (double x)
>>  {
>> -  OTYPE result;
>> -  ITYPE temp;
>>  
>>  #if IREG_SIZE == 64 && OREG_SIZE == 32
>> +  long int result;
>> +
>>    if (__builtin_fabs (x) > INT32_MAX)
>>      {
>>        /* Converting large values to a 32 bit int may cause the frintx/fcvtza
>> @@ -96,10 +74,14 @@ __CONCATX(__,FUNC) (ITYPE x)
>>        return result;
>>      }
>>  #endif
>> -  asm ( "frintx" "\t%" IREGS "1, %" IREGS "2\n\t"
>> -        "fcvtzs" "\t%" OREGS "0, %" IREGS "1"
>> -        : "=r" (result), "=w" (temp) : "w" (x) );
>> -  return result;
>> +
>> +  double r =  __builtin_rint (x);
>> +
>> +  // Prevent gcc from calling lrintf directly when compiled with -fno-math-errno
>> +  // by inserting a barrier
>> +
>> +  math_opt_barrier (r);
>> +  return r;
>>  }
> 
> Same comment as before about math_opt_barrier and the comments seems off (it is
> referring to lrintf instead of lrint).
> 

please fix the comment.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]