This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 00/10] Optimized math routines



On 09/07/2018 12:41, Szabolcs Nagy wrote:
> On 09/07/18 15:26, Adhemerval Zanella wrote:
>> On 09/07/2018 10:34, Szabolcs Nagy wrote:
>>> On 09/07/18 14:09, Adhemerval Zanella wrote:
>>>> On 09/07/2018 09:15, Szabolcs Nagy wrote:
>>>>> built and tested on a power8 machine now, glibc math
>>>>> tests pass (except for an unrelated fmal failure),
>>>>> benchmark improvements are consistent with aarch64/x86_64,
>>>>> but it was a shared access machine so i won't post exact
>>>>> numbers, sincosf improved a bit too, sinf/cosf didn't
>>>>> (apparently powerpc has its own implementation).
>>>>
>>>> PowerPC sinf/cosf uses the same algorithm as x86, so I presume
>>>> the new code would be a gain for the generic implementation as well.
>>>>
>>>
>>> you mean the new implementation would be better or the
>>> target specific one?
>>>
>>> new implementation has better latency on this particular
>>> powerpc machine than the target specific code, but
>>> throughput is worse sometimes (using the default 0
>>> setting for PREFER_FLOAT_COMPARISON).
>>
>> I did not measure, but I would expect so.  PowerPC uses a different
>> implementation for generic code (s_sinf-ppc64.c), so comparing against
>> it may be misleading (since it still uses the old implementation).
>>
> 
> i'm comparing two glibc builds, they both still use the
> same (old) code for sinf/cosf so there is nothing misleading.

I meant comparing the generic s_sinf against the powerpc one (since
the default ifunc selection for powerpc is not the generic
sysdeps/ieee754/flt-32/s_sinf.c). But indeed that does not apply
in this case, sorry for the noise.

> 
> the sincosf code is generic though and the new implementation
> does show some speedup.
> 
>> I am not sure which compiler you used for evaluation, but at least
>> the Ubuntu 16.04 one (gcc 5.4) does not use the POWER8 ISA by default,
>> and even with -mcpu=power8 it generates subpar code.  I will try to
>> check with GCC 7.1 (like you, I am using a shared machine, although
>> I think I might get slightly better results because it uses a
>> micro-partition).
>>
> 
> i built a gcc 7.3.0 toolchain, the host toolchain would
> not be able to build glibc (gcc-4.8), same for the host
> make (3.82). (it's a gcc build farm machine)

Using the glibc benchmarks, I also see better results with the power8
implementation:

s_sinf-power8:

  "sinf": {
   "": {
    "duration": 5.12725e+09,
    "iterations": 7.03494e+08,
    "max": 983.08,
    "min": 6.06,
    "mean": 7.28827
   }
  }


s_sinf-ppc64:

  "sinf": {
   "": {
    "duration": 5.13064e+09,
    "iterations": 1.86048e+08,
    "max": 1032.52,
    "min": 8.035,
    "mean": 27.577
   }
  }

generic s_sinf:

  "sinf": {
   "": {
    "duration": 5.12404e+09,
    "iterations": 6.74424e+08,
    "max": 515.97,
    "min": 6.089,
    "mean": 7.59765
   }
  }

One remark: I think we can get rid of the generic powerpc sinf
(sysdeps/powerpc/fpu/s_sinf.c) and use the generic implementation
instead.

> 
>> For PREFER_FLOAT_COMPARISON, do we use this on glibc? I think
>> it is only enabled on optimized-routines, isn't it?
> 
> it is disabled by default, it is there so targets can enable
> it if float compares are faster than using the representation,
> currently disabled everywhere in glibc.
> (i don't want to change that setting now, that case can
> be tweaked later)

