This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: Fix build of wcscpy with --disable-multi-arch
On 05/03/2019 11:48, Wilco Dijkstra wrote:
> Hi,
>
>> sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c
>
>> But I am not sure it really pays off in terms of simplifying the glibc code
>> and build. Maybe an option is just to use __builtin_* when we know the
>> compiler generates an efficient instruction and just use the generic
>> implementation instead.
>
> Indeed. Simple builtin functions like these are inlined by compilers,
> and that's always faster than calling a GLIBC version which might
> use an ifunc.
>
> That means you will practically never see calls to copysign, so it
> does not make sense to optimize it using ifuncs. A large proportion of
> ifuncs do nothing for performance, they only increase complexity and
> make improving generic code more difficult and error prone.
>
> If we focus more effort on optimizing generic code, we actually need
> even fewer target specific optimizations (as the generic string and math
> optimizations have proven).
These kinds of optimizations might have made sense when the default deployment
was for chips which do not support fast instructions or when the compiler did
not have a better alternative implemented. If I recall correctly, I
implemented these when the most used distro for powerpc64 was RHEL6 and its
compiler (gcc 4.4 if I am not mistaken) was targeting power4 and did not
have such a builtin optimization.
Now GCC 8.3 generates the following code sequence for -mcpu=power4:
.L.__copysign1:
.LFB0:
        .cfi_startproc
        stfd 2,-16(1)
        fabs 1,1
        nop
        nop
        ld 9,-16(1)
        cmpdi 7,9,0
        bgelr 7
        fneg 1,1
        blr
Taking the PLT overhead into consideration, it might still be faster than calling
the libm implementation, which will select fcpsgn.
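For reference, here is a rough C rendering of what that power4 sequence does. It is only a sketch: the function name copysign_power4_like is made up for illustration and this is not glibc or GCC code. It clears the sign of x, reads the bits of y through memory as a signed 64-bit integer, and negates the result only when those bits are negative (that is, when the sign bit of y is set).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The integer view of y below assumes a 64-bit double.  */
static_assert (sizeof (double) == sizeof (int64_t),
               "double is expected to be 64 bits wide");

/* Hypothetical helper mirroring the inline sequence, for illustration.  */
double
copysign_power4_like (double x, double y)
{
  int64_t ybits;

  /* stfd 2,-16(1) ... ld 9,-16(1): move y to a GPR through the stack.  */
  memcpy (&ybits, &y, sizeof ybits);

  /* fabs 1,1: drop the sign of x.  */
  x = __builtin_fabs (x);

  /* cmpdi 7,9,0 ; bgelr 7: sign bit of y clear, return |x|.  */
  if (ybits >= 0)
    return x;

  /* fneg 1,1 ; blr: sign bit of y set, return -|x|.  */
  return -x;
}
```

The round trip through memory is there because power4 has no direct move between floating-point and general-purpose registers, which is presumably also why the compiler emits the two nops between the store and the load.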
And I agree with you that nowadays most powerpc ifunc implementations are
not adding anything in terms of performance. Maybe a future step would indeed be
to clean up useless ifunc variants and just use the compiler builtins.
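To make the idea concrete, a generic implementation along those lines could be as small as the sketch below. The name generic_copysign is hypothetical and this is not the actual glibc source; it only assumes the compiler provides __builtin_copysign, which GCC does.

```c
/* Hypothetical generic implementation, for illustration only.
   The compiler picks the best sequence for the selected -mcpu,
   so no ifunc variant is needed.  */
double
generic_copysign (double x, double y)
{
  return __builtin_copysign (x, y);
}
```

With something like this, the out-of-line symbol only matters for the rare callers that do not get the builtin inlined.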
>
> Cheers,
> Wilco
>