This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 01/28] powerpc: Use generic fabs{f} implementations
- From: "Gabriel F. T. Gomes" <gabriel at inconstante dot eti dot br>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: Joseph Myers <joseph at codesourcery dot com>, <libc-alpha at sourceware dot org>
- Date: Mon, 15 Apr 2019 17:23:56 -0300
- Subject: Re: [PATCH 01/28] powerpc: Use generic fabs{f} implementations
- References: <20190329133529.22523-1-adhemerval.zanella@linaro.org> <20190329133529.22523-2-adhemerval.zanella@linaro.org> <alpine.DEB.2.21.1904012003110.27216@digraph.polyomino.org.uk> <b2c8dfa0-2a1c-1f6c-ba29-4cbe7d639faf@linaro.org>
On Wed, Apr 03 2019, Adhemerval Zanella wrote:
>
>
> On 02/04/2019 03:04, Joseph Myers wrote:
> > On Fri, 29 Mar 2019, Adhemerval Zanella wrote:
> >
> >> Since be2e25bbd78f9fdf the generic ieee754 implementation uses
> >> compiler builtin which generates fabs{f} for all supported targets.
> >
> > One reason for the existing version might be a microoptimization for code
> > size to make the float and double versions aliases, as permitted by the
> > ABIs and instruction set in this case. (This is not an objection to this
> > patch.)
> >
>
> Indeed powerpc does such microoptimizations for some implementations, but
The code generated by these functions is as follows (on powerpc64le):
Before the patch:
000000000004dac0 <fabs>:
4dac0: 0e 00 4c 3c addis r2,r12,14
4dac4: 40 9b 42 38 addi r2,r2,-25792
4dac8: 10 0a 20 fc fabs f1,f1
4dacc: 20 00 80 4e blr
After the patch:
000000000004dac0 <fabs>:
4dac0: 10 0a 20 fc fabs f1,f1
4dac4: 20 00 80 4e blr
(this is true when --enable-stack-protector is not in use, or when it is
set to strong, as is the case for the distros I checked (Debian, Fedora
and OpenSuse); when set to all, there would be a lot more differences
before and after the patch, as the assembly implementation wouldn't get
stack protection code, but that's not a problem, imo).
(also, the absence of the TOC-setup code is fine, as the function does
not need anything from the TOC).
> if the idea is indeed to push such optimizations couldn't add on generic
> implementation though a flag defined in arch-specific headers?
Given the size of these functions, I believe that this is not worth the
effort, and I believe this patch is OK as is.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>