This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Add a new macro to mask a float



On 28/06/2016 18:00, Joseph Myers wrote:
> On Tue, 28 Jun 2016, Tulio Magno Quites Machado Filho wrote:
> 
>> +/* Faster to do an in-place masking of the float number in the VSR
>> +   than move to GPR for the masking and back.  maskl, maskr, and maski
>> +   are used to convert the 32-bit "mask" parameter to a 64-bit mask
>> +   suitable for the internal representation of a scalar
>> +   single-precision floating point number in the Power8 processor.
>> +   Note: before applying the mask, xvmovdp is used to ensure f is
>> +   normalized.  */
> 
> Actually, could you clarify what that internal representation is, and what 
> "to ensure f is normalized" is about?  Is this macro definition exactly 
> equivalent to the integer masking, including for subnormal arguments and 
> NaNs?
> 
> If it's exactly equivalent in all cases, including subnormals and NaNs, 
> then my previous comment applies - it would be better as a compiler 
> optimization.  If it's only equivalent for normal values but the code in 
> question can't get subnormal arguments / NaNs / whatever values it's not 
> equivalent for, then doing this in glibc is more plausible, though there 
> are coding style issues, the macro comments would need to explain the 
> limitation, and it would be necessary to be sure in each case that problem 
> arguments can't get there.
> 

My understanding of this optimization is that it replaces the FP to GPR move,
the bitwise operation, and the GPR to FP move back with a single, simpler
bitwise operation on the FP register itself.  It is indeed equivalent to
integer masking, and I believe 'normalized' here means that the float mask is
represented as the internal double required by VSX operations.

So the code:

float foo (float x)
{
  MASK_FLOAT(x, 0xfffff000);
  return x;
}

is currently optimized by GCC 4.8 as:

foo:
        xscvdpspn 12,1
        mfvsrd 9,12
        srdi 9,9,32
        rlwinm 9,9,0,0,19
        sldi 10,9,32
        mtvsrd 1,10
        xscvspdpn 1,1
        blr

And with this patch as:

foo:
0:      addis 2,12,.TOC.-0b@ha
        addi 2,2,.TOC.-0b@l
        .localentry     foo,.-foo
        addis 9,2,.LC1@toc@ha
        lfs 0,.LC1@toc@l(9)
        xvmovdp 1, 1
        xxland 1, 1, 0
        blr
.LC1:
        .4byte  4294963200

Taking into consideration that the constant will be in the function's current
TOC, it will require just one float load (lfs) to get the mask.
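
For reference, the integer masking that the macro is compared against can be
sketched in portable C roughly as below.  This is only an illustration of the
semantics, not code from the patch: mask_float_int is a hypothetical name, and
the memcpy-based bit copy is an assumption about how one would write it outside
of glibc's own float-word helpers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): apply a 32-bit mask to
   the IEEE 754 single-precision bit pattern of X through an integer,
   i.e. the FP -> integer -> FP round trip that the GCC 4.8 sequence
   above performs with mfvsrd/rlwinm/mtvsrd.  */
static float
mask_float_int (float x, uint32_t mask)
{
  uint32_t w;

  memcpy (&w, &x, sizeof w);   /* Reinterpret the float bits as an integer.  */
  w &= mask;                   /* The actual masking.  */
  memcpy (&x, &w, sizeof x);   /* Reinterpret the result as a float.  */
  return x;
}
```

With the mask 0xfffff000 used in foo, this clears the low 12 bits of the
mantissa and leaves the sign and exponent untouched, so any in-register variant
would have to produce the same bit pattern for every input it can receive.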

