This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Add a new macro to mask a float


Joseph Myers <joseph@codesourcery.com> writes:

> On Tue, 28 Jun 2016, Tulio Magno Quites Machado Filho wrote:
>
>> +/* Faster to do an in-place masking of the float number in the VSR
>> +   than move to GPR for the masking and back.  maskl, maskr, and maski
>> +   are used to convert the 32-bit "mask" parameter to a 64-bit mask
>> +   suitable for the internal representation of a scalar
>> +   single-precision floating point number in the Power8 processor.
>> +   Note: before applying the mask, xvmovdp is used to ensure f is
>> +   normalized.  */
>
> Actually, could you clarify what that internal representation is, and what 
> "to ensure f is normalized" is about?  Is this macro definition exactly 
> equivalent to the integer masking, including for subnormal arguments and 
> NaNs?

That's just an optimization.  A SP denormal here could cause the CPU to waste
some cycles.
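
To illustrate why a single-precision denormal is a special case here, a
minimal portable C sketch follows.  It assumes, as noted later in this
thread, that the scalar float is held in the register in double format.
The helper name double_high_word is hypothetical and is not part of the
patch; it only widens a float to double and returns the upper 32 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): widen a float to double and
   return the high 32 bits of the double's bit pattern.  The widening is
   exact, so a float subnormal becomes a normal double.  */
static uint32_t
double_high_word (float f)
{
  double d = (double) f;
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);   /* Well-defined type punning.  */
  return (uint32_t) (bits >> 32);
}
```

For the least subnormal float, double_high_word (0x1p-149f) gives
0x36a00000, since 2^-149 has the biased double exponent 874 (0x36a) and
a zero fraction.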

Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:

> On 29/06/2016 14:34, Joseph Myers wrote:
>> On Wed, 29 Jun 2016, Adhemerval Zanella wrote:
>> 
>>> My understanding is that this optimization just replaces the FP to GPR move,
>>> bitwise operation, and GPR to FP move with a simpler bitwise operation
>>> on the FP register itself.  It is indeed equivalent to integer masking and I 
>>> believe 'normalized' here means making the float mask be represented
>>> as the internal double required by VSX operations.
>> 
>> What do you mean by "internal double"?  Is this purely some fixed 
>> rearrangement of bits, so that e.g. subnormal float values still get 
>> represented as subnormals rather than like normal doubles?
>
> In fact the float numbers are converted to double values, so 0x1p-149f would
> be represented internally in the VSX register as
> v4_int32 = {0x0, 0x0, 0x0, 0x36a00000}.  And in fact this is an issue
> (see below).
>
>> 
>> Say the number is the least subnormal float - 0x1p-149f, integer 
>> representation 1 - and that it's masked with 0xfffff000, as in the various 
>> MASK_FLOAT calls.  Can you confirm that the instruction sequence in the 
>> patch produces 0.0f, as the integer masking does, when executed on a 
>> POWER8 processor?  And that if instead the value is 0x1p-137f, it's 
>> returned unchanged?
>> 
>> If equivalent to integer masking for all inputs including subnormals and 
>> infinities and NaNs, then my previous point applies that this should be a 
>> compiler optimization instead of a glibc patch.
>> 
>
> Now that you have raised these questions, I do not think this change is safe
> for float values on POWER.  The current patch does:
>
>      __asm__ ("xvmovdp %x2, %x2\n\t"				\
> 	     "xxland %x0, %x2, %1\n\t"				\
>
> And I think 'xvmovdp' here is not what was really meant (it is 
> Copy Sign Double-Precision).  I think what the algorithm meant was in fact:
>
>     __asm__ ("xvcvdpsp %x2, %x2\n\t"                            \
>              "xxland %x0, %x2, %1\n\t"                          \
>              "xvcvspdp %x0, %x0" \

Exactly.  After making that change, you can also simplify the mask treatment,
making it trivial for the compiler to do this optimization.
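
For reference, the integer masking that the VSX sequence has to match can
be sketched in portable C as below.  This is only an illustration of the
expected semantics, not the glibc MASK_FLOAT definition; the function
name mask_float_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference (not the glibc macro): apply a 32-bit mask to
   the IEEE single-precision bit pattern of F and return the result.  */
static float
mask_float_bits (float f, uint32_t mask)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);   /* Read the float's bit pattern.  */
  bits &= mask;                      /* Mask in the integer domain.  */
  memcpy (&f, &bits, sizeof f);      /* Write the masked pattern back.  */
  return f;
}
```

With the mask 0xfffff000 used in the MASK_FLOAT calls, 0x1p-149f (bit
pattern 0x00000001) becomes 0.0f, while 0x1p-137f (bit pattern
0x00001000) is returned unchanged.  Any in-register replacement needs to
give the same results for these inputs.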

I'll forward this to GCC.

Thank you!

-- 
Tulio Magno

