[PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
Uros Bizjak
ubizjak@gmail.com
Fri Apr 14 18:01:00 GMT 2017
On Fri, Apr 14, 2017 at 3:55 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 04/14/2017 05:30 AM, Uros Bizjak wrote:
>>
>> Add earlyclobber to sqrtt/sqrtf insns.
>>
>> When using software completions, we have to prevent assembler to match
>> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
>> output operand to avoid unwanted operand matching.
>>
>> 2017-04-14 Uros Bizjak <ubizjak@gmail.com>
>>
>> * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Add
>> earlyclobber to output operand of sqrt insn.
>> (__ieee754_sqrtf): Ditto.
>>
>> diff --git a/sysdeps/alpha/fpu/math_private.h
>> b/sysdeps/alpha/fpu/math_private.h
>> index 9e06e25..1e97c86 100644
>> --- a/sysdeps/alpha/fpu/math_private.h
>> +++ b/sysdeps/alpha/fpu/math_private.h
>> @@ -27,9 +27,9 @@ __ieee754_sqrt (double d)
>> {
>> double ret;
>> # ifdef _IEEE_FP_INEXACT
>> - asm ("sqrtt/suid %1,%0" : "=f"(ret) : "f"(d));
>> + asm ("sqrtt/suid %1,%0" : "=&f"(ret) : "f"(d));
>> # else
>> - asm ("sqrtt/sud %1,%0" : "=f"(ret) : "f"(d));
>> + asm ("sqrtt/sud %1,%0" : "=&f"(ret) : "f"(d));
>> # endif
>
>
> Hmm. This is surprising because any host that has sqrtt also has exact
> traps, and so trap shadows and recovery of the input shouldn't be an issue.
I tried sqrtt/sud $f0,$f0 with the value in $f0 set to 0x0....10,
and I can confirm that the returned result was 0.
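As a minimal sketch of what the constraint change does: the wrapper below uses the same "=&f" output constraint as the patch. The names sketch_sqrt and fallback_sqrt are made up for illustration, and the non-Alpha branch is only an assumed stand-in so that the snippet builds on other hosts; it is not the glibc code.

```c
/* Hypothetical stand-in used only when not building for Alpha:
   Newton's iteration, adequate for moderate positive inputs.  */
static double
fallback_sqrt (double d)
{
  double x = d > 1.0 ? d : 1.0;
  for (int i = 0; i < 64; i++)
    x = 0.5 * (x + d / x);
  return x;
}

/* Illustrative wrapper; the name is made up and this is not the
   glibc source.  */
static double
sketch_sqrt (double d)
{
#ifdef __alpha__
  double ret;
  /* "=&f" marks ret as earlyclobber, so the register allocator must
     not give ret the same FP register as the input d.  */
  __asm__ ("sqrtt/sud %1,%0" : "=&f" (ret) : "f" (d));
  return ret;
#else
  return fallback_sqrt (d);
#endif
}
```

With plain "=f" the compiler is free to emit the sqrtt/sud $f0,$f0 form tried above; the "&" modifier rules that allocation out.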
> That said, do we actually still support a gcc that doesn't have these as
> builtins? That would be more ideal than inline asm.
Compiling:
double test_sqrt (double d)
{
  return __builtin_sqrt (d);
}
with -O2 -mcpu=ev6 resulted in:
test_sqrt:
	.frame $30,16,$26,0
	.mask 0x4000000,-16
$LFB0:
	.cfi_startproc
	ldah $29,0($27) !gpdisp!1
	lda $29,0($29) !gpdisp!1
$test_sqrt..ng:
	sqrtt $f16,$f0
	lda $30,-16($30)
	.cfi_def_cfa_offset 16
	stq $26,0($30)
	.cfi_offset 26, -16
	.prologue 1
	cmpteq $f0,$f0,$f10
	fbeq $f10,$L4
$L2:
	ldq $26,0($30)
	lda $30,16($30)
	.cfi_remember_state
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
$L4:
	.cfi_restore_state
	ldq $27,sqrt($29) !literal!2
	jsr $26,($27),sqrt !lituse_jsr!2
	ldah $29,0($26) !gpdisp!3
	lda $29,0($29) !gpdisp!3
	br $31,$L2
__builtin_sqrt expands to an inline sqrtt plus a fallback call to
sqrt() (taken at $L4 when the result does not compare equal to
itself). This could create a recursive call loop if the builtin is
used in libc to implement sqrt.
Uros.