This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
On Thu, Jan 24, 2019 at 9:23 AM Richard Henderson <rth@twiddle.net> wrote:
>
> On 1/18/19 5:06 AM, Uros Bizjak wrote:
> > Hello!
> >
> > The attached patch fixes sysdeps/alpha/remqu.S clobbering the $f3 register
> > via the $y_is_neg path. The restore of $f3 was missing before the return
> > from the function.
> >
> > The patch also reorders the insns a bit, so that the code is as similar
> > as possible to divqu.S.
> >
> > Without the patch, the math/big testcase from the Go-1.11 testsuite (which
> > includes lots of corner cases that exercise remqu) FAILs; with the patched
> > function, the testcase PASSes without problems.
>
>
> > +++ b/sysdeps/alpha/remqu.S
> > @@ -59,20 +59,19 @@ __remqu:
> > subq Y, 1, AT
> > stt $f0, 0(sp)
> > and Y, AT, AT
> > + excb
> > + beq AT, $powerof2
> >
> > stt $f1, 8(sp)
> > - excb
>
> Why are you moving the excb above the powerof2 branch?
> The path at powerof2 does not touch fpcr or issue fp insns.
This was meant to unify the flow with the __divqu assembly, which does
the above before calling DIVBYZERO. The idea was that __divqu is used
much more than __remqu, so the latter should do the same as the former.
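
For reference, the three integer insns at the top of the first hunk (subq, and, beq) are the usual Y & (Y - 1) test. A minimal C sketch of that test follows; the function name is hypothetical and does not appear in glibc. Note that Y == 0 also satisfies the test, because Y - 1 wraps around and the AND is still zero.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the branch condition in the hunk above.
   Returns nonzero when y & (y - 1) == 0, that is, when y is zero or
   has exactly one bit set.  */
static int takes_powerof2_branch(uint64_t y)
{
    uint64_t at = y - 1;   /* subq Y, 1, AT (wraps to all ones for y == 0) */
    at = y & at;           /* and  Y, AT, AT */
    return at == 0;        /* beq  AT, $powerof2 */
}
```
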
> > @@ -94,12 +93,12 @@ __remqu:
> > mulq AT, Y, AT
> > ldt $f0, 0(sp)
> > ldt $f3, 48(sp)
> > - lda sp, FRAME(sp)
> > cfi_remember_state
> > cfi_restore ($f0)
> > cfi_restore ($f1)
> > cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > + lda sp, FRAME(sp)
>
> This change is actively wrong wrt the unwind info.
Again, this will match the __divqu assembly. It looks like __divqu needs
to be fixed, then.
> > @@ -246,12 +247,16 @@ $y_is_neg:
> > quotient must be either 0 or 1, so the remainder must be X
> > or X-Y, so just compute it directly. */
> > cmpule Y, X, AT
> > + excb
> > + mt_fpcr $f3
> > subq X, Y, RV
> > ldt $f0, 0(sp)
> > + ldt $f3, 48(sp)
> > cmoveq AT, X, RV
> >
> > lda sp, FRAME(sp)
> > cfi_restore ($f0)
> > + cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > ret $31, (RA), 1
>
> This appears to be the only change required to fix the bug.
That is true. This is the problematic part, the one that clobbers $f3.
Should I resend the patch with only this part fixed?
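
For readers following the integer side of the $y_is_neg hunk, here is a minimal C sketch of the three insns cmpule, subq and cmoveq. It assumes the path is reached only when Y has its top bit set (which is what the label name suggests); the function name is hypothetical. With Y >= 2^63 and X < 2^64 the quotient is 0 or 1, so the remainder is X or X - Y, as the comment in the hunk says. The sketch does not model the FP state: the excb, mt_fpcr and ldt $f3 lines added by the patch have no C equivalent here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the integer part of the $y_is_neg path.
   Precondition (an assumption of this sketch): y has its top bit set.  */
static uint64_t remqu_y_is_neg(uint64_t x, uint64_t y)
{
    assert(y >> 63);             /* only valid on the "Y is negative" path */

    uint64_t at = (y <= x);      /* cmpule Y, X, AT */
    uint64_t rv = x - y;         /* subq   X, Y, RV */
    if (at == 0)                 /* cmoveq AT, X, RV */
        rv = x;
    return rv;
}
```

For any y that meets the precondition, the result should agree with x % y.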
> Can you walk me through why the other changes?
As I said above, I was trying to make __remqu like __divqu, but it looks
like __divqu itself should be fixed in some places.
Thanks,
Uros.