RFC: MIPS: Workaround the "daddi" and "daddiu" errata
Maciej W. Rozycki
macro@ds2.pg.gda.pl
Thu Mar 18 00:39:00 GMT 2004
On Sat, 6 Mar 2004, Richard Sandiford wrote:
> > As you may know there are errata in the R4000 and the initial revision of
> > the R4400 processor, that lead to an incorrect execution of the "daddi"
> > and the "daddiu" instructions whan a two-complement overflow happens. I
> > believe the only way to work this problem around is to assume the affected
> > processors don't support these instructions at all. Fortunately, this can
> > be quite easily done by pretending these instructions are actually
> > macros and expanding them to a sequence of a "li" ("addiu" to $0) and a
> > "dadd" or a "daddu" instruction with an aid of a temporary register, like
> > this:
> >
> > daddiu $2,$3,0x1234 -> addiu $1,$0,0x1234; daddu $2,$3,$1
> > daddi $2,$3,0x1234 -> addiu $1,$0,0x1234; dadd $2,$3,$1
> >
> > This works quite well when $1 is available and could be implemented with
> > gas exclusively for that case. Unfortunately, "daddiu" is quite commonly
> > used in expansions of address references in macros.
>
> I guess there are two main ways of approaching an errata like this:
>
> (a) Stop gas & gcc from ever using daddiu.
> (b) Try to justify whether each use of daddiu is safe. Use an
> alternative if not.
>
> You've gone for (a) here, which I suppose you could argue is safer.
> Problem is, the patch ends up touching far more code than (b) would,
> and introduces far more special cases. I think it's going to be very
> hard to maintain the patch in the long-term. From that point of view,
> it could be argued that it's actually safer to restrict the workaround
> to the cases that really matter. There's less chance of things silently
> breaking in future.
I'm not so sure. I wanted to make absolutely sure the output is
verifiable for correctness and this achieved. You can run e.g. `objdump
-S <file> | grep 'daddi[u]*' on a binary and see a problem immediately.
It can also be handled quite easily by the kernel or the dynamic linker
when mmap()ping executable pages -- simply by scanning for the bad
opcodes. I'd say your proposal is more fragile -- you have to study
generated code thoroughly for every "daddiu" instruction to check if the
erratum can be triggered. And run-time validation is essentially
impossible (we don't want to do static analysis in the kernel, do we?).
> A quick look at the gcc patch suggests that the changes fall into
> three categories:
>
> (1) daddiu used to add an integer constant whose value is known to gcc
>
> Related hunks:
> * adddi3_internal_3
> * changes to other patterns that I don't think are necessary
>
> (2) daddiu used to add %lo(x) to an incomplete address
>
> Hunks related to general addresses:
> * mips_symbol_insns
> * mips_secondary_reload_class
> * all(?) 64-bit non-mips16 .md patterns that have a load or store
> alternative
> * various reload patterns
>
> Hunks related to branch handling:
> * mips_conditional_register_usage
> * mips_output_load_label
> * mips_output_conditional_branch
>
> (3) daddiu used to allocate and deallocate stack space in gcc
>
> Related hunks:
> * ASM_OUTPUT_REG_PUSH
> * ASM_OUTPUT_REG_POP
>
> And the errata we're working around is that:
>
> daddiu dst,src,xxxx
>
> will give the incorrect result in the following cases:
>
> (+) src = 0x7fff_ffff_ffff_yyyy && xxxx + yyyy > 0x10000
>
> RESULT: dst will be 0x0000_0000_0000_zzzz
> rather than 0x8000_0000_0000_zzzz
>
> (-) src = 0x8000_0000_0000_yyyy && -xxxx > yyyy
>
> RESULT: dst will be 0xffff_ffff_ffff_zzzz
> rather than 0x7fff_ffff_ffff_zzzz
>
> So looking at each of these in turn:
>
> (1) Definitely needs a workaround, no question about that.
>
> (2) Adding %lo(X) to the high part of X can only trigger case (-)
> since yyyy (the low 16 bits of src) will be 0. In this case,
> X must be of the form 0x7fff_ffff_ffff_8???, which is an invalid
> address, right? So there's no way this bug could trigger if X
> refers to a valid in-memory object.
>
> Of course, X could still be some kind of constant defined in a
> linker script, but, well... don't do that if you care about
> targets with a buggy daddiu. It hardly seems to justify the
> kind of code bloat it causes in the common case, nor the big
> maintenance burden.
>
> The same argument applies when "src" contains the high part of
> X plus an index. If X refers to an in-memory object, no valid
> index can cause the daddiu to overflow.
>
> Unfortunately, I don't know of any specific code in gcc to stop:
>
> &foo[y - x]
>
> from being expanded as:
>
> ...load high part of foo into dst...
> daddu dst,dst,y
> daddiu dst,dst,%lo(foo)
> dsubu dst,dst,x
>
> so perhaps we could trip over the errata when x and y are both
> humungous values. But I don't think this case justifies treating
> "daddiu ...,%lo(...)" as a complete no-no. Again, it leads to
> large code bloat and has a high maintenance burden.
>
> If you can demonstrate that &foo[y - x] is indeed a problem then
> I'd prefer a patch that addresses it head-on.
Note that even if the address is invalid, code should work correctly --
the address may be further offsetted later. Or code may expect a bad
address and deal with it somehow. Or there may be a real bug in code --
it should really get an exception then, instead of silently using a
different address.
> (3) In the case of ASM_OUTPUT_REG_PUSH, you're catering for case (-);
> i.e. the case where the stack pointer goes from 0x8000... to
> 0x7fff.... Since this represents the stack pointer moving from
> xkphys to an invalid address, it surely represents a program gone
> wrong. Even the original stack pointer value is a bit suspect ;)
>
> Perhaps you could argue that, since 0xffff... is a valid address,
> the behaviour of the broken daddiu is less friendly for error-detection
> and debugging purposes. But I don't think that's enough to justify
> the extra gcc maintenance burden, nor the cost of using two instructions
> to allocate small chunks of stack.
>
> Similar arguments apply for the POP case.
>
> Your patch also includes changes to patterns like ffsdi. This seems
> a bit over-the-top; there's no way counting from 0 to 64 can trigger
> overflow.
It's just the consequence of getting rid of the instructions universally,
which was my goal.
Note that a lot of these modifications is due to macros that are expanded
by gas to code sequences including a "daddiu". If we limit the workaround
to the explicit reloc case (which can now be done thanks to your recent
no-PIC support addition) a lot of changes can go away. That would be a
reasonable and acceptable for me approach.
Then, if all the patterns that want a "daddiu" call something like
emit_daddiu_insn() instead of emitting the instruction explicitly, then
suddenly the changes required may become limited to really a few places.
If you still worry about maintenance hassle of the change, then feel free
to offload it to me. Though with an easy detectability of a breakage, as
demonstrated above, I'll probably be first to notice it and provide an
update.
> Anyway, concentrating on (1):
>
> > @@ -1030,17 +1066,41 @@
> > (match_dup 3)))]
> > "")
> >
> > -(define_insn "adddi3_internal_3"
> > +(define_expand "adddi3_internal_3"
> > [(set (match_operand:DI 0 "register_operand" "=d,d")
> > (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
> > (match_operand:DI 2 "arith_operand" "d,Q")))]
> > "TARGET_64BIT && !TARGET_MIPS16"
> > - "@
> > - daddu\t%0,%z1,%2
> > - daddiu\t%0,%z1,%2"
> > + "")
> > +
> > +(define_insn "adddi3_internal_3a"
> > + [(set (match_operand:DI 0 "register_operand" "=d")
> > + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> > + (match_operand:DI 2 "register_operand" "d")))]
> > + "TARGET_64BIT && !TARGET_MIPS16"
> > + "daddu\t%0,%z1,%2"
> > + [(set_attr "type" "darith")
> > + (set_attr "mode" "DI")])
> > +
> > +(define_insn "adddi3_internal_3b"
> > + [(set (match_operand:DI 0 "register_operand" "=d")
> > + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> > + (match_operand:DI 2 "const_arith_operand" "Q")))]
> > + "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
> > + "daddiu\t%0,%z1,%2"
> > [(set_attr "type" "darith")
> > (set_attr "mode" "DI")])
> >
> > +(define_insn "adddi3_internal_3c"
> > + [(set (match_operand:DI 0 "register_operand" "=d")
> > + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> > + (match_operand:DI 2 "const_arith_operand" "Q")))]
> > + "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
> > + "%[addiu\\t%@,%.,%2\;daddu\t%0,%z1,%@%]"
> > + [(set_attr "type" "darith")
> > + (set_attr "mode" "DI")
> > + (set_attr "length" "8")])
> > +
> > ;; For the mips16, we need to recognize stack pointer additions
> > ;; explicitly, since we don't have a constraint for $sp. These insns
> > ;; will be generated by the save_restore_insns functions.
>
> Please don't split the existing pattern. One pattern with several
> alternatives gives reload more freedom than several patterns with
> single alternatives.
I wasn't aware of this -- is it missing from documentation, or wasn't I
careful enough reading it?
> I suggest the TARGET_NO_DADDIU version look something like:
>
> (define_insn "adddi3_internal_3_nodaddiu"
> [(set (match_operand:DI 0 "register_operand" "=d,d")
> (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
> (match_operand:DI 2 "arith_operand" "d,Q")))]
> "TARGET_64BIT && TARGET_NO_DADDIU"
> {
> if (which_alternative == 0)
> return "daddu\t%0,%z1,%2";
>
> if (operands[1] == stack_pointer_rtx)
> return "daddiu\t%0,%z1,%2";
>
> return "%[addiu\t%@,%.,%2\;daddu\t%0,%z1,%@%]";
> }
> [(set_attr "type" "darith")
> (set_attr "length" "8,4")])
>
> ...completely untested, might not even be syntactically valid.
> This overestimates the length of the stack pointer case,
> but that's not a problem.
I'll investigate your proposal. As I'm essentially done with my patches
functionality-wise, I'll do a short upgrade cycle to get in sync with the
trunk and then I'll get back with an update.
I don't think there can be any reduction done on the gas patch, OTOH.
With a suitable test case, code should be fairly easy to maintain, though.
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
More information about the Binutils
mailing list