This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH v3] Work around the NOP issue of Loongson2F
- From: Wu Zhangjin <wuzhangjin at gmail dot com>
- To: Richard Sandiford <rdsandiford at googlemail dot com>
- Cc: Nick Clifton <nickc at redhat dot com>, "Alfred M. Szmidt" <ams at gnu dot org>, zhangfx at lemote dot com, yanh at lemote dot com, binutils at sourceware dot org
- Date: Wed, 18 Nov 2009 09:10:56 +0800
- Subject: Re: [PATCH v3] Work around the NOP issue of Loongson2F
- References: <1258470143-1797-1-git-send-email-wuzhangjin@gmail.com> <87aaykrg7j.fsf@firetop.home>
- Reply-to: wuzhangjin at gmail dot com
On Tue, 2009-11-17 at 21:31 +0000, Richard Sandiford wrote:
[...]
>
> > On Tue, 2009-11-17 at 18:11 +0800, zhangfx wrote:
> > > I think due to the source of the bug, we can leave alone .align, .fill
> > > implementations. Only when they are used in a sequence of short basic
> > > blocks they can be dangerous.
> > >
> > > e.g.
> > > beq L1;
> > > nop
> > > L1:
> > > beq L2;
> > > nop
> > > L2:
> > > beq L3;
> > > nop
> > > ...
> > >
> >
> > Thanks, later, I will send the patch out without considering the .align
> > & .fill issue.
>
> Can the issue occur with:
>
> beq L1;
> any insn <--- note
> nop
> L1:
> beq L2;
> nop
> L2:
> beq L3;
> nop
>
> ?
> If so, then we do need to handle the .align case, since it is quite
> common to use ".align" before labels.
>
> Even if the answer's no, I'm uncomfortable with one kind of nop being
> treated differently.
The same to me, I will try to find out how does the .align and .fill
works and why it uses the 0 as the nop but doesn't use the NOP_INSN
initialized in md_begin(). what about your suggestion here? I have tried
to return NOP_INSN->insn_opcode in mips_nop_opcode() and modify
mips_handle_align():
- md_number_to_chars (p, mips16_nop_insn.insn_opcode, 2);
+ md_number_to_chars (p, NOP_INSN->insn_opcode, 2);
It did generate the source code like this: 8250825, but not what we
want. and if I only change the function mips_nop_opcode(), the result is
also wrong. is there a convenient method to do the whole thing
for .align and .fill?
>
> > "The nature of the erratum is deeply related to the microarchitecture of
> > Loongson-2. It uses roughly a 4-way superscalar dynamically scheduled core,
> > instructions are excuted as much as possible in parallel with technics like
> > branch prediction etc. We use a 8-entry internal branch prediction queue to
> > keep track of each predicted branches, if some branches are proved to be
> > wrongly predicted, all the instructions following it will be cancelled,together
> > with the resources used by them, including the registers used for renaming, and
> > the queue entry will be freeed. There is a bug that might cause a hang when the
> > queue is full(some resources might been leaked due to conflict branch entries),
> > the workaround is to reduce the possiblity of branch queue full by using
> > renaming registers(they are also limited, can prevent too many simutaneos
> > branches). In theory this is still not enough to fully eliminate possible
> > hangs, but the possiblity is extremely low now and hard to be hit in real
> > code."
>
> That's a good description of the problem, but I'm still not sure I understand
> why using $at (in particular) is the chosen workaround.
>
They told me $at, or something else is okay, the key to prevent problem
is something like this:
addu at, at, zero
addu v0, v0, zero
......
We use $at here is only one of the choices.
> > @@ -7361,6 +7367,11 @@ macro (struct mips_cl_insn *ip)
> > move_register (dreg, sreg);
> > break;
> >
> > + case M_NOP:
> > + if (mips_fix_loongson2f)
> > + append_insn (NOP_INSN, NULL, unused_reloc);
> > + break;
> > +
>
> This should be unconditional. "nop" has to expand to something
> regardless of whether -mfix-loongson2f is used.
Okay, will remove the condition later.
>
> > +00000000 <.text>:
> > +.*: 00200825 move \$1,\$1
> > +.*: 10000001 b 0xc
> > +.*: 00200825 move \$1,\$1
> > +.*: 00200825 move \$1,\$1
> > + ...
> > +.*: 00200825 move \$1,\$1
>
> It isn't really appropriate to replace the final address with ".*" here,
> since we want to check that the move is at offset 0x20.
get it, thanks!
Regards,
Wu Zhangjin