This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH 2/4] i386: Align branches within a fixed boundary
On Fri, Dec 6, 2019 at 1:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.12.2019 02:04, H.J. Lu wrote:
> > On Thu, Dec 5, 2019 at 12:04 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 05.12.2019 00:55, H.J. Lu wrote:
> >>> On Wed, Dec 4, 2019 at 2:57 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 03.12.2019 20:02, H.J. Lu wrote:
> >>>>> +static int
> >>>>> +maybe_fused_with_jcc_p (void)
> >>>>> +{
> >>>>> + /* No RIP address. */
> >>>>> + if (i.base_reg && i.base_reg->reg_num == RegIP)
> >>>>> + return 0;
> >>>>> +
> >>>>> + /* and, add, sub with destination register. */
> >>>>> + if ((i.tm.base_opcode | 5) == 0x25
> >>>>> + || (i.tm.base_opcode | 5) == 0x5
> >>>>> + || (i.tm.base_opcode >= 0x28 && i.tm.base_opcode <= 0x2d)
> >>>>
> >>>> Why not the same "| 5" approach here as above? And why then not
> >>>> further fold AND and SUB (using "| 0xd" along the lines of ...
> >>>
> >>> It turns out that (i.tm.base_opcode | 5) == 0x25 doesn't work since
> >>> 5 doesn't contain consecutive bits. I need to use
> >>>
> >>> i.tm.base_opcode >= 0x20 && i.tm.base_opcode <= 0x25
> >>
> >> Then use "| 7", as suggested further down, since ...
> >
> > "| 7" == 0x27 will match 0x20 to 0x27. But we only want 0x20 to 0x25.
>
> I understand this, but the situation here is the same as below.
>
> >>>>> + || ((i.tm.base_opcode | 3) == 0x83
> >>>>> + && ((i.tm.extension_opcode | 1) == 0x5
> >>>>
> >>>> ... this?
> >>>>
> >>>>> + || i.tm.extension_opcode == 0x0)))
> >>>>> + return (i.types[1].bitfield.class == Reg
> >>>>> + || i.types[1].bitfield.instance == Accum);
> >>
> >> ... this will produce 0 for opcodes 0x26 and 0x27.
> >
> > Are you suggesting checking i.types for opcode? I
> > don't think it is a good idea.
>
> I'm afraid I don't understand "checking i.types for opcode". What
> I've been trying to explain is that, because here we're way past
> any parsing and template matching, you can take for given that the
> specified operands (i.e. i.types[]) match the recorded template
> (i.tm). I.e. for opcodes 0x26 and 0x27 the i.types[] checks will
> ensure that the function returns 0.
>
> >>>>> + /* test, cmp with any register. */
> >>>>> + if ((i.tm.base_opcode | 1) == 0x85
> >>>>> + || (i.tm.base_opcode | 1) == 0xa9
> >>>>> + || (i.tm.base_opcode | 1) == 0xf7
> >>>>
> >>>> This needs further qualification by extension opcode, to avoid
> >>>> covering not, neg, mul, etc.
> >>>
> >>> I am adding i.tm.extension_opcode == 0.
> >>>
> >>>>> + || (i.tm.base_opcode >= 0x38 && i.tm.base_opcode <= 0x3d)
> >>>>
> >>>> Along the lines of the above use "| 7" here, seeing that ...
> >>>
> >>> We can't use "| 7" since it will match 0x3e and 0x3f.
> >>>
> >>>>> + || ((i.tm.base_opcode | 3) == 0x83
> >>>>> + && (i.tm.extension_opcode == 0x7)))
> >>>>> + return (i.types[0].bitfield.class == Reg
> >>>>> + || i.types[0].bitfield.instance == Accum
> >>>>> + || i.types[1].bitfield.class == Reg
> >>>>> + || i.types[1].bitfield.instance == Accum);
> >>>>
> >>>> ... this will also be correct (returning 0) for AAS?
> >>>
> >>> AAS can't be fused with jcc.
> >>
> >> Of course, but my comment did cover this fact: The expression of
> >> the return statement will produce 0 for it, which is what you
> >> want (aiui).
> >
> > We shouldn't check types for opcode.
>
> As above - I don't see / understand the problem.
>
I want to only check base_opcode and extension_opcode for
instructions. i.types is only used to check for operands. It is
more future proof.
--
H.J.