[PATCH 1/3] RISC-V: re-arrange opcode table for consistent alias handling

Fangrui Song i@maskray.me
Fri Jul 14 21:25:19 GMT 2023


On Wed, Jul 12, 2023 at 1:15 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 11.07.2023 23:02, Fangrui Song wrote:
> > On Fri, Sep 16, 2022 at 2:53 AM Nelson Chu <nelson@rivosinc.com> wrote:
> >>
> >> On Thu, Sep 15, 2022 at 3:42 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>
> >>> On 15.09.2022 04:30, Nelson Chu wrote:
> >>>> On Tue, Sep 13, 2022 at 9:02 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>> --- a/opcodes/riscv-opc.c
> >>>>> +++ b/opcodes/riscv-opc.c
> >>>>> @@ -290,9 +290,9 @@ const struct riscv_opcode riscv_opcodes[
> >>>>>  {"jalr",        0, INSN_CLASS_I, "d,s,j",     MATCH_JALR, MASK_JALR, match_opcode, INSN_JSR },
> >>>>>  {"j",           0, INSN_CLASS_C, "Ca",        MATCH_C_J, MASK_C_J, match_opcode, INSN_ALIAS|INSN_BRANCH },
> >>>>>  {"j",           0, INSN_CLASS_I, "a",         MATCH_JAL, MASK_JAL|MASK_RD, match_opcode, INSN_ALIAS|INSN_BRANCH },
> >>>>> +{"jal",         0, INSN_CLASS_I, "a",         MATCH_JAL|(X_RA << OP_SH_RD), MASK_JAL|MASK_RD, match_opcode, INSN_ALIAS|INSN_JSR },
> >>>>>  {"jal",         0, INSN_CLASS_I, "d,a",       MATCH_JAL, MASK_JAL, match_opcode, INSN_JSR },
> >>>>>  {"jal",        32, INSN_CLASS_C, "Ca",        MATCH_C_JAL, MASK_C_JAL, match_opcode, INSN_ALIAS|INSN_JSR },
> >>>>> -{"jal",         0, INSN_CLASS_I, "a",         MATCH_JAL|(X_RA << OP_SH_RD), MASK_JAL|MASK_RD, match_opcode, INSN_ALIAS|INSN_JSR },
> >>>>>  {"call",        0, INSN_CLASS_I, "d,c",       (X_T1 << OP_SH_RS1), (int) M_CALL, match_never, INSN_MACRO },
> >>>>>  {"call",        0, INSN_CLASS_I, "c",         (X_RA << OP_SH_RS1)|(X_RA << OP_SH_RD), (int) M_CALL, match_never, INSN_MACRO },
> >>>>>  {"tail",        0, INSN_CLASS_I, "c",         (X_T1 << OP_SH_RS1), (int) M_CALL, match_never, INSN_MACRO },
> >>>>> @@ -310,13 +310,13 @@ const struct riscv_opcode riscv_opcodes[
> >>>>>  {"move",        0, INSN_CLASS_C, "d,CV",      MATCH_C_MV, MASK_C_MV, match_c_add, INSN_ALIAS },
> >>>>>  {"move",        0, INSN_CLASS_I, "d,s",       MATCH_ADDI, MASK_ADDI|MASK_IMM, match_opcode, INSN_ALIAS },
> >>>>>  {"zext.b",      0, INSN_CLASS_I, "d,s",       MATCH_ANDI|ENCODE_ITYPE_IMM (255), MASK_ANDI | MASK_IMM, match_opcode, INSN_ALIAS },
> >>>>> -{"andi",        0, INSN_CLASS_C, "Cs,Cw,Co",  MATCH_C_ANDI, MASK_C_ANDI, match_opcode, INSN_ALIAS },
> >>>>> -{"andi",        0, INSN_CLASS_I, "d,s,j",     MATCH_ANDI, MASK_ANDI, match_opcode, 0 },
> >>>>>  {"and",         0, INSN_CLASS_C, "Cs,Cw,Ct",  MATCH_C_AND, MASK_C_AND, match_opcode, INSN_ALIAS },
> >>>>>  {"and",         0, INSN_CLASS_C, "Cs,Ct,Cw",  MATCH_C_AND, MASK_C_AND, match_opcode, INSN_ALIAS },
> >>>>>  {"and",         0, INSN_CLASS_C, "Cs,Cw,Co",  MATCH_C_ANDI, MASK_C_ANDI, match_opcode, INSN_ALIAS },
> >>>>>  {"and",         0, INSN_CLASS_I, "d,s,t",     MATCH_AND, MASK_AND, match_opcode, 0 },
> >>>>>  {"and",         0, INSN_CLASS_I, "d,s,j",     MATCH_ANDI, MASK_ANDI, match_opcode, INSN_ALIAS },
> >>>>> +{"andi",        0, INSN_CLASS_C, "Cs,Cw,Co",  MATCH_C_ANDI, MASK_C_ANDI, match_opcode, INSN_ALIAS },
> >>>>> +{"andi",        0, INSN_CLASS_I, "d,s,j",     MATCH_ANDI, MASK_ANDI, match_opcode, 0 },
> >>>>
> >>>> Doesn't ANDI a base instruction?
> >>>
> >>> Of course. Like for all aliases, there is a corresponding base
> >>> instruction. I guess I simply don't understand what you mean to
> >>> express with the question.
> >>>
> >>>>  The operand "d,s,j" of AND is an
> >>>> alias of ANDI, so the original order seems correct.  Always dump *.i
> >>>> instructions to the non-i type looks weird, and llvm-dump seems has
> >>>> the same behavior as current GNU objdump.
> >>>>
> >>>> % cat tmp.s
> >>>> and a0, a1, 0x10
> >>>> % riscv64-unknown-elf-as tmp.s -o tmp.o
> >>>> % riscv64-unknown-elf-objdump -d tmp.o
> >>>>
> >>>> tmp.o:     file format elf64-littleriscv
> >>>>
> >>>>
> >>>> Disassembly of section .text:
> >>>>
> >>>> 0000000000000000 <.text>:
> >>>>
> >>>>    0: 0105f513          and a0,a1,16
> >>>
> >>> What's weird about that? And if that's weird, would you mind spelling
> >>> out the conditions under which aliases are to be preferred over base
> >>> instructions when disassembling? There actually is a "These aliases are
> >>> for assembly but not disassembly" comment somewhere in the file,
> >>> clarifying for two of the aliases that they ought to come after their
> >>> base insns. But for all other aliases which aren't simply a different
> >>> (but not shorter) name for the same insn (e.g. "bgt" vs "blt") I'd
> >>> assume the aliases should be preferred, for the reason stated in the
> >>> patch description. That said - I can see it being a matter of taste
> >>> for <insn>i vs <insn>, but if so this should be spelled out somewhere.
> >>
> >> Yeah, that's what I worried about.  At the beginning, I think dumping
> >> a base instruction as another base instruction looks weird.  But these
> >> days I also noticed that - we also dump compressed instructions as
> >> base i without "c." prefixes, so why I feel weird is just that I'm
> >> used to it because of historical behavior.  I have no objection to
> >> this, so please go ahead if there are no objections for a period of
> >> time.  But if there are any objections, then we probably can mark
> >> these aliases by something like INSN_ALIAS_CANNOT_DUMP in the opcode
> >> table, and that's what Kito suggested to me before, but I didn't think
> >> it was a serious problem at the time.
> >>
> >> Thanks
> >> Nelson
> >
> > I apologize as I haven't read all prior discussions. For many
> > instructions, the "i" form is written in the ISA manual and prevalent.
>
> Why "prevalent"? The "i"-less forms are mentioned there as well, aren't
> they? Then why not use them ...

I think aliases like "add rd,rs,imm" (without "i") should be treated
as deprecated aliases that we keep just for compatibility.
We should not endorse the use cases by making objdump -d "prefer" this form.

When I brought up the topic in the #riscv channel on libera.chat, I
got a lot of complaints about having these "deprecated" aliases from
folks, including courmisch, jrtc27, muurkha, sorear.
As I happen to know two folks' email addresses, I have CCed them in this thread.

> > I wonder whether we can give these add/and/xor/etc without "i" lower
> > priority so that objdump -d will not show them, even without using -M
> > no-aliases.
>
> ... unless use of aliases was suppressed? In other arches' assembly
> that I know (to some degree) and which knows the concept of aliases,
> aliases are typically the preferred way of disassembling, for
> typically producing easier to grok output.

AIUI aliases for other architectures are indeeded preferred (use my
understanding of llvm-objdump -d output).
This RISC-V case is not, though.


> There are other aspects to consider here, related to the handling of
> equates in assembly sources. I did bring this up before, but it feels
> like a minefield - it first would need firmly establishing what exactly
> assembler behavior is intended to be. Aiui no-one has properly thought
> of this, including for the (surprisingly similar) handling in tc-mips.c
> (making me guess that RISC-V code may have been derived from that).
>
> Jan

My knowledge about GNU assembler is quite limited...
I hope that other RISC-V folks can answer this question.


> > % cat b.s
> > add a0,a1,13
> > and a2,a3,4
> > xor a2,a3,4
> > or a2,a3,4
> > sll a2,a3,4
> > % riscv64-linux-gnu-gcc -c b.s
> > % ~/Dev/binutils-gdb/out/riscv64/binutils/objdump -d b.o
> >
> > b.o:     file format elf64-littleriscv
> >
> >
> > Disassembly of section .text:
> >
> > 0000000000000000 <.text>:
> >    0:   00d58513                add     a0,a1,13
> >    4:   0046f613                and     a2,a3,4
> >    8:   0046c613                xor     a2,a3,4
> >    c:   0046e613                or      a2,a3,4
> >   10:   00469613                sll     a2,a3,0x4
> >
> >
> > When LLVM integrated assembler added these aliases
> > (https://reviews.llvm.org/D50046), these instructions are assigned a
> > low priority "let EmitPriority = 0" so llvm-objdump -d will never show
> > them.
>


More information about the Binutils mailing list