[PATCH v7 1/5] RISC-V: Print highest address on disassembler

Nelson Chu nelson@rivosinc.com
Thu Aug 25 05:07:29 GMT 2022


On Wed, Aug 24, 2022 at 8:06 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
>
> Hi Nelson, good to hear you again.
> I see why you were so busy.
>
> On 2022/08/24 20:22, Nelson Chu wrote:
> > Hi Tsukasa,
> >
> > I think there are three disassembler issues here,
> >
> > 1. PR29342
> > 2. The target address of jump
> > 3. Should we show the target address when it is -1
> >
> > Therefore, three separate patches look reasonable.  But I would suggest
> > adding test cases to each patch wherever possible, rather than adding
> > them all together at the end.  Otherwise it is really hard to verify
> > the correctness of the separate patches.
>
> I apologize for that.  I am making a series of patches to make the
> disassembler faster and more reliable and... I have the following branch
> graph.  And... that's only a part of it (the part that requires
> riscv-dis-fix-addr).  Aside from the following branches, I have...
> eleven... branches that can be directly applied to master.
>
> 1. riscv-dis-fix-addr (This patchset)
>   2. riscv-dis-opts-batch-1
>      (Roughly 25% performance improvements [more general])
>     3. riscv-dis-rv32e (disassembler support for RV32E)
>     4. riscv-dis-data-large (highly debatable: .long dumping on RV64)
>     5. riscv-dis-generics (fix disassembling some zext.h/pack/packw)
>     6. riscv-dis-arch-priv-spec
>        (sometimes "real" arch and priv doesn't match with ELF attrs
>         [e.g. OpenSBI] and JTAG-based debugging may require ability to
>         override some disassembler parameters on the fly)
>   7. riscv-dis-opts-batch-2
>      (Not general as batch 1 but very effective on libraries; I observed
>       nearly 10x improvements on disassembling libcrypto.so)
>     8. riscv-dis-reduce-fp-on-addresses
>        (reduce false positives on address printing)
>
> Yes, clearly I was impatient.  Not splitting up the problem and
> effectively forcing an all-or-nothing choice on you was rude, and I
> sincerely apologize for that.
>
> So, let me split this patchset based on your comments.

Thanks for doing this; it will make the review easier and also help
ensure the correctness of each patch.

> >
> > This patch refers to issue 3.  So according to the source code, we won't
> > show the final target address when pd->print_addr is -1, which means
> > the address is exactly -1 or it is the default value.  I think it is
> > really rare to jump or refer to a symbol whose value is -1.  Besides
> > that, not showing the target address when its value is -1 doesn't look
> > wrong to me, so personally I would like to keep the original behavior.
>
> Still I think... if we have a chance to "fix" this (without any
> significant performance penalty), why not?  So I stick to my original
> opinion (except I will split this change into a separate patchset for
> separate review).

OK, I just prefer not to change the behavior unless it is a serious
bug, since changing the disassembler output may break other projects'
test cases or analysis tools.  But yeah, -1 may still be used as an
address, even if only in corner cases.  Anyway, I'm okay with this
change for now, if there are no objections for a while.  I think we
should just add these changes to NEWS, so people will know it's time
to update their test cases or tools.
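
To make the corner case concrete, here is a minimal sketch of the two
approaches.  It is not the actual opcodes/riscv-dis.c code: `vma_t' is a
hypothetical stand-in for bfd_vma, and the struct and helper names are
made up for illustration.  It only shows why a -1 sentinel cannot
represent the highest address, while a separate flag (like
`to_print_addr' in the patch) can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for bfd_vma (assumed 64-bit unsigned here).  */
typedef uint64_t vma_t;

/* Sentinel scheme: the value -1 doubles as "nothing to print".  */
struct sentinel_state
{
  vma_t print_addr;
};

/* Flag scheme: validity is tracked separately from the value.  */
struct flag_state
{
  vma_t print_addr;
  bool to_print_addr;
};

static void
sentinel_init (struct sentinel_state *s)
{
  s->print_addr = (vma_t) -1;
}

static void
sentinel_set (struct sentinel_state *s, vma_t addr)
{
  s->print_addr = addr;
}

/* Returns false for the highest address, even if it was set on purpose.  */
static bool
sentinel_should_print (const struct sentinel_state *s)
{
  return s->print_addr != (vma_t) -1;
}

static void
flag_init (struct flag_state *s)
{
  s->print_addr = 0;
  s->to_print_addr = false;
}

static void
flag_set (struct flag_state *s, vma_t addr)
{
  s->print_addr = addr;
  s->to_print_addr = true;
}

/* Any address value, including the highest one, can be printed.  */
static bool
flag_should_print (const struct flag_state *s)
{
  return s->to_print_addr;
}
```

With the sentinel scheme, setting the address to all-ones is
indistinguishable from never setting it; with the flag scheme the two
cases stay separate, at the cost of one extra field.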

Thanks
Nelson

> Thanks,
> Tsukasa
>
> > Unless there are other users who really want to change the behavior, or
> > LLVM is doing something different.  However, the gp change makes
> > sense to me.
> >
> > Thanks
> > Nelson
> >
> > On Wed, Aug 24, 2022 at 9:28 AM Tsukasa OI via Binutils
> > <binutils@sourceware.org> wrote:
> >
> >     This patch makes it possible to print the highest address (0xffffffff
> >     on RV32, 0xffffffff_ffffffff on RV64).  This is particularly useful if
> >     the highest address space is used for I/O registers and corresponding
> >     symbols are defined.
> >
> >     opcodes/ChangeLog:
> >
> >             * riscv-dis.c (struct riscv_private_data): Add `to_print_addr' and
> >             `has_gp' to enable printing the highest address.
> >             (maybe_print_address): Utilize `to_print_addr' and `has_gp'.
> >             (riscv_disassemble_insn): Likewise.
> >     ---
> >      opcodes/riscv-dis.c | 22 ++++++++++++++++------
> >      1 file changed, 16 insertions(+), 6 deletions(-)
> >
> >     diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
> >     index 164fd209dbd..c6d80c3ba49 100644
> >     --- a/opcodes/riscv-dis.c
> >     +++ b/opcodes/riscv-dis.c
> >     @@ -52,6 +52,8 @@ struct riscv_private_data
> >        bfd_vma gp;
> >        bfd_vma print_addr;
> >        bfd_vma hi_addr[OP_MASK_RD + 1];
> >     +  bool to_print_addr;
> >     +  bool has_gp;
> >      };
> >
> >      /* Used for mapping symbols.  */
> >     @@ -177,10 +179,13 @@ maybe_print_address (struct riscv_private_data *pd, int base_reg, int offset,
> >            pd->print_addr = (base_reg != 0 ? pd->hi_addr[base_reg] : 0) + offset;
> >            pd->hi_addr[base_reg] = -1;
> >          }
> >     -  else if (base_reg == X_GP && pd->gp != (bfd_vma)-1)
> >     +  else if (base_reg == X_GP && pd->has_gp)
> >          pd->print_addr = pd->gp + offset;
> >        else if (base_reg == X_TP || base_reg == 0)
> >          pd->print_addr = offset;
> >     +  else
> >     +    return;
> >     +  pd->to_print_addr = true;
> >
> >        /* Sign-extend a 32-bit value to a 64-bit value.  */
> >        if (wide)
> >     @@ -595,14 +600,19 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
> >            int i;
> >
> >            pd = info->private_data = xcalloc (1, sizeof (struct riscv_private_data));
> >     -      pd->gp = -1;
> >     -      pd->print_addr = -1;
> >     +      pd->gp = 0;
> >     +      pd->print_addr = 0;
> >            for (i = 0; i < (int)ARRAY_SIZE (pd->hi_addr); i++)
> >             pd->hi_addr[i] = -1;
> >     +      pd->to_print_addr = false;
> >     +      pd->has_gp = false;
> >
> >            for (i = 0; i < info->symtab_size; i++)
> >             if (strcmp (bfd_asymbol_name (info->symtab[i]), RISCV_GP_SYMBOL) == 0)
> >     -         pd->gp = bfd_asymbol_value (info->symtab[i]);
> >     +         {
> >     +           pd->gp = bfd_asymbol_value (info->symtab[i]);
> >     +           pd->has_gp = true;
> >     +         }
> >          }
> >        else
> >          pd = info->private_data;
> >     @@ -662,13 +672,13 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
> >               print_insn_args (op->args, word, memaddr, info);
> >
> >               /* Try to disassemble multi-instruction addressing sequences.  */
> >     -         if (pd->print_addr != (bfd_vma)-1)
> >     +         if (pd->to_print_addr)
> >                 {
> >                   info->target = pd->print_addr;
> >                   (*info->fprintf_styled_func)
> >                     (info->stream, dis_style_comment_start, " # ");
> >                   (*info->print_address_func) (info->target, info);
> >     -             pd->print_addr = -1;
> >     +             pd->to_print_addr = false;
> >                 }
> >
> >               /* Finish filling out insn_info fields.  */
> >     --
> >     2.34.1
> >

