[PATCH v7 1/5] RISC-V: Print highest address on disassembler

Tsukasa OI research_trasio@irq.a4lg.com
Wed Aug 24 12:06:50 GMT 2022


Hi Nelson, good to hear you again.
I see why you were so busy.

On 2022/08/24 20:22, Nelson Chu wrote:
> Hi Tsukasa,
> 
> I think there are three dis-assembler issues here,
> 
> 1. PR29342
> 2. The target address of jump
> 3. Should we show the target address when it is -1
> 
> Therefore, three separate patches looks reasonable.  But I would suggest
> that we should add test cases for each patch as possible as we can,
> rather than add them together at last.  That is because it is really
> hard to make sure the correctness of the separate patches.

I apologize about that.  I am making a series of patches to make the
disassembler faster and more reliable and... I have following branch
graph.  And... that's the only a part of it (that requires
riscv-dis-fix-addr).  Aside from the following branches, I have...
eleven... branches that can be directly applied to master.

1. riscv-dis-fix-addr (This patchset)
  2. riscv-dis-opts-batch-1
     (Roughly 25% performance improvements [more general])
    3. riscv-dis-rv32e (disassembler support for RV32E)
    4. riscv-dis-data-large (highly debatable: .long dumping on RV64)
    5. riscv-dis-generics (fix disassembling some zext.h/pack/packw)
    6. riscv-dis-arch-priv-spec
       (sometimes "real" arch and priv doesn't match with ELF attrs
        [e.g. OpenSBI] and JTAG-based debugging may require ability to
        override some disassembler parameters on the fly)
  7. riscv-dis-opts-batch-2
     (Not general as batch 1 but very effective on libraries; I observed
      nearly 10x improvements on disassembling libcrypto.so)
    8. riscv-dis-reduce-fp-on-addresses
       (reduce false positives on address printing)

Yes, clearly I was impatient.  Not splitting up the problem and
effectively forcing all or nothing to you is a rude behavior and I
sincerely apologize about that.

So, let me split this patchset based on your comments.

> 
> This patch refers to issue 3.  So according to the source code, we won't
> show the final target address when pd->print_addr is -1, which means
> the address is exactly -1 or it is the default value.  I think it is
> really rare to jump or refer to the symbol whose value is -1.  Besides
> that, not showing the target address when it's value is -1 doesn't look
> wrong to me, so personally I would like to keep the original behavior. 

Still I think... if we have a chance to "fix" this (without any
significant performance penalty), why not?  So I stick to my original
opinion (except I will split this change into a separate patchset for
separate review).

Thanks,
Tsukasa

> Unless there are other users who really want to change the behavior, or
> llvm are doing something different.  However, the change of gp makes
> sense, so it looks to me.
> 
> Thanks
> Nelson
> 
> On Wed, Aug 24, 2022 at 9:28 AM Tsukasa OI via Binutils
> <binutils@sourceware.org <mailto:binutils@sourceware.org>> wrote:
> 
>     This patch makes possible to print the highest address (0xffffffff
>     on RV32,
>     0xffffffff_ffffffff on RV64).  This is particularly useful if the
>     highest
>     address space is used for I/O registers and corresponding symbols
>     are defined.
> 
>     opcodes/ChangeLog:
> 
>             * riscv-dis.c (struct riscv_private_data): Add
>     `to_print_addr' and
>             `has_gp' to enable printing the highest address.
>             (maybe_print_address): Utilize `to_print_addr' and `has_gp'.
>             (riscv_disassemble_insn): Likewise.
>     ---
>      opcodes/riscv-dis.c | 22 ++++++++++++++++------
>      1 file changed, 16 insertions(+), 6 deletions(-)
> 
>     diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
>     index 164fd209dbd..c6d80c3ba49 100644
>     --- a/opcodes/riscv-dis.c
>     +++ b/opcodes/riscv-dis.c
>     @@ -52,6 +52,8 @@ struct riscv_private_data
>        bfd_vma gp;
>        bfd_vma print_addr;
>        bfd_vma hi_addr[OP_MASK_RD + 1];
>     +  bool to_print_addr;
>     +  bool has_gp;
>      };
> 
>      /* Used for mapping symbols.  */
>     @@ -177,10 +179,13 @@ maybe_print_address (struct riscv_private_data
>     *pd, int base_reg, int offset,
>            pd->print_addr = (base_reg != 0 ? pd->hi_addr[base_reg] : 0)
>     + offset;
>            pd->hi_addr[base_reg] = -1;
>          }
>     -  else if (base_reg == X_GP && pd->gp != (bfd_vma)-1)
>     +  else if (base_reg == X_GP && pd->has_gp)
>          pd->print_addr = pd->gp + offset;
>        else if (base_reg == X_TP || base_reg == 0)
>          pd->print_addr = offset;
>     +  else
>     +    return;
>     +  pd->to_print_addr = true;
> 
>        /* Sign-extend a 32-bit value to a 64-bit value.  */
>        if (wide)
>     @@ -595,14 +600,19 @@ riscv_disassemble_insn (bfd_vma memaddr,
>     insn_t word, disassemble_info *info)
>            int i;
> 
>            pd = info->private_data = xcalloc (1, sizeof (struct
>     riscv_private_data));
>     -      pd->gp = -1;
>     -      pd->print_addr = -1;
>     +      pd->gp = 0;
>     +      pd->print_addr = 0;
>            for (i = 0; i < (int)ARRAY_SIZE (pd->hi_addr); i++)
>             pd->hi_addr[i] = -1;
>     +      pd->to_print_addr = false;
>     +      pd->has_gp = false;
> 
>            for (i = 0; i < info->symtab_size; i++)
>             if (strcmp (bfd_asymbol_name (info->symtab[i]),
>     RISCV_GP_SYMBOL) == 0)
>     -         pd->gp = bfd_asymbol_value (info->symtab[i]);
>     +         {
>     +           pd->gp = bfd_asymbol_value (info->symtab[i]);
>     +           pd->has_gp = true;
>     +         }
>          }
>        else
>          pd = info->private_data;
>     @@ -662,13 +672,13 @@ riscv_disassemble_insn (bfd_vma memaddr,
>     insn_t word, disassemble_info *info)
>               print_insn_args (op->args, word, memaddr, info);
> 
>               /* Try to disassemble multi-instruction addressing
>     sequences.  */
>     -         if (pd->print_addr != (bfd_vma)-1)
>     +         if (pd->to_print_addr)
>                 {
>                   info->target = pd->print_addr;
>                   (*info->fprintf_styled_func)
>                     (info->stream, dis_style_comment_start, " # ");
>                   (*info->print_address_func) (info->target, info);
>     -             pd->print_addr = -1;
>     +             pd->to_print_addr = false;
>                 }
> 
>               /* Finish filling out insn_info fields.  */
>     -- 
>     2.34.1
> 


More information about the Binutils mailing list