[PATCH v7 1/5] RISC-V: Print highest address on disassembler
Tsukasa OI
research_trasio@irq.a4lg.com
Wed Aug 24 12:06:50 GMT 2022
Hi Nelson, good to hear from you again.
I see why you were so busy.
On 2022/08/24 20:22, Nelson Chu wrote:
> Hi Tsukasa,
>
> I think there are three dis-assembler issues here,
>
> 1. PR29342
> 2. The target address of jump
> 3. Should we show the target address when it is -1
>
> Therefore, three separate patches look reasonable. But I would suggest
> that we add test cases for each patch as far as we can, rather than
> adding them all together at the end. That is because it is really
> hard to verify the correctness of the separate patches.
I apologize for that. I am preparing a series of patches to make the
disassembler faster and more reliable, and I have the following branch
graph. And... that is only a part of it (the part that requires
riscv-dis-fix-addr). Aside from the following branches, I have...
eleven... branches that can be applied directly to master.
1. riscv-dis-fix-addr (This patchset)
2. riscv-dis-opts-batch-1
    (Roughly 25% performance improvements [more general])
3. riscv-dis-rv32e (disassembler support for RV32E)
4. riscv-dis-data-large (highly debatable: .long dumping on RV64)
5. riscv-dis-generics (fix disassembling some zext.h/pack/packw)
6. riscv-dis-arch-priv-spec
    (sometimes the "real" arch and priv spec don't match the ELF attrs
    [e.g. OpenSBI], and JTAG-based debugging may require the ability to
    override some disassembler parameters on the fly)
7. riscv-dis-opts-batch-2
    (Not as general as batch 1 but very effective on libraries; I observed
    nearly 10x improvements on disassembling libcrypto.so)
8. riscv-dis-reduce-fp-on-addresses
    (reduce false positives on address printing)
Yes, clearly I was impatient. Not splitting up the problem and
effectively forcing an all-or-nothing choice on you was rude, and I
sincerely apologize for that.
So, let me split this patchset based on your comments.
>
> This patch refers to issue 3. So according to the source code, we won't
> show the final target address when pd->print_addr is -1, which means
> the address is exactly -1 or it is the default value. I think it is
> really rare to jump or refer to the symbol whose value is -1. Besides
> that, not showing the target address when its value is -1 doesn't look
> wrong to me, so personally I would like to keep the original behavior.
Still I think... if we have a chance to "fix" this (without any
significant performance penalty), why not? So I stick to my original
opinion (except I will split this change into a separate patchset for
separate review).
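
To illustrate the difference, here is a minimal standalone sketch. It is
not the actual opcodes code: `vma_t', the two structs and the helper
functions are hypothetical stand-ins for bfd_vma and the relevant fields
of riscv_private_data. It only shows that a -1 sentinel cannot tell "no
address" apart from "the highest address", while a separate flag can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for bfd_vma (assumed to be 64 bits wide here).  */
typedef uint64_t vma_t;

/* Sentinel scheme: the value -1 doubles as "nothing to print".  */
struct sentinel_state
{
  vma_t print_addr;
};

/* Flag scheme: validity is tracked by a separate boolean.  */
struct flag_state
{
  vma_t print_addr;
  bool to_print_addr;
};

/* Record a target address computed as BASE + OFFSET (sentinel scheme).  */
static void
sentinel_set (struct sentinel_state *s, vma_t base, int offset)
{
  s->print_addr = base + (vma_t) (int64_t) offset;
}

/* Record a target address and mark it as valid (flag scheme).  */
static void
flag_set (struct flag_state *s, vma_t base, int offset)
{
  s->print_addr = base + (vma_t) (int64_t) offset;
  s->to_print_addr = true;
}

/* The sentinel scheme refuses to print an address that equals -1.  */
static bool
sentinel_should_print (const struct sentinel_state *s)
{
  return s->print_addr != (vma_t) -1;
}

/* The flag scheme prints whenever an address was recorded.  */
static bool
flag_should_print (const struct flag_state *s)
{
  return s->to_print_addr;
}
```

With base 0 and offset -1 (for example a load relative to the zero
register), the sentinel version reports nothing to print, while the flag
version still reports the address 0xffffffff_ffffffff.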
Thanks,
Tsukasa
> Unless there are other users who really want to change the behavior, or
> llvm are doing something different. However, the change of gp makes
> sense, so it looks to me.
>
> Thanks
> Nelson
>
> On Wed, Aug 24, 2022 at 9:28 AM Tsukasa OI via Binutils
> <binutils@sourceware.org <mailto:binutils@sourceware.org>> wrote:
>
> This patch makes it possible to print the highest address (0xffffffff
> on RV32, 0xffffffff_ffffffff on RV64). This is particularly useful if
> the highest address space is used for I/O registers and corresponding
> symbols are defined.
>
> opcodes/ChangeLog:
>
> * riscv-dis.c (struct riscv_private_data): Add `to_print_addr' and
> `has_gp' to enable printing the highest address.
> (maybe_print_address): Utilize `to_print_addr' and `has_gp'.
> (riscv_disassemble_insn): Likewise.
> ---
> opcodes/riscv-dis.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
> index 164fd209dbd..c6d80c3ba49 100644
> --- a/opcodes/riscv-dis.c
> +++ b/opcodes/riscv-dis.c
> @@ -52,6 +52,8 @@ struct riscv_private_data
> bfd_vma gp;
> bfd_vma print_addr;
> bfd_vma hi_addr[OP_MASK_RD + 1];
> + bool to_print_addr;
> + bool has_gp;
> };
>
> /* Used for mapping symbols. */
> @@ -177,10 +179,13 @@ maybe_print_address (struct riscv_private_data *pd, int base_reg, int offset,
> pd->print_addr = (base_reg != 0 ? pd->hi_addr[base_reg] : 0)
> + offset;
> pd->hi_addr[base_reg] = -1;
> }
> - else if (base_reg == X_GP && pd->gp != (bfd_vma)-1)
> + else if (base_reg == X_GP && pd->has_gp)
> pd->print_addr = pd->gp + offset;
> else if (base_reg == X_TP || base_reg == 0)
> pd->print_addr = offset;
> + else
> + return;
> + pd->to_print_addr = true;
>
> /* Sign-extend a 32-bit value to a 64-bit value. */
> if (wide)
> @@ -595,14 +600,19 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
> int i;
>
> pd = info->private_data = xcalloc (1, sizeof (struct riscv_private_data));
> - pd->gp = -1;
> - pd->print_addr = -1;
> + pd->gp = 0;
> + pd->print_addr = 0;
> for (i = 0; i < (int)ARRAY_SIZE (pd->hi_addr); i++)
> pd->hi_addr[i] = -1;
> + pd->to_print_addr = false;
> + pd->has_gp = false;
>
> for (i = 0; i < info->symtab_size; i++)
> if (strcmp (bfd_asymbol_name (info->symtab[i]), RISCV_GP_SYMBOL) == 0)
> - pd->gp = bfd_asymbol_value (info->symtab[i]);
> + {
> + pd->gp = bfd_asymbol_value (info->symtab[i]);
> + pd->has_gp = true;
> + }
> }
> else
> pd = info->private_data;
> @@ -662,13 +672,13 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
> print_insn_args (op->args, word, memaddr, info);
>
> /* Try to disassemble multi-instruction addressing sequences. */
> - if (pd->print_addr != (bfd_vma)-1)
> + if (pd->to_print_addr)
> {
> info->target = pd->print_addr;
> (*info->fprintf_styled_func)
> (info->stream, dis_style_comment_start, " # ");
> (*info->print_address_func) (info->target, info);
> - pd->print_addr = -1;
> + pd->to_print_addr = false;
> }
>
> /* Finish filling out insn_info fields. */
> --
> 2.34.1
>