This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
RFC: Is it OK to disassemble instructions twice ?
- From: Nick Clifton <nickc at redhat dot com>
- To: binutils at sourceware dot org
- Date: Fri, 16 Aug 2019 09:19:41 +0100
- Subject: RFC: Is it OK to disassemble instructions twice ?
Hi Guys,
Attached below is a patch for PR24907, which is being triggered by a
known problem in the disassembler. Specifically when we are
disassembling an instruction we need to know if a relocation applies
to it. But in order to do that accurately we need to know the length
of the instruction.
The old code used a simple heuristic that assumed that the instruction
would be the same length as a previous one, but with variable width
architectures this is not always the case. The change I am proposing
in this patch is to disassemble the instruction twice. The first time
a dummy print function is supplied, and all that really matters is
that the disassembler returns the number of octets consumed by the
instruction. This then provides an accurate length for the
relocation tests. The second time the instruction is disassembled for
real.
So my question to you guys is - do you think that it will be a problem
if we effectively slow down disassembly by a factor of two, at least
for binaries that have relocations to be applied ?
Cheers
Nick
binutils/ChangeLog
2019-08-15 Nick Clifton <nickc@redhat.com>
PR 24907
* objdump.c (disassemble_bytes): Use the disassembler function
with a dummy printer to determine the length of the instruction
that it about to be displayed. Use this accurate length
information to determine if the function has a reloc associated
with it.
gas/ChangeLog
2019-08-15 Nick Clifton <nickc@redhat.com>
PR 24907
* testsuite/gas/arm/pr24907.s: New test.
* testsuite/gas/arm/pr24907.d: New test driver.
diff --git a/binutils/objdump.c b/binutils/objdump.c
index fffbcf876d..9543c49149 100644
--- a/binutils/objdump.c
+++ b/binutils/objdump.c
@@ -1830,6 +1830,12 @@ objdump_sprintf (SFILE *f, const char *format, ...)
#define DEFAULT_SKIP_ZEROES_AT_END 3
+static int
+null_print (const void * stream ATTRIBUTE_UNUSED, const char * format ATTRIBUTE_UNUSED, ...)
+{
+ return 1;
+}
+
/* Disassemble some data in memory between given values. */
static void
@@ -1897,10 +1903,7 @@ disassemble_bytes (struct disassemble_info * inf,
{
bfd_vma z;
bfd_boolean need_nl = FALSE;
- int previous_octets;
- /* Remember the length of the previous instruction. */
- previous_octets = octets;
octets = 0;
/* Make sure we don't use relocs from previous instructions. */
@@ -1984,26 +1987,46 @@ disassemble_bytes (struct disassemble_info * inf,
&& *relppp < relppend)
{
bfd_signed_vma distance_to_rel;
+ int insn_size = 0;
distance_to_rel = (**relppp)->address
- (rel_offset + addr_offset);
+ if (distance_to_rel > 0)
+ {
+ if (insn_width)
+ insn_size = insn_width;
+ else
+ {
+ /* FIXME: We want to determine if the reloc that we
+ have just located applies to the instruction that
+ we are about to disassemble. In order to do this
+ we need to determine the instruction's length
+ since the reloc might apply part way through the
+ insn. Since on many architectures instruction
+ lengths can vary, we need to make this test on a
+ per instruction basis.
+
+ We find the length by calling the dissassembler
+ function with a dummy print handler. This should
+ work unless the disassembler is not expecting to
+ be called multiple times for the same address.
+ A better solution would be to have a separate
+ target specific function which just computes the
+ length of an instruction at a given address and
+ returns that. */
+ inf->fprintf_func = (fprintf_ftype) null_print;
+ insn_size = disassemble_fn (section->vma
+ + addr_offset, inf);
+ inf->fprintf_func = (fprintf_ftype) objdump_sprintf;
+ }
+ }
+
/* Check to see if the current reloc is associated with
the instruction that we are about to disassemble. */
if (distance_to_rel == 0
- /* FIXME: This is wrong. We are trying to catch
- relocs that are addressed part way through the
- current instruction, as might happen with a packed
- VLIW instruction. Unfortunately we do not know the
- length of the current instruction since we have not
- disassembled it yet. Instead we take a guess based
- upon the length of the previous instruction. The
- proper solution is to have a new target-specific
- disassembler function which just returns the length
- of an instruction at a given address without trying
- to display its disassembly. */
|| (distance_to_rel > 0
- && distance_to_rel < (bfd_signed_vma) (previous_octets/ opb)))
+ && distance_to_rel < (bfd_signed_vma) (insn_size / opb)))
{
inf->flags |= INSN_HAS_RELOC;
aux->reloc = **relppp;
--- /dev/null 2019-08-15 09:18:53.599929903 +0100
+++ gas/testsuite/gas/arm/pr24907.s 2019-08-15 17:19:51.754123014 +0100
@@ -0,0 +1,16 @@
+ .syntax unified
+ .text
+ .thumb
+
+.global foo
+foo:
+ nop
+ bl log_func
+ b.n .L1
+ bl func
+
+.global func
+func:
+ nop
+.L1:
+ nop
--- /dev/null 2019-08-15 09:18:53.599929903 +0100
+++ gas/testsuite/gas/arm/pr24907.d 2019-08-15 17:24:13.543341674 +0100
@@ -0,0 +1,19 @@
+# name: Disassembling variable width insns with relocs (PR 24907)
+# as:
+# objdump: -d
+# This test is only valid on ELF based ports.
+#notarget: *-*-pe *-*-wince *-*-vxworks
+
+.*: +file format .*arm.*
+
+Disassembly of section \.text:
+
+0+000 <foo>:
+ 0: 46c0 nop ; .*
+ 2: f7ff fffe bl 0 <log_func>
+ 6: e002 b\.n e <func\+0x2>
+ 8: f7ff fffe bl c <func>
+
+0+000c <func>:
+ c: 46c0 nop ; .*
+ e: 46c0 nop ; .*