This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v4] Fix alignment of disassemble /r


On 04/11/2016 09:16 AM, Leonardo Boquillon wrote:
When disassembling in raw mode (/r) in a variable-length insn architecture (i.e. x86),
the output can be messed since no alignment takes place.

The first version of this was sent on this mail thread
https://sourceware.org/ml/gdb-patches/2014-04/msg00226.html and this patch is
the same placed previously here https://sourceware.org/bugzilla/show_bug.cgi?id=19768

This patch performs the two passes: the first is at get_insn_set_longest_opcode
at line 136 in the patch, and the second is the while loop that follows like was
agreed here https://sourceware.org/ml/gdb-patches/2014-04/msg00427.html

In this version have been added the capability to compute longest opcode for
btrace.

If this is ok for commit I have a company-wide copyright access, and a coworker
of mine has write access.

gdb/ChangeLog
2016-04-11  Leonardo Boquillon  <leonardo.boquillon@tallertechnologies.com>

	* disasm.c (gdb_pretty_print_insn): Refactored to use longest opcode.
	(dump_insns): Add calls to calculate longest opcode, then pass it to the
	print function.
	(dis_get_longest_opcode): New function.
	* disasm.h (gdb_pretty_print_insn): Refactored to use longest opcode.
	* record-btrace.c (btrace_get_longest_opcode): New function.
	(btrace_insn_history): Add calls to calculate longest opcode,
	then pass it to the print function.

gdb/testsuite/ChangeLog
2016-04-11  Leonardo Boquillon  <leonardo.boquillon@tallertechnologies.com>

	* disas_raw.exp: New file.
	* instruction_history.exp: Added test for disass /r.
	* main.S: New file.

---
  gdb/disasm.c                                     | 32 +++++++++++++++++--
  gdb/disasm.h                                     |  2 +-
  gdb/record-btrace.c                              | 27 +++++++++++++++-
  gdb/testsuite/gdb.btrace/instruction_history.exp | 12 ++++++++
  gdb/testsuite/gdb.disasm/disas_raw.exp           | 39 ++++++++++++++++++++++++
  gdb/testsuite/gdb.disasm/main.S                  | 10 ++++++
  6 files changed, 117 insertions(+), 5 deletions(-)
  create mode 100644 gdb/testsuite/gdb.disasm/disas_raw.exp
  create mode 100644 gdb/testsuite/gdb.disasm/main.S

diff --git a/gdb/disasm.c b/gdb/disasm.c
index 1cf0901..32324fc 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -169,19 +169,38 @@ compare_lines (const void *mle1p, const void *mle2p)
    return val;
  }

+static int
+dis_get_longest_opcode (struct gdbarch *gdbarch, CORE_ADDR addr, CORE_ADDR high,
+			int how_many)

The function should be documented.

+{
+  int longest_opcode = 0;
+  unsigned int insn_checked = 0;

Shouldn't longest_opcode be unsigned int since some of the other variables are also being forced to unsigned?

+
+  while (addr < high && (how_many < 0 || insn_checked++ < how_many))

Do we need to worry about wrong input here that would cause things to fail? Is addr always going to be less than high?

I'd prefer the insn_checked increment to be in the loop for readability.

+    {
+      const int current_length = gdb_insn_length(gdbarch, addr);

space before (. Multiple ocurrences of this.

Also, i don't quite see the need to use const here and in other places. It is such a short piece of code.

+      longest_opcode =
+	(current_length > longest_opcode) ? current_length : longest_opcode;
+      addr += current_length;
+    }
+
+  return longest_opcode;
+}
+
  /* See disasm.h.  */

  int
  gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
  		       struct disassemble_info * di,
  		       const struct disasm_insn *insn, int flags,
-		       struct ui_file *stb)
+		       struct ui_file *stb, int longest_opcode)

unsigned int longest_opcode? Multiple ocurrences of this.

  {
    /* parts of the symbolic representation of the address */
    int unmapped;
    int offset;
    int line;
    int size;
+  unsigned int max_print_space;
    struct cleanup *ui_out_chain;
    char *filename = NULL;
    char *name = NULL;
@@ -245,6 +264,7 @@ gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
        CORE_ADDR end_pc;
        bfd_byte data;
        int err;
+      unsigned int i;
        const char *spacer = "";

        /* Build the opcodes using a temporary stream so we can
@@ -267,7 +287,10 @@ gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
  	}

        ui_out_field_stream (uiout, "opcodes", opcode_stream);
-      ui_out_text (uiout, "\t");
+      gdb_assert(longest_opcode >= size);

This doesn't look like a fatal failure. I think we should just fallback to setting longest_opcode = size if this happens.

+      max_print_space = 3u * (1u + longest_opcode - size);
+      for (i = 0; i < max_print_space; i++)
+	ui_out_text(uiout, " ");

        do_cleanups (cleanups);
      }
@@ -291,15 +314,18 @@ dump_insns (struct gdbarch *gdbarch, struct ui_out *uiout,
  {
    struct disasm_insn insn;
    int num_displayed = 0;
+  int longest_opcode;

    memset (&insn, 0, sizeof (insn));
    insn.addr = low;

+  longest_opcode = dis_get_longest_opcode (gdbarch, low, high, how_many);
    while (insn.addr < high && (how_many < 0 || num_displayed < how_many))
      {
        int size;

-      size = gdb_pretty_print_insn (gdbarch, uiout, di, &insn, flags, stb);
+      size = gdb_pretty_print_insn (gdbarch, uiout, di, &insn, flags, stb,
+				    longest_opcode);
        if (size <= 0)
  	break;

diff --git a/gdb/disasm.h b/gdb/disasm.h
index a2b72b9..21cdd17 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -53,7 +53,7 @@ struct disasm_insn
  extern int gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
  				  struct disassemble_info * di,
  				  const struct disasm_insn *insn, int flags,
-				  struct ui_file *stb);
+				  struct ui_file *stb, int longest_opcode);

  /* Return a filled in disassemble_info object for use by gdb.  */

diff --git a/gdb/record-btrace.c b/gdb/record-btrace.c
index 77b5180..cb87aac 100644
--- a/gdb/record-btrace.c
+++ b/gdb/record-btrace.c
@@ -660,6 +660,28 @@ btrace_print_lines (struct btrace_line_range lines, struct ui_out *uiout,
      }
  }

+static int
+btrace_get_longest_opcode (struct gdbarch *gdbarch,
+                           const struct btrace_insn_iterator *begin,
+                           const struct btrace_insn_iterator *end)
+{
+  int longest_opcode = 0;
+  struct btrace_insn_iterator it;
+
+  for (it = *begin; btrace_insn_cmp (&it, end) != 0; btrace_insn_next (&it, 1))
+    {
+      const struct btrace_insn *insn = btrace_insn_get (&it);
+
+      if (insn != NULL)
+	{
+	  const int current_length = gdb_insn_length (gdbarch, insn->pc);
+	  longest_opcode =
+	    (current_length > longest_opcode) ? current_length : longest_opcode;
+	}
+    }
+
+  return longest_opcode;
+}
  /* Disassemble a section of the recorded instruction trace.  */

  static void
@@ -674,6 +696,7 @@ btrace_insn_history (struct ui_out *uiout,
    struct gdbarch *gdbarch;
    struct btrace_insn_iterator it;
    struct btrace_line_range last_lines;
+  int longest_opcode;

    DEBUG ("itrace (0x%x): [%u; %u)", flags, btrace_insn_number (begin),
  	 btrace_insn_number (end));
@@ -692,6 +715,7 @@ btrace_insn_history (struct ui_out *uiout,
       instructions corresponding to that line.  */
    ui_item_chain = NULL;

+  longest_opcode = btrace_get_longest_opcode(gdbarch, begin, end);
    for (it = *begin; btrace_insn_cmp (&it, end) != 0; btrace_insn_next (&it, 1))
      {
        const struct btrace_insn *insn;
@@ -745,7 +769,8 @@ btrace_insn_history (struct ui_out *uiout,
  	  if ((insn->flags & BTRACE_INSN_FLAG_SPECULATIVE) != 0)
  	    dinsn.is_speculative = 1;

-	  gdb_pretty_print_insn (gdbarch, uiout, &di, &dinsn, flags, stb);
+	  gdb_pretty_print_insn (gdbarch, uiout, &di, &dinsn, flags, stb,
+				 longest_opcode);
  	}
      }

diff --git a/gdb/testsuite/gdb.btrace/instruction_history.exp b/gdb/testsuite/gdb.btrace/instruction_history.exp
index 58ae770..7de35d8 100644
--- a/gdb/testsuite/gdb.btrace/instruction_history.exp
+++ b/gdb/testsuite/gdb.btrace/instruction_history.exp
@@ -97,6 +97,18 @@ gdb_test "record instruction-history /pf 3,7" [multi_line \
    "7\t   0x\[0-9a-f\]+ <\\+\[0-9\]+>:\tje     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r" \
    ]

+# We don't know the exact encoding that is used but we know that the raw
+# opcode bytes should be padded so they should all be the same length.
+#
+# To simplify things let's assume that the longest opcode is 3B.
+gdb_test "record instruction-history /r 3,7" [multi_line \
+  "3\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   je     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>" \
+  "4\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   dec    %eax" \
+  "5\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   jmp    0x\[0-9a-f\]+ <loop\\+\[0-9\]+>" \
+  "6\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   cmp    \\\$0x0,%eax" \
+  "7\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   je     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r"
+  ]
+
  gdb_test "record instruction-history 3,3" "3\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\tje     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r"

  # the following tests are checking the iterators
diff --git a/gdb/testsuite/gdb.disasm/disas_raw.exp b/gdb/testsuite/gdb.disasm/disas_raw.exp
new file mode 100644
index 0000000..a233bad
--- /dev/null
+++ b/gdb/testsuite/gdb.disasm/disas_raw.exp
@@ -0,0 +1,39 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2016 Free Software Foundation, Inc.
+#
+# Contributed by Taller Technologies <leonardo.boquillon@tallertechnologies.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test alignment of raw disassemble dump (Bugzilla #19768).
+
+if { ![istarget "x86_64-*-*"] } { return -1 }
+
+set asm_file "main.S"
+
+standard_testfile ${asm_file}
+
+if [prepare_for_testing ${testfile}.exp ${testfile} ${asm_file} {}] {
+    return -1
+}
+
+set asm_line ".*55                        push   %rbp"
+append asm_line ".*48 89 e5                  mov    %rsp,%rbp"
+append asm_line ".*b8 00 00 00 00            mov    \\\$0x0,%eax"
+append asm_line ".*5d                        pop    %rbp"
+append asm_line ".*c3                        retq   "
+append asm_line ".*0f 1f 84 00 00 00 00 00   nopl   0x0\\(%rax,%rax,1\\).*"
+
+gdb_test "disass /r main" ${asm_line} "disass /r main"
\ No newline at end of file
diff --git a/gdb/testsuite/gdb.disasm/main.S b/gdb/testsuite/gdb.disasm/main.S
new file mode 100644
index 0000000..0178938
--- /dev/null
+++ b/gdb/testsuite/gdb.disasm/main.S
@@ -0,0 +1,10 @@
+	.file	"main.c"
+	.text
+	.globl	main
+	.type	main, @function
+main:
+	pushq	%rbp
+	movq	%rsp, %rbp
+	movl	$0, %eax
+	popq	%rbp
+	ret


It is always nice to have a testcase, but in this particular case i don't see the benefit. It will only be exercised for a particular architecture, even though it is a general change, and it may fail if we try to exercise things under a different host. For example, Windows.

Should we agree on a display format and declare that the correct output mechanism?

Also, should we only adjust the output and calculate the longest opcode for architectures that have variable length instructions? Architectures with fixed length instructions will not benefit from the additional calculations, right?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]