[RFC - Python Scripting] New method gdb.Architecture.disassemble

Siva Chandra sivachandra@google.com
Thu Feb 14 22:46:00 GMT 2013


I now have a patch (find it attached) which has been tweaked based on
comments from Doug.

Thanks Doug, for another round of detailed feedback.

2013-02-13  Siva Chandra Reddy  <sivachandra@google.com>

        Add a new method 'disassemble' to gdb.Architecture class.
        * python/py-arch.c (archpy_disassmble): Implementation of the
        new method gdb.Architecture.disassemble.
        (arch_object_methods): Add entry for the new method.

doc/

        * gdb.texinfo (Architectures In Python): Add description about
        the new method gdb.Architecture.disassemble.

testsuite/

        * gdb.python/py-arch.c: New test case
        * gdb.python/py-arch.exp: New tests to test
        gdb.Architecture.disassemble
        * gdb.python/Makefile.in: Add py-arch to the list of
        EXECUTABLES.

On Wed, Feb 13, 2013 at 12:42 PM, Doug Evans <dje@google.com> wrote:
> Reading beyond high seems counterintuitive, but I can understand why it's easiest if it works that way.

I made it that way because I used the way 'disassemble' command works
as my guideline for this Python disassemble feature. However, I made
Python API to disassemble instructions which have a start address in
[start, end], while the disassemble command does the same in [start,
end).

> Also, I can imagine a user wanting to specify a count instead of high.
> How about supporting both "high" and "count", with the default being high is unspecified and count=1?

I have now added a count argument.

> Including the function name and offset in the output feels wrong.
> It's trying to make the API call do too much.

Removed in the attached patch.

As I have mentioned above, I used the 'functionality' of the
'disassemble' command as a guide. Hence, I had put in whatever it has.
For my needs, all I want are 'addr' and 'asm', and the rest were only
good to have.

> Plus, the documentation needs to specify what "flavor" is used (e.g. intel vs att).
> I'd just add a line of texzt saying the flavor (or choose a better word) is the one specified by the CLI variable disassembly-flavor,
> and leave for another day adding an input parameter to specify the flavor.

Added a note to the docs.

I am not very sure if we should (in future) add the 'flavor' as an
argument to the disassemble method. I prefer if it were a global
function in the 'gdb' module. Again, this is my opinion as I
personally do not see why one would want to have assembly code from
different architectures in different flavors.

> Why call build_address_symbol here?
> To me it's trying to make this API call do too much.

Removed now.

> btw, what about filename,line isn't useful?

It is not that they are not useful, but that build_address_symbol does
not return useful values. I have verified this with experiments but
have not digged into the code to figure out why. There is a similar
not in disasm.c:dump_insns.

> So I can imagine 3 API routines here:
> - disassemble (pc)
> - get_function_and_offset (pc) [or some such]
> - get_file_and_line (pc) [assuming gdbpy_find_pc_line doesn't always do what one wants]

I agree. I will add the last two to my TODO list.

Thanks,
Siva Chandra
-------------- next part --------------
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index e3f336e..cbb64c4 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -26040,6 +26040,38 @@ A @code{gdb.Architecture} class has the following methods:
 Return the name (string value) of the architecture.
 @end defun
 
+@defun Architecture.disassemble (@var{start_pc} @r{[}, @var{end_pc} @r{[}, @var{count}@r{]]})
+Return a list of utmost @var{count} number of disassembled instructions
+whose start address falls in the closed memory address interval from
+@var{start_pc} to @var{end_pc}.  If @var{end_pc} is not specified, but
+@var{count} is specified, then @var{count} number of instructions
+starting from the address @var{start_pc} are returned.  If @var{count}
+is not specified but @var{end_pc} is specified, then all instructions
+whose start address falls in the closed memory address interval from
+@var{start_pc} to @var{end_pc} are returned.  If neither @var{end_pc}
+nor @var{count} are specified, then a single instruction at
+@var{start_pc} is returned.  For all of these cases, the elements of the
+returned list are a Python @code{dict} with the following string keys:
+
+@table @code
+
+@item addr
+The value corresponding to this key is a Python long integer capturing
+the memory address of the instruction.
+
+@item asm
+The value corresponding to this key is a string value which represents
+the instruction with assembly language mnemonics.  The assembly
+language flavor used is the same as that specified by the current CLI
+variable @code{disassembly-flavor}.  @xref{Machine Code}.
+
+@item length
+The value correspoding to this key is the length (integer value) of the
+instruction in bytes.
+
+@end table
+@end defun
+
 @node Python Auto-loading
 @subsection Python Auto-loading
 @cindex Python auto-loading
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index edd508f..7f3e85c 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -20,6 +20,7 @@
 #include "defs.h"
 #include "gdbarch.h"
 #include "arch-utils.h"
+#include "disasm.h"
 #include "python-internal.h"
 
 typedef struct arch_object_type_object {
@@ -86,6 +87,131 @@ archpy_name (PyObject *self, PyObject *args)
   return py_name;
 }
 
+/* Implementation of
+   gdb.Architecture.disassemble (self, start_pc [, end_pc [,count]]) -> List.
+   Returns a list of instructions in a memory address range.  Each instruction
+   in the list is a Python dict object.
+*/
+
+static PyObject *
+archpy_disassemble (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static char *keywords[] = { "start_pc", "end_pc", "count", NULL };
+  CORE_ADDR start, end = 0;
+  CORE_ADDR pc;
+  long count = 0, i = 1;
+  PyObject *result_list, *end_obj = NULL, *count_obj = NULL;
+  struct gdbarch *gdbarch = arch_object_to_gdbarch (self);
+
+  if (!PyArg_ParseTupleAndKeywords (args, kw, GDB_PY_LLU_ARG "|OO", keywords,
+                                    &start, &end_obj, &count_obj))
+    return NULL;
+
+  if (end_obj)
+    {
+      if (PyObject_TypeCheck (end_obj, &PyInt_Type))
+        /* If the end_pc value is specified without a trailing 'L', end_obj will
+           an integer and not a long integer.  */
+        end = PyInt_AsLong (end_obj);
+      else if (PyObject_TypeCheck (end_obj, &PyLong_Type))
+        end = PyLong_AsUnsignedLongLong (end_obj);
+      else
+        {
+          Py_DECREF (end_obj);
+          Py_XDECREF (count_obj);
+          PyErr_SetString (PyExc_TypeError,
+                           _("Argument 'end_pc' should be a (long) integer."));
+
+          return NULL;
+        }
+    }
+  if (count_obj)
+    {
+      count = PyInt_AsLong (count_obj);
+      if (PyErr_Occurred ())
+        {
+          Py_DECREF (count_obj);
+          Py_XDECREF (end_obj);
+          PyErr_SetString (PyExc_TypeError,
+                           _("Argument 'count' should be an integer."));
+
+          return NULL;
+        }
+    }
+
+  result_list = PyList_New (0);
+  if (result_list == NULL)
+    return NULL;
+
+  for (pc = start;
+       /* All args are specified.  */
+       (end_obj && count_obj && pc <= end && i <= count)
+       /* end_pc is specified, but no count.  */
+       || (end_obj && count_obj == NULL && pc <= end)
+       /* end_pc is not specified, but a count is.  */
+       || (end_obj == NULL && count_obj && i <= count)
+       /* Both end_pc and count are not specified.  */
+       || (end_obj == NULL && count_obj == NULL && pc <= start);)
+    {
+      int insn_len = 0;
+      char *as = NULL;
+      struct ui_file *memfile = mem_fileopen ();
+      PyObject *insn_dict = PyDict_New ();
+      volatile struct gdb_exception except;
+
+      if (insn_dict == NULL)
+        {
+          Py_DECREF (result_list);
+          ui_file_delete (memfile);
+
+          return NULL;
+        }
+      if (PyList_Append (result_list, insn_dict))
+        {
+          Py_DECREF (result_list);
+          Py_DECREF (insn_dict);
+          ui_file_delete (memfile);
+
+          return NULL;  /* PyList_Append Sets the exception.  */
+        }
+
+      TRY_CATCH (except, RETURN_MASK_ALL)
+        {
+          insn_len = gdb_print_insn (gdbarch, pc, memfile, NULL);
+        }
+      if (except.reason < 0)
+        {
+          Py_DECREF (result_list);
+          ui_file_delete (memfile);
+
+          return gdbpy_convert_exception (except);
+        }
+
+      as = ui_file_xstrdup (memfile, NULL);
+      if (PyDict_SetItemString (insn_dict, "addr",
+                                gdb_py_long_from_ulongest (pc))
+          || PyDict_SetItemString (insn_dict, "asm",
+                                   PyString_FromString (as ? as : "<unknown>"))
+          || PyDict_SetItemString (insn_dict, "length",
+                                   PyInt_FromLong (insn_len)))
+        {
+          Py_DECREF (result_list);
+
+          ui_file_delete (memfile);
+          xfree (as);
+
+          return NULL;
+        }
+
+      pc += insn_len;
+      i++;
+      ui_file_delete (memfile);
+      xfree (as);
+    }
+
+  return result_list;
+}
+
 /* Initializes the Architecture class in the gdb module.  */
 
 void
@@ -105,6 +231,10 @@ static PyMethodDef arch_object_methods [] = {
   { "name", archpy_name, METH_NOARGS,
     "name () -> String.\n\
 Return the name of the architecture as a string value." },
+  { "disassemble", (PyCFunction) archpy_disassemble,
+    METH_VARARGS | METH_KEYWORDS,
+    "disassemble (low, high) -> List.\n\
+Return the list of disassembled instructions from LOW to HIGH." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/Makefile.in b/gdb/testsuite/gdb.python/Makefile.in
index 4e286b5..0b81507 100644
--- a/gdb/testsuite/gdb.python/Makefile.in
+++ b/gdb/testsuite/gdb.python/Makefile.in
@@ -6,7 +6,7 @@ EXECUTABLES = py-type py-value py-prettyprint py-template py-block \
 	py-shared python lib-types py-events py-evthreads py-frame \
 	py-mi py-pp-maint py-progspace py-section-script py-objfile \
 	py-finish-breakpoint py-finish-breakpoint2 py-value-cc py-explore \
-	py-explore-cc
+	py-explore-cc py-arch
 
 MISCELLANEOUS = py-shared-sl.sl py-events-shlib.so py-events-shlib-nodebug.so 
 
diff --git a/gdb/testsuite/gdb.python/py-arch.c b/gdb/testsuite/gdb.python/py-arch.c
new file mode 100644
index 0000000..e2fe55c
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-arch.c
@@ -0,0 +1,23 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see  <http://www.gnu.org/licenses/>.
+*/
+
+int
+main (void)
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
new file mode 100644
index 0000000..4e736b8
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -0,0 +1,54 @@
+# Copyright 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing ${testfile}.exp ${testfile} ${srcfile}] } {
+    return -1
+}
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+if ![runto_main] {
+   return -1
+}
+
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "get frame" 0
+gdb_py_test_silent_cmd "python arch = frame.architecture()" "get arch" 0
+gdb_py_test_silent_cmd "python pc = frame.pc()" "get pc" 0
+gdb_py_test_silent_cmd "python insn_list1 = arch.disassemble(pc, pc, 1)" \
+  "disassemble" 0
+gdb_py_test_silent_cmd "python insn_list2 = arch.disassemble(pc, pc)" \
+  "disassemble no count" 0
+gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
+  "disassemble no end" 0
+gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(pc)" \
+  "disassemble no end no count" 0
+
+gdb_test "python print len(insn_list1)" "1" "test number of instructions 1"
+gdb_test "python print len(insn_list2)" "1" "test number of instructions 2"
+gdb_test "python print len(insn_list3)" "1" "test number of instructions 3"
+gdb_test "python print len(insn_list4)" "1" "test number of instructions 4"
+
+gdb_py_test_silent_cmd "python insn = insn_list1\[0\]" "get instruction" 0
+
+gdb_test "python print \"addr\" in insn" "True" "test key addr"
+gdb_test "python print \"asm\" in insn" "True" "test key asm"
+gdb_test "python print \"length\" in insn" "True" "test key length"
+
+# Negative test
+gdb_test "python arch.disassemble(0, 0)" ".*gdb\.MemoryError.*" \
+  "test exception"


More information about the Gdb-patches mailing list