[python][patch] Python rbreak

Phil Muldoon pmuldoon@redhat.com
Wed Oct 11 11:30:00 GMT 2017


This introduces a Python rbreak function to the Python API. Its
functionality was designed to closely match that of the console rbreak
but with a number of caveats.

That being said, there's a bit of preamble first.

My original design was to have the gdb.Breakpoint constructor take a
regex pattern. While breakpoints can, and do, have multiple locations
they are always tied in with one linespec. I looked at the
linespec/location code and the relationship is pretty firmly embedded
as a one to many relationship. One linespec can have multiple
locations but only one linespec can be represented in the location
object. (Or address or probe or whatever the location is
representing). A linespec has to be definitive; it has to point to one
symbol. I thought about modifying this to have multiple symbols and
multiple locations but quickly decided this was really going to
violate the design contract of linespecs/location. It would also be
considerable and disruptive work. I looked then, again, at a
breakpoint having multiple linespecs but again decided the amount of
disruption this would generate was not worth a Python rbreak returning
one breakpoint representing multiple symbols and multiple locations.

So, instead, I've made the Python rbreak functionality a little more
tuneable than the console command equivalent. The first tuneable is to
allow the user to exclude mini symbols from the pattern matching. A
loosely written pattern passed to the console rbreak command can
generate 5000+ breakpoints from a simple C "hello world"
program. rbreak "" will place a breakpoint on every single mini symbol
representing a function. I can't possibly see why this would be
desirable. In order to facilitate broadly-written patterns that seek to
place a breakpoint on every user-defined function, the equivalent
would be:

gdb.rbreak ("", minisyms=False)

That would place a breakpoint on all functions that are actually
defined in the inferior (and not those that are inserted by the
compiler, linker, etc). The default for this keyword is False.

The second tuneable is a throttle. Beyond the name (which I am unsure
about but could not think of a better one), this allows the user to
enter a fail-safe limit for breakpoint creation. So, for the following
example, an inferior with ten user provided functions:

gdb.rbreak ("", minisyms=False, throttle=5)

Would set no breakpoints and raise a runtime error. This functionality
is more aimed at those who want to utilise Python rbreak from a
toolset perspective (say for a GDB UI). The default for this keyword
is unlimited.

The last tuneable is the ability to limit the pattern matching to a
set of gdb.Symtab objects. So, given the example below:

gdb.rbreak ("", minisyms=False, throttle=5, symtabs=symTuple)

where "symTuple" is a tuple of gdb.Symtab objects, the search/pattern
matching would be restricted to those gdb.Symtab objects only. The
default for this keyword (i.e, no keyword provided) is to search all
symbol tables.

All of the examples above have "" as the search pattern but you can,
of course, specify any valid GDB regex there.

On a review note I seek a little advice on the interface to
search_symbols. If the user does provide a symtabs keyword, the code
collects the name of the file attached to each symtab specified. It
does that through the usual Python method
(python_string_to_target_string). That returns a
gdb::unique_xmalloc_ptr<char>. I planned to store these in a
std::vector, however, I was unable to coerce a
std::vector<gdb::unique_xmalloc_ptr<char>> into a char **p[]
representation the search_symbols function requires. I ended up
releasing the object (that results in a char *) and wrapping the
std::vector in a simple type that provides a destructor to properly
free the released strings when the object falls out of scope. I
thought about just changing the search_symbols interface to accept a
vector of (gdb) unique pointers but noted a C++ run had already been
performed over the function. Any thoughts here would be appreciated.

Long, windy, preamble over the patch follows at the end of the email.

Cheers

Phil

--

2017-10-11  Phil Muldoon  <pmuldoon@redhat.com>

	* python/python.c (gdbpy_rbreak): New function.

2017-10-11  Phil Muldoon  <pmuldoon@redhat.com>

	* gdb.python/py-rbreak.exp: New file.
	* gdb.python/py-rbreak.c: New file.
	* gdb.python/py-rbreak-func2.c: New file.

2017-10-11  Phil Muldoon  <pmuldoon@redhat.com>

	* python.texi (Basic Python): Add rbreak documentation.

--

diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index f661e489bb..b2d4516886 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -243,6 +243,24 @@ were no breakpoints.  This peculiarity was subsequently fixed, and now
 @code{gdb.breakpoints} returns an empty sequence in this case.
 @end defun
 
+@findex gdb.rbreak
+@defun gdb.rbreak (regex @r{[}, minisyms @r{[}, throttle, @r{[}, symtabs @r{]]]})
+Return a Python tuple holding a collection of newly set
+@code{gdb.Breakpoint} objects matching function names defined by the
+@var{regex} pattern.  If the @var{minisyms} keyword is @code{True},
+all system functions (those not explicitly defined in the inferior)
+will also be included in the match.  The @var{throttle} keyword takes
+an integer that defines the maximum number of pattern matches for
+functions matched by the @var{regex} pattern.  If the number of
+matches exceeds the integer value of @var{throttle}, a
+@code{RuntimeError} will be raised and no breakpoints will be created.
+If @var{throttle} is not defined then there is no imposed limit on the
+maximum number of matches and breakpoints to be created.  The
+@var{symtabs} keyword takes a Python iterable that yields a collection
+of @code{gdb.Symtab} objects and will restrict the search to those
+functions only contained within the @code{gdb.Symtab} objects.
+@end defun
+
 @findex gdb.parameter
 @defun gdb.parameter (parameter)
 Return the value of a @value{GDBN} @var{parameter} given by its name,
diff --git a/gdb/python/python.c b/gdb/python/python.c
index b04057ec4a..4d591df30c 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -642,6 +642,191 @@ gdbpy_solib_name (PyObject *self, PyObject *args)
   return str_obj;
 }
 
+/* Implementation of Python rbreak command.  Take a REGEX and
+   optionally a MINISYMS, THROTTLE and SYMTABS keyword and return a
+   Python tuple that contains newly set breakpoints that match that
+   criteria.  REGEX refers to a GDB format standard regex pattern of
+   symbols names to search; MINISYMS is an optional boolean (default
+   False) that indicates if the function should search GDB's minimal
+   symbols; THROTTLE is an optional integer (default unlimited) that
+   indicates the maximum amount of breakpoints allowable before the
+   function exits (note, if the throttle bound is passed, no
+   breakpoints will be set and a runtime error returned); SYMTABS is
+   an optional iterator that contains a set of gdb.Symtabs to
+   constrain the search within.  */
+
+static PyObject *
+gdbpy_rbreak (PyObject *self, PyObject *args, PyObject *kw)
+{
+  /* A simple type to ensure clean up of a vector of allocated strings
+     when a C interface demands a const char *array[] type
+     interface.  */
+  struct symtab_list_type
+  {
+    ~symtab_list_type ()
+    {
+      for (const char *elem: vec)
+	xfree ((void *) elem);
+    }
+    std::vector<const char *> vec;
+  };
+
+  char *regex = NULL;
+  std::vector<symbol_search> symbols;
+  unsigned long count = 0;
+  PyObject *symtab_list = NULL;
+  PyObject *minisyms_p_obj = NULL;
+  int minisyms_p = 0;
+  unsigned int throttle = 0;
+  static const char *keywords[] = {"regex","minisyms", "throttle", "symtabs", NULL};
+  symtab_list_type symtab_paths;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "s|O!IO", keywords,
+					&regex, &PyBool_Type,
+					&minisyms_p_obj, &throttle,
+					&symtab_list))
+    return NULL;
+
+  /* Parse minisyms keyword.  */
+  if (minisyms_p_obj != NULL)
+    {
+      int cmp = PyObject_IsTrue (minisyms_p_obj);
+      if (cmp < 0)
+	return NULL;
+      minisyms_p = cmp;
+    }
+
+  /* The "symtabs" keyword is any Python iterable object that returns
+     a gdb.Symtab on each iteration.  If specified, iterate through
+     the provided gdb.Symtabs and extract their full path.  As
+     python_string_to_target_string returns a
+     gdb::unique_xmalloc_ptr<char> and a vector containing these types
+     cannot be coerced to a const char **p[] via the vector.data call,
+     release the value from the unique_xmalloc_ptr and place it in a
+     simple type symtab_list_type (which holds the vector and a
+     destructor that frees the contents of the allocated strings.  */
+  if (symtab_list != NULL)
+    {
+      gdbpy_ref<> iter (PyObject_GetIter (symtab_list));
+
+      if (iter == NULL)
+	return NULL;
+
+      while (true)
+	{
+	  gdbpy_ref<> next (PyIter_Next (iter.get ()));
+
+	  if (next == NULL)
+	    {
+	      if (PyErr_Occurred ())
+		return NULL;
+	      break;
+	    }
+
+	  gdbpy_ref<> obj_name (PyObject_GetAttrString (next.get (),
+							"filename"));
+
+	  if (obj_name == NULL)
+	    return NULL;
+
+	  /* Is the object file still valid?  */
+	  if (obj_name == Py_None)
+	    continue;
+
+	  gdb::unique_xmalloc_ptr<char> filename =
+	    python_string_to_target_string (obj_name.get ());
+
+	  if (filename == NULL)
+	    return NULL;
+
+	  /* Make sure there is a definite place to store the value of
+	     s before it is released.  */
+	  symtab_paths.vec.push_back (nullptr);
+	  symtab_paths.vec.back () = filename.release ();
+	}
+    }
+
+  if (symtab_list)
+    {
+      const char **files = symtab_paths.vec.data ();
+
+      symbols = search_symbols (regex, FUNCTIONS_DOMAIN,
+				symtab_paths.vec.size (), files);
+    }
+  else
+    symbols = search_symbols (regex, FUNCTIONS_DOMAIN, 0, NULL);
+
+  /* Count the number of symbols (both symbols and optionally mini
+     symbols) so we can correctly size the tuple.  */
+  for (const symbol_search &p : symbols)
+    {
+      /* Mini symbols included?  */
+      if (minisyms_p)
+	{
+	  if (p.msymbol.minsym != NULL)
+	    count++;
+	}
+
+      if (p.symbol != NULL)
+	count++;
+    }
+
+  /* Check throttle bounds and exit if in excess.  */
+  if (throttle != 0 && count > throttle)
+    {
+      PyErr_SetString (PyExc_RuntimeError,
+		       _("Number of breakpoints exceeds throttled maximum."));
+      return NULL;
+    }
+
+  gdbpy_ref<> return_tuple (PyTuple_New (count));
+
+  if (return_tuple == NULL)
+    return NULL;
+
+  count = 0;
+
+  /* Construct full path names for symbols and call the Python
+     breakpoint constructor on the resulting names.  Be tolerant of
+     individual breakpoint failures.  */
+  for (const symbol_search &p : symbols)
+    {
+      std::string symbol_name;
+
+      /* Skipping mini-symbols?  */
+      if (minisyms_p == 0)
+	if (p.msymbol.minsym != NULL)
+	  continue;
+
+      if (p.msymbol.minsym == NULL)
+	{
+	  struct symtab *symtab = symbol_symtab (p.symbol);
+	  const char *fullname = symtab_to_fullname (symtab);
+
+	  symbol_name = fullname;
+	  symbol_name  += ":";
+	  symbol_name  += SYMBOL_LINKAGE_NAME (p.symbol);
+	}
+      else
+	symbol_name = MSYMBOL_LINKAGE_NAME (p.msymbol.minsym);
+
+      gdbpy_ref<> argList (Py_BuildValue("(s)", symbol_name.c_str ()));
+      gdbpy_ref<> obj (PyObject_CallObject ((PyObject *)
+					    &breakpoint_object_type,
+					    argList.get ()));
+
+      /* Tolerate individual breakpoint failures.  */
+      if (obj == NULL)
+	gdbpy_print_stack ();
+      else
+	{
+	  PyTuple_SET_ITEM (return_tuple.get (), count, obj.get ());
+	  count++;
+	}
+    }
+  return return_tuple.release ();
+}
+
 /* A Python function which is a wrapper for decode_line_1.  */
 
 static PyObject *
@@ -1912,7 +2097,9 @@ Return the name of the current target charset." },
   { "target_wide_charset", gdbpy_target_wide_charset, METH_NOARGS,
     "target_wide_charset () -> string.\n\
 Return the name of the current target wide charset." },
-
+  { "rbreak", (PyCFunction) gdbpy_rbreak, METH_VARARGS | METH_KEYWORDS,
+    "rbreak (Regex) -> Tuple.\n\
+Return a Tuple containing gdb.Breakpoint objects that match the given Regex." },
   { "string_to_argv", gdbpy_string_to_argv, METH_VARARGS,
     "string_to_argv (String) -> Array.\n\
 Parse String and return an argv-like array.\n\
diff --git a/gdb/testsuite/gdb.python/py-rbreak-func2.c b/gdb/testsuite/gdb.python/py-rbreak-func2.c
new file mode 100644
index 0000000000..1de8bfd5b2
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-rbreak-func2.c
@@ -0,0 +1,31 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int efunc1 ()
+{
+  return 1;
+}
+
+int efunc2 ()
+{
+  return 2;
+}
+
+int efunc3 ()
+{
+  return 3;
+}
diff --git a/gdb/testsuite/gdb.python/py-rbreak.c b/gdb/testsuite/gdb.python/py-rbreak.c
new file mode 100644
index 0000000000..287aba6c03
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-rbreak.c
@@ -0,0 +1,62 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2013-2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int func1 ()
+{
+  return 1;
+}
+
+int func2 ()
+{
+  return 2;
+}
+
+int func3 ()
+{
+  return 3;
+}
+
+int func4 ()
+{
+  return 4;
+}
+
+int func5 ()
+{
+  return 5;
+}
+
+void func6 ()
+{
+  return;
+}
+
+void outside_scope ()
+{
+  return;
+}
+
+int main()
+{
+  func1 (); /* Break func1.  */
+  func2 ();
+  func3 ();
+  func4 ();
+  func5 ();
+  func6 ();
+  outside_scope ();
+}
diff --git a/gdb/testsuite/gdb.python/py-rbreak.exp b/gdb/testsuite/gdb.python/py-rbreak.exp
new file mode 100644
index 0000000000..9385aeae08
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-rbreak.exp
@@ -0,0 +1,61 @@
+# Copyright (C) 2017 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It tests the mechanism
+# exposing values to Python.
+
+load_lib gdb-python.exp
+
+standard_testfile py-rbreak.c py-rbreak-func2.c
+
+if {[prepare_for_testing "failed to prepare" ${testfile} [list $srcfile $srcfile2]] } {
+    return 1
+}
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+if ![runto_main] then {
+    fail "can't run to main"
+    return 0
+}
+
+gdb_py_test_silent_cmd "py sl = gdb.rbreak(\"\",minisyms=False)" \
+    "Get all function breakpoints" 0
+gdb_test "py print(len(sl))" "11" \
+    "Check number of returned breakpoints is 11"
+gdb_py_test_silent_cmd "py sl = gdb.rbreak(\"main\.\*\",minisyms=False)" \
+    "Get main function breakpoint" 0
+gdb_test "py print(len(sl))" "1" \
+    "Check number of returned breakpoints is 1"
+gdb_py_test_silent_cmd "py sl = gdb.rbreak(\"func\.\*\",minisyms=False,throttle=10)" \
+    "Get functions matching func.*" 0
+gdb_test "py print(len(sl))" "9" \
+    "Check number of returned breakpoints is 9"
+gdb_test "py gdb.rbreak(\"func\.\*\",minisyms=False,throttle=1)" \
+    "Number of breakpoints exceeds throttled maximum.*" \
+    "Check throttle errors on too many breakpoints."
+gdb_py_test_silent_cmd "py sl = gdb.rbreak(\"func1\",minisyms=True)" \
+    "Including mini symbols, get functions matching func.*" 0
+gdb_test "py print(len(sl))" "2" \
+    "Check number of returned breakpoints is 2"
+gdb_py_test_silent_cmd "python sym = gdb.lookup_symbol(\"efunc1\")" \
+    "Find a symbol in objfile" 1
+gdb_py_test_silent_cmd "python symtab = sym\[0\].symtab" \
+    "Get backing symbol table" 1
+gdb_py_test_silent_cmd "py sl = gdb.rbreak(\"func\.\*\",minisyms=False,throttle=10,symtabs=\[symtab\])" \
+    "Get functions matching func.* in one symtab only." 0
+gdb_test "py print(len(sl))" "3" \
+    "Check number of returned breakpoints is 3"



More information about the Gdb-patches mailing list