[PATCH] [gdb/cli] Use debug info language to pick pygments lexer

Tom de Vries tdevries@suse.de
Mon Apr 7 09:19:29 GMT 2025


Consider the following scenario:
...
$ cat hello
#include <stdio.h>

int
main (void)
{
  printf ("hello\n");
  return 0;
}
$ gcc -x c hello -g
$ gdb -q -iex "maint set gnu-source-highlight enabled off" a.out
Reading symbols from a.out...
(gdb) start
Temporary breakpoint 1 at 0x4005db: file hello, line 6.
Starting program: /data/vries/gdb/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main () at hello:6
6	  printf ("hello\n");
...

This doesn't produce highlighting for line 6, because:
- pygments is used for highlighting instead of source-highlight, and
- pygments guesses the language for highlighting based only on the filename,
  which in this case has no extension and so gives no clue.

Fix this by:
- adding a language string to the extension_language_ops.colorize interface,
- passing the language as found in the debug info, and
- using it in gdb.styling.colorize to pick the pygments lexer.
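
As a rough illustration of the selection order (a standalone sketch, not the
actual gdb.styling code), assuming pygments is installed: pick_lexer is a
hypothetical helper name, and "no-such-language" is a made-up language string
used only to show the fallback to the filename.

```python
from pygments import lexers
from pygments.util import ClassNotFound


def pick_lexer(filename, lang):
    """Hypothetical helper: prefer the debug-info language, else the filename."""
    try:
        # First choice: the language string, e.g. "c" or "c++".
        return lexers.get_lexer_by_name(lang, stripnl=False)
    except ClassNotFound:
        pass
    try:
        # Fallback: guess from the filename, as was done before this patch.
        return lexers.get_lexer_for_filename(filename, stripnl=False)
    except ClassNotFound:
        # Neither lookup succeeded, so no highlighting is possible.
        return None


for filename, lang in [
    ("hello", "c"),                 # no extension, language known
    ("hello", "no-such-language"),  # no extension, language not recognized
    ("test.c", "c++"),              # extension says C, language says C++
]:
    lexer = pick_lexer(filename, lang)
    print(filename, lang, "->", lexer.name if lexer else "no lexer")
```

In this sketch the language string wins whenever pygments recognizes it, so a
misleading or missing extension only matters when the language lookup fails.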

The new test-case gdb.python/py-source-styling-2.exp exercises a slightly
different scenario: it compiles a C++ file with a .c extension, and checks
that C++ highlighting is done instead of C highlighting.
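
To illustrate why the lexer choice is observable in this test, the sketch
below (assuming pygments is installed; token_type_of is a hypothetical helper
name) compares how the C and C++ lexers classify the word "try".

```python
from pygments import lexers
from pygments.token import Keyword


def token_type_of(word, lang):
    """Hypothetical helper: return the token type LANG's lexer gives WORD."""
    lexer = lexers.get_lexer_by_name(lang)
    # Lex a line shaped like the one in the test source ("  try").
    for ttype, value in lexer.get_tokens("  " + word + "\n"):
        if value == word:
            return ttype
    return None


for lang in ("c", "c++"):
    ttype = token_type_of("try", lang)
    print(lang, ttype, "keyword" if ttype in Keyword else "not a keyword")
```

Only a keyword token gets keyword styling from the formatter, which is why the
test treats an unstyled "try" line as a failure.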

Tested on x86_64-linux.

PR cli/30966
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966
---
 gdb/extension-priv.h                          | 11 ++--
 gdb/extension.c                               |  5 +-
 gdb/extension.h                               | 10 ++--
 gdb/python/lib/gdb/styling.py                 | 10 +++-
 gdb/python/python.c                           | 14 ++++-
 gdb/source-cache.c                            |  3 +-
 .../gdb.python/py-source-styling-2.c          | 26 +++++++++
 .../gdb.python/py-source-styling-2.exp        | 55 +++++++++++++++++++
 8 files changed, 117 insertions(+), 17 deletions(-)
 create mode 100644 gdb/testsuite/gdb.python/py-source-styling-2.c
 create mode 100644 gdb/testsuite/gdb.python/py-source-styling-2.exp

diff --git a/gdb/extension-priv.h b/gdb/extension-priv.h
index a38f104d949..f7dd2a74999 100644
--- a/gdb/extension-priv.h
+++ b/gdb/extension-priv.h
@@ -262,12 +262,13 @@ struct extension_language_ops
      const char *method_name,
      std::vector<xmethod_worker_up> *dm_vec);
 
-  /* Colorize a source file.  NAME is the source file's name, and
-     CONTENTS is the contents of the file.  This should either return
-     colorized (using ANSI terminal escapes) version of the contents,
-     or an empty option.  */
+  /* Colorize a source file.  NAME is the source file's name, CONTENTS is the
+     contents of the file, and LANG may contain a string describing the
+     language.  This should either return a colorized (using ANSI terminal
+     escapes) version of the contents, or an empty option.  */
   std::optional<std::string> (*colorize) (const std::string &name,
-					  const std::string &contents);
+					  const std::string &contents,
+					  const std::string &lang);
 
   /* Colorize a single line of disassembler output, CONTENT.  This should
      either return colorized (using ANSI terminal escapes) version of the
diff --git a/gdb/extension.c b/gdb/extension.c
index b78ea4f2716..6e0e7520011 100644
--- a/gdb/extension.c
+++ b/gdb/extension.c
@@ -974,7 +974,8 @@ xmethod_worker::get_result_type (value *object, gdb::array_view<value *> args)
 /* See extension.h.  */
 
 std::optional<std::string>
-ext_lang_colorize (const std::string &filename, const std::string &contents)
+ext_lang_colorize (const std::string &filename, const std::string &contents,
+		   const std::string &lang)
 {
   std::optional<std::string> result;
 
@@ -983,7 +984,7 @@ ext_lang_colorize (const std::string &filename, const std::string &contents)
       if (extlang->ops == nullptr
 	  || extlang->ops->colorize == nullptr)
 	continue;
-      result = extlang->ops->colorize (filename, contents);
+      result = extlang->ops->colorize (filename, contents, lang);
       if (result.has_value ())
 	return result;
     }
diff --git a/gdb/extension.h b/gdb/extension.h
index 957642a99dc..c1d5f0fa805 100644
--- a/gdb/extension.h
+++ b/gdb/extension.h
@@ -325,12 +325,14 @@ extern void get_matching_xmethod_workers
    std::vector<xmethod_worker_up> *workers);
 
 /* Try to colorize some source code.  FILENAME is the name of the file
-   holding the code.  CONTENTS is the source code itself.  This will
-   either a colorized (using ANSI terminal escapes) version of the
-   source code, or an empty value if colorizing could not be done.  */
+   holding the code.  CONTENTS is the source code itself.  LANG may contain a
+   string describing the language.  This will return either a colorized
+   (using ANSI terminal escapes) version of the source code, or an empty
+   value if colorizing could not be done.  */
 
 extern std::optional<std::string> ext_lang_colorize
-  (const std::string &filename, const std::string &contents);
+  (const std::string &filename, const std::string &contents,
+   const std::string &lang);
 
 /* Try to colorize a single line of disassembler output, CONTENT for
    GDBARCH.  This will return either a colorized (using ANSI terminal
diff --git a/gdb/python/lib/gdb/styling.py b/gdb/python/lib/gdb/styling.py
index 1c5394e479b..2efaf4cb5e0 100644
--- a/gdb/python/lib/gdb/styling.py
+++ b/gdb/python/lib/gdb/styling.py
@@ -22,6 +22,7 @@ try:
     from pygments import formatters, highlight, lexers
     from pygments.filters import TokenMergeFilter
     from pygments.token import Comment, Error, Text
+    from pygments.util import ClassNotFound
 
     _formatter = None
 
@@ -31,10 +32,13 @@ try:
             _formatter = formatters.TerminalFormatter()
         return _formatter
 
-    def colorize(filename, contents):
+    def colorize(filename, contents, lang):
         # Don't want any errors.
         try:
-            lexer = lexers.get_lexer_for_filename(filename, stripnl=False)
+            try:
+                lexer = lexers.get_lexer_by_name(lang, stripnl=False)
+            except ClassNotFound:
+                lexer = lexers.get_lexer_for_filename(filename, stripnl=False)
             formatter = get_formatter()
             return highlight(contents, lexer, formatter).encode(
                 gdb.host_charset(), "backslashreplace"
@@ -94,7 +98,7 @@ try:
 
 except ImportError:
 
-    def colorize(filename, contents):
+    def colorize(filename, contents, lang):
         return None
 
     def colorize_disasm(content, gdbarch):
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 2aaa30c7d8e..9536189d985 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -128,7 +128,8 @@ static bool gdbpy_check_quit_flag (const struct extension_language_defn *);
 static enum ext_lang_rc gdbpy_before_prompt_hook
   (const struct extension_language_defn *, const char *current_gdb_prompt);
 static std::optional<std::string> gdbpy_colorize
-  (const std::string &filename, const std::string &contents);
+  (const std::string &filename, const std::string &contents,
+   const std::string &lang);
 static std::optional<std::string> gdbpy_colorize_disasm
 (const std::string &content, gdbarch *gdbarch);
 static ext_lang_missing_file_result gdbpy_handle_missing_debuginfo
@@ -1295,7 +1296,8 @@ gdbpy_before_prompt_hook (const struct extension_language_defn *extlang,
 /* This is the extension_language_ops.colorize "method".  */
 
 static std::optional<std::string>
-gdbpy_colorize (const std::string &filename, const std::string &contents)
+gdbpy_colorize (const std::string &filename, const std::string &contents,
+		const std::string &lang)
 {
   if (!gdb_python_initialized)
     return {};
@@ -1329,6 +1331,13 @@ gdbpy_colorize (const std::string &filename, const std::string &contents)
       return {};
     }
 
+  gdbpy_ref<> lang_arg (PyUnicode_FromString (lang.c_str ()));
+  if (lang_arg == nullptr)
+    {
+      gdbpy_print_stack ();
+      return {};
+    }
+
   /* The pygments library, which is what we currently use for applying
      styling, is happy to take input as a bytes object, and to figure out
      the encoding for itself.  This removes the need for us to figure out
@@ -1349,6 +1358,7 @@ gdbpy_colorize (const std::string &filename, const std::string &contents)
   gdbpy_ref<> result (PyObject_CallFunctionObjArgs (hook.get (),
 						    fname_arg.get (),
 						    contents_arg.get (),
+						    lang_arg.get (),
 						    nullptr));
   if (result == nullptr)
     {
diff --git a/gdb/source-cache.c b/gdb/source-cache.c
index 30c9e619dae..216c8b98962 100644
--- a/gdb/source-cache.c
+++ b/gdb/source-cache.c
@@ -364,7 +364,8 @@ source_cache::ensure (struct symtab *s)
       if (!styled_p)
 	{
 	  std::optional<std::string> ext_contents;
-	  ext_contents = ext_lang_colorize (fullname, contents);
+	  std::string lang (language_str (s->language ()));
+	  ext_contents = ext_lang_colorize (fullname, contents, lang);
 	  if (ext_contents.has_value ())
 	    {
 	      contents = std::move (*ext_contents);
diff --git a/gdb/testsuite/gdb.python/py-source-styling-2.c b/gdb/testsuite/gdb.python/py-source-styling-2.c
new file mode 100644
index 00000000000..aaa3d694aab
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-source-styling-2.c
@@ -0,0 +1,26 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2025 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int
+main ()
+{ /* List this line.  */
+  try
+    {}
+  catch (...)
+    {}
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-source-styling-2.exp b/gdb/testsuite/gdb.python/py-source-styling-2.exp
new file mode 100644
index 00000000000..b13ee1f3d27
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-source-styling-2.exp
@@ -0,0 +1,55 @@
+# Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Compile a c++ file using a .c extension, and check that pygments uses c++
+# highlighting instead of c highlighting.
+
+require allow_python_tests
+
+load_lib gdb-python.exp
+
+standard_testfile py-source-styling-2.c
+
+set line_number [gdb_get_line_number "List this line."]
+
+set opts {}
+lappend opts debug
+lappend opts c++
+
+if { [build_executable "failed to build" $testfile $srcfile $opts] == -1 } {
+    return
+}
+
+clean_restart
+
+gdb_test_no_output "maint set gnu-source-highlight enabled off"
+
+gdb_load $binfile
+
+require {gdb_py_module_available pygments}
+
+with_ansi_styling_terminal {
+    gdb_test_no_output "set style enabled on"
+
+    gdb_test_multiple "list $line_number" "Styling of c++ keyword try" {
+	-re -wrap "  try\r\n.*" {
+	    # Unstyled.
+	    fail $gdb_test_name
+	}
+	-re -wrap "" {
+	    pass $gdb_test_name
+	}
+    }
+}

base-commit: c2f55040d34784a5c40d25f6a58615da1b4a52be
-- 
2.43.0


