Bug 30276 - [gdb/symtab] function name is _Dmain instead of "D main"
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab
Version: HEAD
Importance: P2 normal
Target Milestone: 15.1
Assignee: Tom Tromey
Reported: 2023-03-27 10:46 UTC by Tom de Vries
Modified: 2024-04-02 20:08 UTC

Description Tom de Vries 2023-03-27 10:46:45 UTC
Consider the test-case src/gdb/testsuite/gdb.dlang/simple.d, compiled with the dmd compiler and debug info:
...
$ dmd src/gdb/testsuite/gdb.dlang/simple.d -g
...

When doing start, we stop at "_Dmain ()":
...
$ gdb -q -batch simple -ex start 
Temporary breakpoint 1 at 0x43844e: file src/gdb/testsuite/gdb.dlang/simple.d, line 17.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, _Dmain () at src/gdb/testsuite/gdb.dlang/simple.d:17
17      }
...

In contrast, without debug info we have instead "D main ()":
...
$ gdb -q -batch simple -ex start 
Temporary breakpoint 1 at 0x438448
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, 0x0000000000438448 in D main ()
...

It seems that gdb could know the name, given that the debug info contains:
...
 <1><10a>: Abbrev Number: 3 (DW_TAG_subprogram)
    <10b>   DW_AT_name        : D main
    <112>   DW_AT_MIPS_linkage_name: _Dmain
...
Comment 1 Tom de Vries 2023-03-27 10:51:10 UTC
Using this patch:
...
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index c910be875a3..ea81d75b983 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -18812,7 +18812,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
       /* Fortran does not have mangling standard and the mangling does differ
         between gfortran, iFort etc.  */
       const char *physname
-       = (cu->lang () == language_fortran
+       = ((cu->lang () == language_fortran || cu->lang () == language_d)
           ? dwarf2_full_name (name, die, cu)
           : dwarf2_physname (name, die, cu));
       const char *linkagename = dw2_linkage_name (die, cu);
...
we have:
...
$ gdb -q -batch simple -ex start 
Temporary breakpoint 1 at 0x43844e: file src/gdb/testsuite/gdb.dlang/simple.d, line 17.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
...
Temporary breakpoint 1, D main () at src/gdb/testsuite/gdb.dlang/simple.d:17
17      }
...

But I have no idea whether this is correct.  Maybe this needs to be handled somehow in dwarf2_physname instead?
Comment 2 Tom de Vries 2023-03-27 11:29:58 UTC
(In reply to Tom de Vries from comment #1)
> But I have no idea whether this is correct.  Maybe this needs to be handled
> somehow in dwarf2_physname instead?

In dwarf2_physname we do:
...
dwarf2_physname (name=0x364ce9b "D main", die=0x35deb80, cu=0x2bc90f0) at /data/vries/gdb/src/gdb/dwarf2/read.c:7097
7097      struct objfile *objfile = cu->per_objfile->objfile;
(gdb) n
7098      const char *retval, *mangled = NULL, *canon = NULL;
(gdb) 
7099      int need_copy = 1;
(gdb) 
7103      if (!die_needs_namespace (die, cu))
(gdb) 
7106      if (cu->lang () != language_rust)
(gdb) 
7107        mangled = dw2_linkage_name (die, cu);
(gdb) 
7111      gdb::unique_xmalloc_ptr<char> demangled;
(gdb) 
7112      if (mangled != NULL)
(gdb) 
7114          if (cu->language_defn->store_sym_names_in_linkage_form_p ())
(gdb) 
7128              demangled = gdb_demangle (mangled, (DMGL_PARAMS | DMGL_ANSI
(gdb) 
7131          if (demangled)
(gdb) 
7135              canon = mangled;
(gdb) 
7136              need_copy = 0;
(gdb) 
7140      if (canon == NULL || check_physname)
(gdb) 
7168        retval = canon;
(gdb) 
7170      if (need_copy)
(gdb) 
7173      return retval;
(gdb) p retval
$8 = 0x364cea2 "_Dmain"
...
Comment 3 Tom de Vries 2023-03-27 11:30:19 UTC
FTR, I've also tried this:
...
diff --git a/gdb/d-lang.c b/gdb/d-lang.c
index 8d1bdd05677..53dcbab80b0 100644
--- a/gdb/d-lang.c
+++ b/gdb/d-lang.c
@@ -142,6 +142,11 @@ class d_language : public language_defn
     return d_demangle (mangled, options);
   }
 
+  bool store_sym_names_in_linkage_form_p () const override
+  {
+    return true;
+  }
+
   /* See language.h.  */
 
   bool can_print_type_offsets () const override
...
and that didn't work.
Comment 4 Tom de Vries 2023-03-27 11:37:23 UTC
(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #1)
> > But I have no idea whether this is correct.  Maybe this needs to be handled
> > somehow in dwarf2_physname instead?
> 
> In dwarf2_physname we do:
> ...
> dwarf2_physname (name=0x364ce9b "D main", die=0x35deb80, cu=0x2bc90f0) at
> /data/vries/gdb/src/gdb/dwarf2/read.c:7097
> 7097      struct objfile *objfile = cu->per_objfile->objfile;
> (gdb) n
> 7098      const char *retval, *mangled = NULL, *canon = NULL;
> (gdb) 
> 7099      int need_copy = 1;
> (gdb) 
> 7103      if (!die_needs_namespace (die, cu))
> (gdb) 
> 7106      if (cu->lang () != language_rust)
> (gdb) 
> 7107        mangled = dw2_linkage_name (die, cu);
> (gdb) 
> 7111      gdb::unique_xmalloc_ptr<char> demangled;
> (gdb) 
> 7112      if (mangled != NULL)
> (gdb) 
> 7114          if (cu->language_defn->store_sym_names_in_linkage_form_p ())
> (gdb) 
> 7128              demangled = gdb_demangle (mangled, (DMGL_PARAMS | DMGL_ANSI
> (gdb) 
> 7131          if (demangled)
> (gdb) 
> 7135              canon = mangled;
> (gdb) 
> 7136              need_copy = 0;
> (gdb) 
> 7140      if (canon == NULL || check_physname)
> (gdb) 
> 7168        retval = canon;
> (gdb) 
> 7170      if (need_copy)
> (gdb) 
> 7173      return retval;
> (gdb) p retval
> $8 = 0x364cea2 "_Dmain"
> ...

As I understand it, we try to demangle the symbol using gdb_demangle, but that fails because we're not passing DMGL_DLANG.  So this also works:
...
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index c910be875a3..3debc1c1848 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7126,7 +7126,7 @@ dwarf2_physname (const char *name, struct die_info *die, struct dwarf2_cu *cu)
             the only disadvantage remains the minimal symbol variant
             `long name(params)' does not have the proper inferior type.  */
          demangled = gdb_demangle (mangled, (DMGL_PARAMS | DMGL_ANSI
-                                             | DMGL_RET_DROP));
+                                             | DMGL_RET_DROP | DMGL_DLANG));
        }
       if (demangled)
        canon = demangled.get ();
...
But that breaks things for c++.
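
A minimal sketch of why an unconditional style flag can break other languages, and of a language-conditional alternative. Everything here is a toy model: the option bits, `toy_demangle`, `options_for` and the `Lang` enum are hypothetical names invented for illustration, not libiberty's or GDB's real API. The assumption being modelled is that an explicit style bit restricts the attempt to that one style, while no style bit means only the C++ scheme is tried.

```cpp
#include <cassert>
#include <string>

// Hypothetical option bits for this sketch only; they are not copied
// from libiberty's demangle.h.
constexpr int OPT_PARAMS = 1 << 0;
constexpr int OPT_STYLE_CXX = 1 << 1;
constexpr int OPT_STYLE_D = 1 << 2;
constexpr int OPT_STYLE_MASK = OPT_STYLE_CXX | OPT_STYLE_D;

// Hypothetical stand-in for the language of a compilation unit.
enum class Lang { CPlusPlus, D };

// Toy demangler that knows exactly one name per scheme.
// Returns an empty string when demangling fails.
// Assumed behaviour: an explicit style bit limits the attempt to that
// style; with no style bit, only the C++ scheme is tried.
std::string toy_demangle(const std::string &mangled, int options) {
  const int style = options & OPT_STYLE_MASK;
  const bool try_cxx = (style == 0) || (style & OPT_STYLE_CXX) != 0;
  const bool try_d = (style & OPT_STYLE_D) != 0;

  if (try_cxx && mangled == "_Z3foov")
    return "foo()";
  if (try_d && mangled == "_Dmain")
    return "D main";
  return std::string();
}

// Add the D style bit only when the compilation unit is written in D,
// so that C++ units keep the default behaviour.
int options_for(Lang lang) {
  int options = OPT_PARAMS;
  if (lang == Lang::D)
    options |= OPT_STYLE_D;
  return options;
}
```

In this model, passing the D bit for every compilation unit makes `_Z3foov` fail to demangle, which mirrors the C++ breakage above, whereas `options_for` only changes the outcome for D units.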
Comment 5 Tom de Vries 2023-03-27 11:53:58 UTC
To exercise this, run test-case gdb.dlang/dlang-start-2.exp; a dmd installation is not needed.
Comment 6 Tom Tromey 2023-03-27 15:05:29 UTC
It's hard to untangle the naming mess in the DWARF reader.
If demangling really has to be done (IMO it does not, except
for Ada), then it has to be done via the per-language demangler.
Otherwise you get clashes.

The new indexer uses a much simpler scheme for building names
and I think the full reader should do the same.
Comment 7 Tom Tromey 2024-03-30 20:07:41 UTC
This particular issue came up again in bug#31580.
I'll file or find a separate bug for the physname stuff
more generically.
Comment 8 Tom Tromey 2024-03-30 20:09:43 UTC
Sending a patch.
Comment 10 Sourceware Commits 2024-04-02 20:06:57 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b1741ab0dafd899889faab6e862094a325a6b83c

commit b1741ab0dafd899889faab6e862094a325a6b83c
Author: Tom Tromey <tom@tromey.com>
Date:   Sat Mar 30 13:48:30 2024 -0600

    libiberty: Invoke D demangler when --format=auto
    
    Investigating GDB PR d/31580 showed that the libiberty demangler
    doesn't automatically demangle D mangled names.  However, I think it
    should -- like C++ and Rust (new-style), D mangled names are readily
    distinguished by the leading "_D", and so the likelihood of confusion
    is low.  The other non-"auto" cases in this code are Ada (where the
    encoded form could more easily be confused by ordinary programs) and
    Java (which is long gone, but which also shared the C++ mangling and
    thus was just an output style preference).
    
    This patch also fixed another GDB bug, though of course that part
    won't apply to the GCC repository.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31580
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30276
    
    libiberty
            * cplus-dem.c (cplus_demangle): Try the D demangler with
            "auto" format.
            * testsuite/d-demangle-expected: Add --format=auto test.
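
The prefix argument in the commit message can be illustrated with a small sketch. This is not the libiberty code: `Scheme`, `has_prefix` and `guess_scheme` are hypothetical names, and a real demangler validates far more than the first two characters. It only shows that the leading "_Z" (C++), "_R" (new-style Rust) and "_D" (D) are enough to pick a candidate scheme without ambiguity, while an ordinary name matches none of them.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification used only in this sketch.
enum class Scheme { Unknown, CPlusPlus, Rust, D };

// True if S begins with PREFIX.
static bool has_prefix(const std::string &s, const char *prefix) {
  return s.rfind(prefix, 0) == 0;
}

// Guess which mangling scheme a symbol uses from its leading characters.
// A real "auto" mode would hand the name to the matching demangler and
// fall through to the next one if that demangler rejects it.
Scheme guess_scheme(const std::string &mangled) {
  if (has_prefix(mangled, "_Z"))
    return Scheme::CPlusPlus;
  if (has_prefix(mangled, "_R"))
    return Scheme::Rust;
  if (has_prefix(mangled, "_D"))
    return Scheme::D;
  return Scheme::Unknown;
}
```

Under this scheme `_Dmain` is classified as D and `_Z3foov` as C++, while a plain `main` stays unclassified.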
Comment 11 Tom Tromey 2024-04-02 20:08:57 UTC
Fixed.