Bug 30023 - Support template base names in gdb_index
Status: UNCONFIRMED
Alias: None
Product: gdb
Classification: Unclassified
Component: c++
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2023-01-18 21:57 UTC by David Blaikie
Modified: 2025-01-29 15:49 UTC


Description David Blaikie 2023-01-18 21:57:03 UTC
I recently implemented a new feature/option in Clang, -gsimple-template-names, which causes Clang to produce DWARF for C++ templates (class and function) with the base name (eg: "t1") only, instead of including the template parameters (eg: "t1<int>"). The template parameters are described by the DW_TAG_template_*_parameter DIEs and full human-readable names can be rebuilt from those, the same as is done for the namespaces the template is within, or the type of a function.
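
To make the "rebuilt from those" step concrete, here is a minimal sketch of reconstructing a full name from a simple DW_AT_name plus its template parameter DIEs. The `TemplateParam` struct and `rebuild_template_name` function are hypothetical names invented for illustration, not gdb or Clang code, and the sketch ignores details such as empty parameter packs, non-type parameter formatting, and enclosing namespaces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of one DW_TAG_template_*_parameter child DIE.
struct TemplateParam {
  std::string name;   // DW_AT_name of the parameter, e.g. "T"
  std::string value;  // printed type or constant for this instantiation, e.g. "int"
};

// Rebuild "t1<int>" from the simple name "t1" and its parameter DIEs.
// Assumption: a DIE with no parameter children is treated as a non-template,
// so this sketch cannot tell "t1" apart from "t1<>" (an empty pack).
std::string rebuild_template_name(const std::string &base,
                                  const std::vector<TemplateParam> &params) {
  assert(!base.empty());  // a named DIE is expected here
  if (params.empty())
    return base;
  std::string full = base + "<";
  for (std::size_t i = 0; i < params.size(); ++i) {
    if (i != 0)
      full += ", ";
    full += params[i].value;
  }
  full += ">";
  return full;
}
```

For the example later in this report, `rebuild_template_name("demo_type_templ", {TemplateParam{"T", "int"}})` yields "demo_type_templ<int>".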

(Side note: GCC already uses simple names for alias templates, but doesn't include template parameters. For example, "template<typename T> using z = y<T>; z<int> var;" produces a DW_TAG_typedef named "z" with no template parameters (though the DW_AT_type correctly refers to "y<int>"). The same goes for variable templates, which are also missing template parameters: https://godbolt.org/z/WKvc3oxah has no mentions of "T1" and "V1". These are presumably just bugs, but they may raise interesting questions about moving towards simplified template names more generally.

Clang (without -gsimple-template-names) matches GCC in using the simple name for the variable template, but includes the DW_TAG_template_*_parameter. Clang uses the unsimplified name for the alias template ("y<int>") but does not include a DW_TAG_template_*_parameter.)

In any case - all this mostly works, but gdb's index handling doesn't cope well with this situation. Specifically, if the index includes the simplified name, gdb doesn't resolve the type from declaration to definition across files. If the CU containing the definition happens to get loaded for some other reason, GDB does link the declaration and definition correctly.

Here's an example:
a.h:
```
template<typename T>
struct demo_type_templ {
  int member;
};
demo_type_templ<int> *get_d();
```
a.cpp:
```
#include "a.h"
int main() {
  demo_type_templ<int> *d = get_d();
  return 0;
}
```
b.cpp:
```
#include "a.h"
demo_type_templ<int> *get_d() {
  static demo_type_templ<int> d = {3};
  return &d;
}
```
```
$ clang++ -ggnu-pubnames -fuse-ld=lld -Wl,--gdb-index a.cpp b.cpp -g -o demo -gsimple-template-names
$ gdb -batch -ex "b 4" -ex r -ex "ptype *d" -ex "quit" ./demo
...
Breakpoint 1, main () at a.cpp:4
4         return 0;
type = struct demo_type_templ<int> {
    <incomplete type>
}
```
Without the index:
```
Breakpoint 1, main () at a.cpp:4
4         return 0;
type = struct demo_type_templ<int> [with T = int] {
    T member;
}
```

Comparing the index generated by `gdb-add-index` (good.txt, version 8) with the one created from Clang's output (via -ggnu-pubnames and lld's -Wl,--gdb-index; bad.txt, version 7):

```
--- good.txt    2023-01-18 21:52:29.705950886 +0000
+++ bad.txt     2023-01-18 21:52:11.917974796 +0000
@@ -1,11 +1,11 @@
 
 .gdb_index
-  Version             : 0x00000008
+  Version             : 0x00000007
   CU list offset      : 0x00000018
   Address area offset : 0x00000038
   Symbol table offset : 0x00000038
   Constant pool offset: 0x00002060
-  section size        : 0x000020a4
+  section size        : 0x000020b4
   CU list. array length: 2 format: [entry#] cuoffset culength
     [   0] 0x00000000 0x00000051
     [   1] 0x00000051 0x0000005f
@@ -20,8 +20,11 @@
   Symbol table: length 1024 format: [entry#] symindex cuindex [type] "name" or 
                            format: [entry#]  "name" , list of  cuindex [type]
   [ 106]   1 [global  function(3) ] "get_d"
-  [ 196]   1 [global  type(1)     ] "demo_type_templ<int>"
+  [ 258]   1 [global  type(1)     ] "demo_type_templ"
   [ 489]   0 [global  function(3) ] "main"
-  [ 754]   0 [static  type(1)     ] "int"
+  [ 754] "int"
+            0 [static  type(1)     ]
+            1 [static  type(1)     ]
+  [ 973]   1 [static  var-enum(2) ] "get_d::d"
 
```

So, a few differences, but presumably the critical one is the name for the template instantiation.
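
To illustrate why the name difference matters, here is a small sketch of the name hash described in the GDB manual's documentation of the .gdb_index format. The function names `ascii_lower`, `gdb_index_hash` and `first_slot` are made up for this sketch, and it only computes the first slot probed (collision handling is omitted). For the 1024-slot table above it reproduces the slots shown in the dump: the full name lands in slot 196 and the simple name in slot 258, so a lookup keyed on "demo_type_templ<int>" starts at a slot that is empty in the lld-generated index.

```cpp
#include <cstdint>

// ASCII-only stand-in for tolower(), written out so the hash can be constexpr.
constexpr unsigned char ascii_lower(unsigned char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c - 'A' + 'a') : c;
}

// Name hash as documented for .gdb_index; index versions 5 and later
// lower-case each character before hashing.
constexpr std::uint32_t gdb_index_hash(const char *name) {
  std::uint32_t r = 0;
  for (; *name != '\0'; ++name) {
    const std::uint32_t c = ascii_lower(static_cast<unsigned char>(*name));
    r = r * 67u + c - 113u;  // unsigned arithmetic wraps modulo 2^32
  }
  return r;
}

// First slot probed in a symbol table of `slots` entries (a power of two).
constexpr std::uint32_t first_slot(const char *name, std::uint32_t slots) {
  return gdb_index_hash(name) & (slots - 1u);
}

// Slots taken from the index dump in this report (table length 1024).
static_assert(first_slot("demo_type_templ<int>", 1024) == 196, "full name slot");
static_assert(first_slot("demo_type_templ", 1024) == 258, "simple name slot");
static_assert(first_slot("get_d", 1024) == 106, "get_d slot");
```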

Clang could be made to produce the unsimplified name for -ggnu-pubnames, but that would give back some of the space savings of this mode, and it'd be nice to have more consistency between DW_AT_name and the index entry.

It'd also be good to agree on this between clang/gcc/gdb/lldb, especially for DWARFv5 .debug_names, so that those indexes are generally usable.

The way this was addressed in lldb was to do a fallback query: query with the unsimplified name if that's what's available, then try again with the simplified name (and prune the results based on the DIEs returned). That way both full and simplified names can be supported and mixed to some degree, so long as the index matches the content it's indexing (eg: an index entry "t1" should go to a DIE named "t1", an index entry "t1<int>" should go to a DIE named "t1<int>").
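
A rough sketch of that fallback query follows. This is not lldb's or gdb's actual code: `Die`, `Index`, `base_name` and `lookup` are hypothetical names, the index is modelled as a plain multimap, and `full_name` is assumed to have already been rebuilt from the DIE's template parameter children.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DIE: its DW_AT_name plus the full name rebuilt
// from its DW_TAG_template_*_parameter children.
struct Die {
  std::string at_name;
  std::string full_name;
};

// Hypothetical index: name as stored in the index -> matching DIEs.
using Index = std::multimap<std::string, Die>;

// "t1<int>" -> "t1". Simplification: cuts at the first '<', so names such as
// "operator<" or "ns<int>::inner" are not handled by this sketch.
std::string base_name(const std::string &name) {
  return name.substr(0, name.find('<'));
}

std::vector<Die> lookup(const Index &index, const std::string &query) {
  assert(!query.empty());
  std::vector<Die> results;

  // First query: the name exactly as requested (e.g. "t1<int>").
  auto range = index.equal_range(query);
  for (auto it = range.first; it != range.second; ++it)
    results.push_back(it->second);

  // Fallback query: the simplified name, pruned by the rebuilt full name so
  // that "t1<float>" is not returned for a "t1<int>" request.
  const std::string base = base_name(query);
  if (base != query) {
    range = index.equal_range(base);
    for (auto it = range.first; it != range.second; ++it)
      if (it->second.full_name == query)
        results.push_back(it->second);
  }
  return results;
}
```

Running both queries, rather than stopping after the first hit, is what lets objects built with and without simplified names be mixed in one program, at the cost of a second probe for every templated name.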