Bug 32160 - [gdb/symtab] Parent map: die parent or scope parent?
Summary: [gdb/symtab] Parent map: die parent or scope parent?
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 15.2
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 29366
  Show dependency treegraph
 
Reported: 2024-09-10 15:17 UTC by Tom de Vries
Modified: 2024-09-15 15:13 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
tentative patch (1.47 KB, patch)
2024-09-11 07:41 UTC, Tom de Vries
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tom de Vries 2024-09-10 15:17:57 UTC
We have commit:
...
commit 4e417d7bb1c7d8b8a73b73b0b7c051f0a6e1b6b6
Author: Tom Tromey <tom@tromey.com>
Date:   Fri Jan 12 18:01:00 2024 -0700

    Change handling of DW_TAG_enumeration_type in DWARF scanner
    
    Currently the DWARF scanner will enter enumeration constants into the
    same namespace as the DW_TAG_enumeration_type itself.  This is the
    right thing to do, but the implementation may result in strange
    entries being added to the addrmap that maps DIE ranges to entries.
    
    This came up when debugging an earlier version of this series; and
    while I don't think this should impact the current series, it seems
    better to clean this up anyway.
    
    In the new code, rather than pass the "wrong" scope down through
    recursive calls to the scanner, the correct scope is always passed,
    and then the parent handling is done when creating the enumerator
    entry.

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 4c97fe12307..b7e04a1ebf3 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16468,6 +16468,12 @@ cooked_indexer::index_dies (cutu_reader *reader,
          flags &= ~IS_STATIC;
          flags |= parent_entry->flags & IS_STATIC;
        }
+      /* If the parent is an enum, but not an enum class, then use the
+        grandparent instead.  */
+      if (this_parent_entry != nullptr
+         && this_parent_entry->tag == DW_TAG_enumeration_type
+         && !is_enum_class)
+       this_parent_entry = this_parent_entry->get_parent ();
 
       if (abbrev->tag == DW_TAG_namespace
          && m_language == language_cplus
@@ -16530,15 +16536,7 @@ cooked_indexer::index_dies (cutu_reader *reader,
              break;
 
            case DW_TAG_enumeration_type:
-             /* We need to recurse even for an anonymous enumeration.
-                Which scope we record as the parent scope depends on
-                whether we're reading an "enum class".  If so, we use
-                the enum itself as the parent, yielding names like
-                "enum_class::enumerator"; otherwise we inject the
-                names into our own parent scope.  */
-             info_ptr = recurse (reader, info_ptr,
-                                 is_enum_class ? this_entry : parent_entry,
-                                 fully);
+             info_ptr = recurse (reader, info_ptr, this_entry, fully);
              continue;
 
            case DW_TAG_module:
...

The commit caused two regressions, PR31900 and PR32159.

A way to address these is to re-implement the commit, which requires a more precise formulation of the problem that the commit is trying to address, so let's reserve this PR for that.

Now consider current trunk, and the test-case:
...
$ cat test.c
namespace ns
{
  enum e
  {
    val1 = 0x1,
  };
}

int
main (void)
{
  return (int)ns::val1;
}
$ g++ test.c -g
$ ./a.out; echo $?
1
...
with debug info:
...
 <1><d3>: Abbrev Number: 2 (DW_TAG_namespace)
    <d4>   DW_AT_name        : ns
    <d7>   DW_AT_decl_file   : 1
    <d8>   DW_AT_decl_line   : 1
    <d9>   DW_AT_sibling     : <0xf0>
 <2><dd>: Abbrev Number: 3 (DW_TAG_enumeration_type)
    <de>   DW_AT_name        : e
    <e0>   DW_AT_encoding    : 7        (unsigned)
    <e1>   DW_AT_byte_size   : 4
    <e2>   DW_AT_type        : <0xf0>
    <e6>   DW_AT_decl_file   : 1
    <e7>   DW_AT_decl_line   : 3
 <3><e8>: Abbrev Number: 4 (DW_TAG_enumerator)
    <e9>   DW_AT_name        : (indirect string, offset: 0x1ea): val1
    <ed>   DW_AT_const_value : 1
 <3><ee>: Abbrev Number: 0
 <2><ef>: Abbrev Number: 0
...

After adding these debug prints:
...
diff --git a/gdb/dwarf2/parent-map.h b/gdb/dwarf2/parent-map.h
index 5307d4df45b..bfb00d310ae 100644
--- a/gdb/dwarf2/parent-map.h
+++ b/gdb/dwarf2/parent-map.h
@@ -83,12 +83,16 @@ class parent_map
     return addr_type (value);
   }
 
+  void dump_entry (addr_type start, addr_type end,
+                  const cooked_index_entry *parent);
+
   /* Add a new entry to this map.  DIEs from START to END, inclusive,
      are mapped to PARENT.  */
   void add_entry (addr_type start, addr_type end,
                  const cooked_index_entry *parent)
   {
     gdb_assert (parent != nullptr);
+    dump_entry (start, end, parent);
     m_map.set_empty (start, end, (void *) parent);
   }
 
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 9ae47188d88..fd3d5a0484e 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16390,6 +16390,16 @@ cooked_indexer::index_imported_unit (cutu_reader *reader,
   return info_ptr;
 }
 
+void
+parent_map::dump_entry (addr_type start, addr_type end,
+                       const cooked_index_entry *parent)
+{
+  fprintf (stderr, "[0x%llx - 0x%llx]: 0x%llx\n",
+          (unsigned long long)start,
+          (unsigned long long)end,
+          (unsigned long long)parent->die_offset);
+}
+
 const gdb_byte *
 cooked_indexer::recurse (cutu_reader *reader,
                         const gdb_byte *info_ptr,
...
we get:
...
$ gdb -q -batch a.out
[0xde - 0xee]: 0xdd
[0xd4 - 0xef]: 0xd3
...

Let's look at what that means.

The first line, "[0xde - 0xee]: 0xdd" declares that any DIE in the range [0xde - 0xee]  has parent entry with die offset 0xdd: enumeration type e.

The second line, "[0xd4 - 0xef]: 0xd3" declares that any DIE in the range [0xd4 - 0xef] that hasn't been set by the previous line has parent entry with die offset 0xd3: namespace ns.

The result of this is that the parent map establishes the parent relationships:
- ns <- e
- e  <- val1.

Now lets revert the commit:
...
$ git revert 4e417d7bb1c7
...
and redo the gdb run:
...
$ gdb -q -batch a.out
[0xd4 - 0xee]: 0xd3
[0xd4 - 0xef]: 0xd3
...

This instead establishes the parent relationships:
- ns <- e
- ns <- val1

It's good to note at this point that the entry for val1 has ns as parent:
...
    [8] ((cooked_index_entry *) 0x7f091c006390)
    name:       val1
    canonical:  val1
    qualified:  ns::val1
    DWARF tag:  DW_TAG_enumerator
    flags:      0x0 []
    DIE offset: 0xe8
    parent:     ((cooked_index_entry *) 0x7f091c006330) [ns]
...
both with and without the patch reverted.

So, this is really about what a parent map is.

Is it a DIE parent map, or is a scope parent map?

If it is a DIE parent map, you want "e <- val1".

If it is a scope parent map, you want "ns <- val1".

The answer I think should lie in how it is used, and my understanding is that it is used to find parent entries of entries with deferred parents.

Given that the parent of entries seem to be scope parents, it would make sense if we'd store scope parents as well in the parent map.

So, my current thinking is that this describes another regression by the commit.

This regression will only become visible if we manage to construct a test-case where we lookup the parent for val1.
Comment 1 Tom de Vries 2024-09-10 15:30:35 UTC
I attempted to write a test-case to trigger this:
...
diff --git a/gdb/testsuite/gdb.dwarf2/enum-type.exp b/gdb/testsuite/gdb.dwarf2/enum-type.exp
index 394d287738b..76e4b552ef0 100644
--- a/gdb/testsuite/gdb.dwarf2/enum-type.exp
+++ b/gdb/testsuite/gdb.dwarf2/enum-type.exp
@@ -30,7 +30,7 @@ Dwarf::assemble $asm_file {
                 {DW_AT_name     $srcfile}
                 {DW_AT_comp_dir /tmp}
         } {
-           declare_labels integer_label uinteger_label
+           declare_labels integer_label uinteger_label forward
 
             integer_label: DW_TAG_base_type {
                 {DW_AT_byte_size 4 DW_FORM_sdata}
@@ -63,6 +63,25 @@ Dwarf::assemble $asm_file {
                    {DW_AT_const_value 2 DW_FORM_sdata}
                }
            }
+
+           DW_TAG_enumeration_type {
+               {DW_AT_specification :$forward}
+           } {
+               DW_TAG_enumerator {
+                   {DW_AT_name val1}
+                   {DW_AT_const_value 1 DW_FORM_sdata}
+               }
+           }
+
+           DW_TAG_namespace {
+               {DW_AT_name ns}
+           } {
+               forward: DW_TAG_enumeration_type {
+                   {DW_AT_name e}
+                   {DW_AT_type :$uinteger_label}
+                   {DW_AT_declaration 1 flag}
+               }
+           }
        }
     }
 }
...
and with lldb I get:
...
$ lldb -batch outputs/gdb.dwarf2/enum-type/enum-type -o "p ns::e::val1"
(lldb) target create "outputs/gdb.dwarf2/enum-type/enum-type"
Current executable set to '/home/vries/gdb/outputs/gdb.dwarf2/enum-type/enum-type' (x86_64).
(lldb) p ns::e::val1
(ns::e) $0 = val1
...
and with gdb:
...
$ gdb -q -batch outputs/gdb.dwarf2/enum-type/enum-type
src/gdb/dwarf2/cooked-index.h:214: internal-error: get_parent: Assertion `(flags & IS_PARENT_DEFERRED) == 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
...

Bypassing the cooked index avoids the segfault:
...
$ gdb -q -batch -readnow outputs/gdb.dwarf2/enum-type/enum-type -ex "print ns::val1"
No symbol "val1" in namespace "ns".
...
but doesn't result in a scoped symbol, instead we have:
...
$ gdb -q -batch -readnow outputs/gdb.dwarf2/enum-type/enum-type -ex "print val1"
$1 = val1
...

Reverting commit 4e417d7bb1c7 fixes the segfault, but produces a parentless entry:
...
    [11] ((cooked_index_entry *) 0x7fb254002fe0)
    name:       val1
    canonical:  val1
    qualified:  val1
    DWARF tag:  DW_TAG_enumerator
    flags:      0x2 [IS_STATIC]
    DIE offset: 0x104
    parent:     ((cooked_index_entry *) 0)
...
Comment 2 Tom de Vries 2024-09-11 06:15:22 UTC
(In reply to Tom de Vries from comment #1)
> Bypassing the cooked index avoids the segfault:
> ...
> $ gdb -q -batch -readnow outputs/gdb.dwarf2/enum-type/enum-type -ex "print
> ns::val1"
> No symbol "val1" in namespace "ns".
> ...
> but doesn't result in a scoped symbol, instead we have:
> ...
> $ gdb -q -batch -readnow outputs/gdb.dwarf2/enum-type/enum-type -ex "print
> val1"
> $1 = val1
> ...

Ok, that's a test-case error, needed to set the CU language to c++.

After fixing that, we have:
...
$ gdb -q -batch -readnow outputs/gdb.dwarf2/enum-type/enum-type -ex "print ns::val1" 
$1 = ns::val1
...
Comment 3 Tom de Vries 2024-09-11 06:44:53 UTC
(In reply to Tom de Vries from comment #1)
> Reverting commit 4e417d7bb1c7 fixes the segfault, but produces a parentless
> entry:
> ...
>     [11] ((cooked_index_entry *) 0x7fb254002fe0)
>     name:       val1
>     canonical:  val1
>     qualified:  val1
>     DWARF tag:  DW_TAG_enumerator
>     flags:      0x2 [IS_STATIC]
>     DIE offset: 0x104
>     parent:     ((cooked_index_entry *) 0)
> ...

The problem originates here in passing the parent_entry:
...
              info_ptr = recurse (reader, info_ptr,
                                  is_enum_class ? this_entry : parent_entry,
                                  fully);
...
when handling die 0xff:
...
 <1><ff>: Abbrev Number: 8 (DW_TAG_enumeration_type)
    <100>   DW_AT_specification: <0x110>
 <2><104>: Abbrev Number: 9 (DW_TAG_enumerator)
    <105>   DW_AT_name        : val1
    <10a>   DW_AT_const_value : 1
 <2><10b>: Abbrev Number: 0
 <1><10c>: Abbrev Number: 10 (DW_TAG_namespace)
    <10d>   DW_AT_name        : ns
 <2><110>: Abbrev Number: 11 (DW_TAG_enumeration_type)
    <111>   DW_AT_name        : e
    <113>   DW_AT_type        : <0xd2>
    <117>   DW_AT_declaration : 1
 <2><118>: Abbrev Number: 0
 <1><119>: Abbrev Number: 0
...

Variable is_enum_class is false, so we use parent_entry, which is nullptr.

But this_entry is:
...
$13 = {<allocate_on_obstack<cooked_index_entry>> = {<No data fields>},
  name = 0x2e26751 "e", canonical = 0x0, tag = DW_TAG_enumeration_type,
  flags = {m_enum_value = IS_PARENT_DEFERRED}, lang = language_cplus,
  die_offset = (unknown: 0xff), per_cu = 0x7fffe8002e60, m_parent_entry = {
    resolved = 0x110, deferred = (unknown: 0x110)}}
...
and we want the parent of this_entry, which is not nullptr, but deferred.
Comment 4 Tom de Vries 2024-09-11 07:41:53 UTC
Created attachment 15699 [details]
tentative patch

This otherwise untested patch gets us:
...
    [11] ((cooked_index_entry *) 0x7fcd54002fe0)
    name:       val1
    canonical:  val1
    qualified:  ns::val1
    DWARF tag:  DW_TAG_enumerator
    flags:      0x0 []
    DIE offset: 0x104
    parent:     ((cooked_index_entry *) 0x7fcd54003010) [ns]
...
Comment 5 Tom de Vries 2024-09-12 11:57:40 UTC
(In reply to Tom de Vries from comment #1)
> Bypassing the cooked index avoids the segfault:

And I've submitted a patch doing that as fix ( https://sourceware.org/pipermail/gdb-patches/2024-September/211683.html ).
Comment 6 Tom de Vries 2024-09-12 11:59:20 UTC
(In reply to Tom de Vries from comment #4)
> Created attachment 15699 [details]
> tentative patch
> 
> This otherwise untested patch gets us:
> ...
>     [11] ((cooked_index_entry *) 0x7fcd54002fe0)
>     name:       val1
>     canonical:  val1
>     qualified:  ns::val1
>     DWARF tag:  DW_TAG_enumerator
>     flags:      0x0 []
>     DIE offset: 0x104
>     parent:     ((cooked_index_entry *) 0x7fcd54003010) [ns]
> ...

Submitted here ( https://sourceware.org/pipermail/gdb-patches/2024-September/211685.html ).
Comment 7 Sourceware Commits 2024-09-14 12:10:08 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a2860473ef13cfebcd9fddc067b7b36ca56c6b81

commit a2860473ef13cfebcd9fddc067b7b36ca56c6b81
Author: Tom de Vries <tdevries@suse.de>
Date:   Sat Sep 14 14:09:35 2024 +0200

    [gdb/symtab] Revert "Change handling of DW_TAG_enumeration_type in DWARF scanner"
    
    After adding dwarf assembly to test-case gdb.dwarf2/enum-type.exp that adds
    this debug info:
    ...
     <1><11f>: Abbrev Number: 3 (DW_TAG_enumeration_type)
        <120>   DW_AT_specification: <0x130>
     <2><124>: Abbrev Number: 4 (DW_TAG_enumerator)
        <125>   DW_AT_name        : val1
        <12a>   DW_AT_const_value : 1
     <2><12b>: Abbrev Number: 0
     <1><12c>: Abbrev Number: 5 (DW_TAG_namespace)
        <12d>   DW_AT_name        : ns
     <2><130>: Abbrev Number: 6 (DW_TAG_enumeration_type)
        <131>   DW_AT_name        : e
        <133>   DW_AT_type        : <0x118>
        <137>   DW_AT_declaration : 1
    ...
    I run into an assertion failure:
    ...
    (gdb) file enum-type^M
    Reading symbols from enum-type...^M
    cooked-index.h:214: internal-error: get_parent: \
      Assertion `(flags & IS_PARENT_DEFERRED) == 0' failed.^M
    ...
    
    This was reported in PR32160 comment 1.
    
    This is a regression since commit 4e417d7bb1c ("Change handling of
    DW_TAG_enumeration_type in DWARF scanner").
    
    Fix this by reverting the commit.
    
    [ Also drop the kfails for PR31900 and PR32158, which are regressions by that
    same commit. ]
    
    That allows us to look at the output of "maint print objfiles", and for val1
    we get an entry without parent:
    ...
        [27] ((cooked_index_entry *) 0x7fbbb4002ef0)
        name:       val1
        canonical:  val1
        qualified:  val1
        DWARF tag:  DW_TAG_enumerator
        flags:      0x0 []
        DIE offset: 0x124
        parent:     ((cooked_index_entry *) 0)
    ...
    which is incorrect, as noted in that same comment, but an improvement over the
    assertion failure, and I don't think that ever worked.  This is to be
    addressed in a follow-up patch.
    
    Reverting the commit begs the question: what was it trying to fix in the first
    place, and do we need a different fix?  I've investigated this and filed
    PR32160 to track this.
    
    My guess is that the commit was based on a misunderstand of what we track
    in cooked_indexer::m_die_range_map.
    
    Each DIE has two types of parent DIEs:
    - a DIE that is the parent as indicated by the tree structure in which DIEs
      occur, and
    - a DIE that represent the parent scope.
    
    In most cases, these two are the same, but some times they're not.
    
    The debug info above demonstrates such a case.  The DIE at 0x11f:
    - has a tree-parent: the DIE representing the CU, and
    - has a scope-parent: DIE 0x12c representing namespace ns.
    
    In cooked_indexer::m_die_range_map, we track scope-parents, and the commit
    tried to add a tree-parent instead.
    
    So, I don't think we need a different fix, and propose we backport the reversal
    for gdb 15.2.
    
    Tested on x86_64-linux.
    
    Approved-By: Tom Tromey <tom@tromey.com>
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31900
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32158
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32160
Comment 8 Sourceware Commits 2024-09-15 13:31:22 UTC
The gdb-15-branch branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=21528df40e76ade452960d0325d86803e861ec3b

commit 21528df40e76ade452960d0325d86803e861ec3b
Author: Tom de Vries <tdevries@suse.de>
Date:   Sun Sep 15 15:30:53 2024 +0200

    [gdb/symtab] Revert "Change handling of DW_TAG_enumeration_type in DWARF scanner"
    
    After adding dwarf assembly to test-case gdb.dwarf2/enum-type.exp that adds
    this debug info:
    ...
     <1><11f>: Abbrev Number: 3 (DW_TAG_enumeration_type)
        <120>   DW_AT_specification: <0x130>
     <2><124>: Abbrev Number: 4 (DW_TAG_enumerator)
        <125>   DW_AT_name        : val1
        <12a>   DW_AT_const_value : 1
     <2><12b>: Abbrev Number: 0
     <1><12c>: Abbrev Number: 5 (DW_TAG_namespace)
        <12d>   DW_AT_name        : ns
     <2><130>: Abbrev Number: 6 (DW_TAG_enumeration_type)
        <131>   DW_AT_name        : e
        <133>   DW_AT_type        : <0x118>
        <137>   DW_AT_declaration : 1
    ...
    I run into an assertion failure:
    ...
    (gdb) file enum-type^M
    Reading symbols from enum-type...^M
    cooked-index.h:214: internal-error: get_parent: \
      Assertion `(flags & IS_PARENT_DEFERRED) == 0' failed.^M
    ...
    
    This was reported in PR32160 comment 1.
    
    This is a regression since commit 4e417d7bb1c ("Change handling of
    DW_TAG_enumeration_type in DWARF scanner").
    
    Fix this by reverting the commit.
    
    [ Also drop the kfails for PR31900 and PR32158, which are regressions by that
    same commit. ]
    
    That allows us to look at the output of "maint print objfiles", and for val1
    we get an entry without parent:
    ...
        [27] ((cooked_index_entry *) 0x7fbbb4002ef0)
        name:       val1
        canonical:  val1
        qualified:  val1
        DWARF tag:  DW_TAG_enumerator
        flags:      0x0 []
        DIE offset: 0x124
        parent:     ((cooked_index_entry *) 0)
    ...
    which is incorrect, as noted in that same comment, but an improvement over the
    assertion failure, and I don't think that ever worked.  This is to be
    addressed in a follow-up patch.
    
    Reverting the commit begs the question: what was it trying to fix in the first
    place, and do we need a different fix?  I've investigated this and filed
    PR32160 to track this.
    
    My guess is that the commit was based on a misunderstand of what we track
    in cooked_indexer::m_die_range_map.
    
    Each DIE has two types of parent DIEs:
    - a DIE that is the parent as indicated by the tree structure in which DIEs
      occur, and
    - a DIE that represent the parent scope.
    
    In most cases, these two are the same, but some times they're not.
    
    The debug info above demonstrates such a case.  The DIE at 0x11f:
    - has a tree-parent: the DIE representing the CU, and
    - has a scope-parent: DIE 0x12c representing namespace ns.
    
    In cooked_indexer::m_die_range_map, we track scope-parents, and the commit
    tried to add a tree-parent instead.
    
    So, I don't think we need a different fix, and propose we backport the reversal
    for gdb 15.2.
    
    Tested on x86_64-linux.
    
    Approved-By: Tom Tromey <tom@tromey.com>
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31900
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32158
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32160
    (cherry picked from commit a2860473ef13cfebcd9fddc067b7b36ca56c6b81)
Comment 9 Tom de Vries 2024-09-15 15:13:11 UTC
The assertion failure reported here has been fixed on trunk and gdb-15-branch.