This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [Dwarf-Discuss] DWARF: Hierarchies of abstract and concrete DIE instance trees


On 11/22/2017 12:56 PM, Simon Marchi wrote:
On 2017-09-15 08:02 AM, Simon Marchi wrote:
Hi all,

First, a bit of context.  I'm currently investigating why GDB doesn't show
optimized out variables in "info locals".  The fix appears simple [1], but one
test case starts failing.  The symptom is that the local variable of func1, which
is inlined in main, appears twice (at least when compiled with GCC 5.4 and 7.1):

(gdb) bt
#0  bar () at /home/emaisin/src/binutils-gdb/gdb/testsuite/gdb.opt/inline-markers.c:27
#1  0x00000000004005b0 in func1 (arg1=0) at /home/emaisin/src/binutils-gdb/gdb/testsuite/gdb.opt/inline-locals.c:38
#2  main () at /home/emaisin/src/binutils-gdb/gdb/testsuite/gdb.opt/inline-locals.c:55
(gdb) frame
#1  0x00000000004005b0 in func1 (arg1=0) at /home/emaisin/src/binutils-gdb/gdb/testsuite/gdb.opt/inline-locals.c:38
(gdb) info locals
array = {... the right value ... }
array = <optimized out>

Note that in this case, array is not actually optimized out.  What I'm seeing
when debugging GDB is that it creates two "array" symbols while parsing the
DWARF debug info.  The interesting parts of the DWARF are:

- the abstract instance tree (thing that gets inlined)

  <1><29>: Abbrev Number: 2 (DW_TAG_subprogram)
     <2a>   DW_AT_external    : 1
     <2a>   DW_AT_name        : (indirect string, offset: 0x70): func1
     <2e>   DW_AT_decl_file   : 1
     <2f>   DW_AT_decl_line   : 32
     <30>   DW_AT_prototyped  : 1
     <30>   DW_AT_type        : <0x50>
     <34>   DW_AT_inline      : 3        (declared as inline and inlined)
     <35>   DW_AT_sibling     : <0x50>
  ...
  <2><44>: Abbrev Number: 4 (DW_TAG_variable)
     <45>   DW_AT_name        : (indirect string, offset: 0x50): array
     <49>   DW_AT_decl_file   : 1
     <4a>   DW_AT_decl_line   : 34
     <4b>   DW_AT_type        : <0x57>

- the concrete instance tree (the place where it gets inlined)

  <2><9e>: Abbrev Number: 11 (DW_TAG_inlined_subroutine)
     <9f>   DW_AT_abstract_origin: <0x29>
     <a3>   DW_AT_low_pc      : 0x400585
     <ab>   DW_AT_high_pc     : 0x4c
     <b3>   DW_AT_call_file   : 1
     <b4>   DW_AT_call_line   : 55
  ...
  <3><be>: Abbrev Number: 13 (DW_TAG_lexical_block)
     <bf>   DW_AT_low_pc      : 0x400585
     <c7>   DW_AT_high_pc     : 0x4c
  <4><cf>: Abbrev Number: 14 (DW_TAG_variable)
     <d0>   DW_AT_abstract_origin: <0x44>
     <d4>   DW_AT_location    : 3 byte block: 91 e0 7d   (DW_OP_fbreg: -288)


The interesting thing here is that the hierarchies of the abstract and concrete
DIE trees are not exactly the same.  In the abstract tree the variable is a
direct child of the subprogram.  In the concrete tree, a lexical block is
inserted between the inlined subroutine and the variable.  For the visuals:

                                             child of
Abstract tree:  subprogram <------------------------------------- variable
                     A                                                A
                     |                                                |
         instance of |                                    instance of |
                     |                                                |
                     |         child of                  child of     |
Concrete tree:  inlined sub <---------- lexical block <---------- variable

When GDB parses that DWARF, it first creates a symbol when visiting the
concrete instance of the variable.  It then tries to inherit everything from
the inlined subroutine's abstract origin (including children) that wasn't
already explicitly referenced by its children.  For some reason, GDB doesn't
properly track that the abstract variable has already been referenced, and
creates a second symbol.  That symbol, coming from the abstract variable DIE,
doesn't have a location, which is why it appears as "optimized out".
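
A toy model of the inheritance step described above may help.  It is not
GDB's implementation, and the message does not establish why GDB misses the
reference; the sketch only shows how the set of "already referenced" abstract
DIEs decides what gets inherited a second time.  The `Die` class and both
function names are hypothetical.  The offsets come from the dump above, and
the DIEs elided by "..." are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Die:
    offset: int
    tag: str
    name: Optional[str] = None
    abstract_origin: Optional[int] = None
    children: List["Die"] = field(default_factory=list)


def referenced_origins(die: Die, recurse: bool) -> Set[int]:
    """Collect abstract origins referenced below `die`.

    With recurse=False only direct children are examined; with
    recurse=True every descendant is examined.
    """
    found: Set[int] = set()
    for child in die.children:
        if child.abstract_origin is not None:
            found.add(child.abstract_origin)
        if recurse:
            found |= referenced_origins(child, recurse)
    return found


def inherit_missing(abstract: Die, concrete: Die, recurse: bool) -> List[str]:
    """Names of abstract children the concrete tree did not reference."""
    seen = referenced_origins(concrete, recurse)
    return [c.name or "<anonymous>" for c in abstract.children
            if c.offset not in seen]


# Abstract instance: subprogram <0x29> with variable <0x44> as a direct child.
abstract = Die(0x29, "subprogram", name="func1",
               children=[Die(0x44, "variable", name="array")])

# Concrete instance: lexical block <0xbe> sits between <0x9e> and <0xcf>.
concrete = Die(0x9E, "inlined_subroutine", abstract_origin=0x29, children=[
    Die(0xBE, "lexical_block", children=[
        Die(0xCF, "variable", abstract_origin=0x44),
    ]),
])

print(inherit_missing(abstract, concrete, recurse=False))  # ['array']
print(inherit_missing(abstract, concrete, recurse=True))   # []
```

In this model, looking only at direct children inherits "array" a second time,
while looking at all descendants does not.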

My question is: should the hierarchies of the abstract and concrete trees match
exactly (IOW, should the concrete child's abstract origin's parent always be
the same as its parent's abstract origin)?  And therefore, is this example
"legal" DWARF and a GDB bug, or invalid DWARF and a GCC bug (or neither, and
I'm completely lost)?
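
The condition in the question can be written as a small check.  This is only
a sketch of the property being asked about, not a rule taken from the DWARF
specification.  The `Die` tuple and `hierarchy_mismatches` are hypothetical
names.  The first table uses the offsets from the dump above; the second is
the same table with the lexical block removed, built from the remark that
Clang does not emit it rather than from real Clang output.  Parents outside
the dump are recorded as None.

```python
from typing import Dict, List, NamedTuple, Optional


class Die(NamedTuple):
    tag: str
    parent: Optional[int]           # offset of the parent DIE, if modelled
    abstract_origin: Optional[int]  # DW_AT_abstract_origin target, if any


def hierarchy_mismatches(dies: Dict[int, Die]) -> List[int]:
    """Offsets of DIEs whose abstract origin's parent differs from
    their parent's abstract origin."""
    bad = []
    for offset, die in dies.items():
        if die.abstract_origin is None or die.parent is None:
            continue
        origin_parent = dies[die.abstract_origin].parent
        parent_origin = dies[die.parent].abstract_origin
        if origin_parent != parent_origin:
            bad.append(offset)
    return bad


# Shape of the GCC output shown above.
WITH_BLOCK = {
    0x29: Die("subprogram", None, None),
    0x44: Die("variable", 0x29, None),
    0x9E: Die("inlined_subroutine", None, 0x29),
    0xBE: Die("lexical_block", 0x9E, None),
    0xCF: Die("variable", 0xBE, 0x44),
}

# Assumed shape without the lexical block.
WITHOUT_BLOCK = {
    0x29: Die("subprogram", None, None),
    0x44: Die("variable", 0x29, None),
    0x9E: Die("inlined_subroutine", None, 0x29),
    0xCF: Die("variable", 0x9E, 0x44),
}

print([hex(o) for o in hierarchy_mismatches(WITH_BLOCK)])     # ['0xcf']
print([hex(o) for o in hierarchy_mismatches(WITHOUT_BLOCK)])  # []
```

With the lexical block present, the concrete variable <0xcf> fails the check
because its parent <0xbe> has no abstract origin at all.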

With Clang, the lexical block is not there, and everything works as expected.

Thanks!

Simon

[1] https://github.com/simark/binutils-gdb/commit/3d16834e2d886d2dd57f93d27b39a4099ffc98fc.patch


Hi,

I had an email discussion some time ago with some GCC developers (Nathan Sidwell,
Richard Biener, Jason Merrill).  I thought I would post the important parts here,
to act as a reference.

Richard said:

I think the lexical block is just the function scope itself and the inliner
inserts this BLOCK which then corresponds to the DW_TAG_inlined_subroutine.
I suppose we should avoid emitting that BLOCK itself as a DW_TAG_lexical_block
but use the emitted DW_TAG_inlined_subroutine for that.

Not sure if I remember the details correctly.

I don't think the DWARF is invalid btw, with early LTO debug we have plenty of
abstract origins where source and destination context don't match 1:1.  We're
just using it as a "get some more info from this DIE" link which I think is
all that is documented as semantics (though the 'inline' term pops up too
often there and the relation to DW_AT_specification is unclear to me though
the latter is restricted to DW_TAG_subroutine AFAIR).

Jason said (replying to Richard):

I think the lexical block is just the function scope itself and the inliner
inserts this BLOCK which then corresponds to the DW_TAG_inlined_subroutine.
I suppose we should avoid emitting that BLOCK itself as a DW_TAG_lexical_block
but use the emitted DW_TAG_inlined_subroutine for that.

Agreed.  It's curious that we would generate the lexical block in the
inlined instance and not the abstract.

  I don't think the DWARF is invalid btw, with early LTO debug we have plenty of
  abstract origins where source and destination context don't match 1:1.  We're
  just using it as a "get some more info from this DIE" link which I think is
  all that is documented as semantics (though the 'inline' term pops up too
  often there and the relation to DW_AT_specification is unclear to me though
  the latter is restricted to DW_TAG_subroutine AFAIR).

Also agreed, GDB ought to be able to handle this situation.

So, bugs on both sides...

So the conclusion seems to be that the lexical block might be useless, but it
doesn't make the DWARF invalid, and GDB should be able to cope with it.

I think that the DWARF is invalid.

In the abstract instance, the DWARF describes the source as

   int func1 (...)
   {
      int array[];
      ...
   }

The concrete instance describes a different source:

   int func1 (...)
   {
     {
       int array[];
       ...
     }
   }

These are not the same source trees, and the second one does not in fact accurately describe the actual source. I don't recall whether the DWARF Spec requires the concrete and abstract instance descriptions to match each other. If not, I think that there should be such a requirement.

I'd like to see an example (as Jason mentioned) where abstract and concrete instances do not match and where this appears to be correct. If LLVM is using abstract origin to mean "get info from this DIE", and not to mean "this is a concrete instance of this abstract definition", then we (on the DWARF side) need to look at how abstract origin is defined and whether this use is compatible with its intended use.


--
Michael Eager    eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306

