[PATCH] gdb: user variables with components of dynamic type

Thu Nov 12 16:00:18 GMT 2020

Joel,

Thanks for your feedback.

> * Joel Brobecker <brobecker@adacore.com> [2020-11-08 14:50:59 +0400]:
>
> Hi Andrew,
>
> > * Andrew Burgess <andrew.burgess@embecosm.com> [2020-10-22 16:32:38 +0100]:
> >
> > > Consider this Fortran type:
> > >
> > >   type :: some_type
> > >      integer, allocatable :: array_one (:,:)
> > >      integer :: a_field
> > >      integer, allocatable :: array_two (:,:)
> > >   end type some_type
> > >
> > > And a variable declared:
> > >
> > >   type(some_type) :: some_var
> > >
> > > Now within GDB we try this:
> > >
> > >   (gdb) set $a = some_var
> > >   (gdb) p $a
> > >   $1 = ( array_one =
> > >   ../../src/gdb/value.c:3968: internal-error: Unexpected lazy value type.
> > >
> > > Normally, when an internalvar ($a in this case) is created, it is
> > > non-lazy, the value is immediately copied out of the inferior into
> > > GDB's memory.
> > >
> > > When printing the internalvar ($a) GDB will extract each field in
> > > turn, so in this case `array_one`.  As the original internalvar is
> > > non-lazy then the extracted field will also be non-lazy, with its
> > > contents immediately copied from the parent internalvar.
> > >
> > > However, when the field has a dynamic type this is not the case: in
> > > value_primitive_field we see that any field with dynamic type is
> > > always created lazy.  Further, the content of this field will usually
> > > not have been captured in the contents buffer of the original value, a
> > > field with dynamic location is effectively a pointer value contained
> > > within the parent value, with rules in the DWARF for how to
> > > dereference the pointer.
>
> Is it a pointer, or a reference? From what you are seeing and
> what you are reporting here, I assume these components are declared
> as references? Or perhaps, after having written 3 different versions of
> a reply to this email, they are actually *neither*, but rather
> are described as arrays with location expressions?

If we just look at 'some_var%array_one', here's its type information:

 <1><3c>: Abbrev Number: 5 (DW_TAG_structure_type)
    <3d>   DW_AT_name        : (indirect string, offset: 0x0): some_type
    <41>   DW_AT_byte_size   : 184
    <42>   DW_AT_decl_file   : 1
    <43>   DW_AT_decl_line   : 16
    <44>   DW_AT_sibling     : <0x6d>
 <2><48>: Abbrev Number: 6 (DW_TAG_member)
    <49>   DW_AT_name        : (indirect string, offset: 0x5f): array_one
    <4d>   DW_AT_decl_file   : 1
    <4e>   DW_AT_decl_line   : 18
    <4f>   DW_AT_type        : <0x6d>
    <53>   DW_AT_data_member_location: 0
 <2><54>: Abbrev Number: 6 (DW_TAG_member)
    <55>   DW_AT_name        : (indirect string, offset: 0x79): a_field
    <59>   DW_AT_decl_file   : 1
    <5a>   DW_AT_decl_line   : 19
    <5b>   DW_AT_type        : <0xaa>
    <5f>   DW_AT_data_member_location: 88
 <2><60>: Abbrev Number: 6 (DW_TAG_member)
    <61>   DW_AT_name        : (indirect string, offset: 0x4c): array_two
    <65>   DW_AT_decl_file   : 1
    <66>   DW_AT_decl_line   : 20
    <67>   DW_AT_type        : <0xb6>
    <6b>   DW_AT_data_member_location: 96
 <2><6c>: Abbrev Number: 0
 <1><6d>: Abbrev Number: 7 (DW_TAG_array_type)
    <6e>   DW_AT_ordering    : 1        (column major)
    <6f>   DW_AT_data_location: 2 byte block: 97 6      (DW_OP_push_object_address; DW_OP_deref)
    <72>   DW_AT_allocated   : 4 byte block: 97 6 30 2e         (DW_OP_push_object_address; DW_OP_deref; DW_OP_lit0; DW_OP_ne)
    <77>   DW_AT_type        : <0xaa>		[ APB: This is signed 4-byte integer. ]
    <7b>   DW_AT_sibling     : <0xaa>		[ APB: This is signed 4-byte integer. ]
 <2><7f>: Abbrev Number: 8 (DW_TAG_subrange_type)
    <80>   DW_AT_lower_bound : 4 byte block: 97 23 30 6         (DW_OP_push_object_address; DW_OP_plus_uconst: 48; DW_OP_deref)
    <85>   DW_AT_upper_bound : 4 byte block: 97 23 38 6         (DW_OP_push_object_address; DW_OP_plus_uconst: 56; DW_OP_deref)
    <8a>   DW_AT_byte_stride : 9 byte block: 97 23 28 6 97 23 20 6 1e   (DW_OP_push_object_address; DW_OP_plus_uconst: 40; DW_OP_deref; DW_OP_push_object_address; DW_OP_plus_uconst: 32; DW_OP_deref; DW_OP_mul)
 <2><94>: Abbrev Number: 8 (DW_TAG_subrange_type)
    <95>   DW_AT_lower_bound : 4 byte block: 97 23 48 6         (DW_OP_push_object_address; DW_OP_plus_uconst: 72; DW_OP_deref)
    <9a>   DW_AT_upper_bound : 4 byte block: 97 23 50 6         (DW_OP_push_object_address; DW_OP_plus_uconst: 80; DW_OP_deref)
    <9f>   DW_AT_byte_stride : 9 byte block: 97 23 40 6 97 23 20 6 1e   (DW_OP_push_object_address; DW_OP_plus_uconst: 64; DW_OP_deref; DW_OP_push_object_address; DW_OP_plus_uconst: 32; DW_OP_deref; DW_OP_mul)
 <2><a9>: Abbrev Number: 0

So your third choice was the winner, the array has dynamic type and
includes a computed data location.
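
To make those expressions a little more concrete, here is a rough
sketch of what they compute.  This is not GDB's DWARF evaluator; the
name 'descriptor_view' is made up for illustration, and I'm assuming a
64-bit target whose byte order matches the host.  The offsets (0, 32,
40, 48, 56) are the ones from the dump above for the first dimension
of array_one.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical read-only view of the bytes of 'array_one' as stored inside
// 'some_var'.  BASE plays the role of DW_OP_push_object_address.
struct descriptor_view
{
  const unsigned char *base;

  // DW_OP_plus_uconst OFFSET; DW_OP_deref, assuming 8-byte loads.
  std::uint64_t deref_at (std::size_t offset) const
  {
    std::uint64_t v;
    std::memcpy (&v, base + offset, sizeof v);
    return v;
  }

  // DW_AT_data_location: where the array elements really live.
  std::uint64_t data_location () const { return deref_at (0); }

  // DW_AT_allocated: the data pointer compared against zero.
  bool allocated () const { return deref_at (0) != 0; }

  // DW_AT_lower_bound / DW_AT_upper_bound of the first subrange.
  std::int64_t lower_bound () const
  { return static_cast<std::int64_t> (deref_at (48)); }
  std::int64_t upper_bound () const
  { return static_cast<std::int64_t> (deref_at (56)); }

  // DW_AT_byte_stride of the first subrange: two loads multiplied.
  std::uint64_t byte_stride () const
  { return deref_at (40) * deref_at (32); }
};
```

The point is that every attribute is computed from bytes that do live
inside 'some_var', but data_location () yields an address that points
somewhere else entirely.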

>
> > > So, we end up with a lazy lval_internalvar_component representing a
> > > field within an lval_internalvar.  This eventually ends up in
> > > value_fetch_lazy, which currently does not support
> > > lval_internalvar_component, and we see the error above.
> > >
> > > My original plan for how to handle this involved extending
> > > value_fetch_lazy to handle lval_internalvar_component.  However, when
> > > I did this I ran into another error:
> > >
> > >   (gdb) set $a = some_var
> > >   (gdb) p $a
> > >   $1 = ( array_one = ((1, 1) (1, 1) (1, 1)), a_field = 5, array_two = ((0, 0, 0) (0, 0, 0)) )
> > >   (gdb) p $a%array_one
> > >   $2 = ((1, 1) (1, 1) (1, 1))
> > >   (gdb) p $a%array_one(1,1)
> > >   ../../src/gdb/value.c:1547: internal-error: void set_value_address(value*, CORE_ADDR): Assertion `value->lval == lval_memory' failed.
>
> I am not surprised. Intuitively, like you said, we expect GDB
> to "capture" the value of our variable, so we shouldn't have anything
> lazy about it, or else this would indicate an incomplete capture.

Agreed.

>
> > > In an ideal world (I think) GDB would be
> > > able to do this even for values with dynamic type.  So in our above
> > > example doing `set $a = some_var` would capture the content of
> > > 'some_var', but also the content of 'array_one', and also
> > > 'array_two', even though these content regions are not contained
> > > within the region of 'some_var'.
>
> This would be my understanding as well, provided the arrays are
> *references*. For pointers, I think it's fine to continue with
> the idea that we capture the target address, but not the target
> memory region it points to.

Again, I think we agree.

The problem in terms of implementation is that really everything is
either a real inline value, or a pointer.  All the other words are
just language sugar on top of these two choices.

In C then things are dead simple, something is either a pointer or is
the actual contents of the value, but the language exposes all this to
the programmer, so there's little room for surprise.

When we look at C++ references (basically pointers + automatic
dereferencing), or Fortran allocatable variables (same again), things
are less clear: we capture the underlying pointer, but can (especially
for Fortran) display the value with automatic dereferencing.

You specifically asked about references, I'm taking this to mean C++
references.  Consider this test program:

  #include <cstdio>

  struct xxx
  {
    int &val;
  };

  void
  func (xxx x)
  {
    printf ("Got: %d\n", x.val);
    x.val = 0;
  }

  int
  main ()
  {
    int i = 3;
    xxx x = { i };
    func (x);		/* Break 1.  */
    printf ("Returning: %d\n", i);	/* Break 2.  */
    return i;
  }

Now our GDB session:

  Breakpoint 1, main () at ref.cc:20
  20	  func (x);
  (gdb) p x
  $1 = {
    val = @0x7fffffffb55c
  }
  (gdb) p x.val
  $2 = (int &) @0x7fffffffb55c: 3
  (gdb) set $foo = x
  (gdb) p $foo
  $3 = {
    val = @0x7fffffffb55c
  }
  (gdb) p $foo.val
  $4 = (int &) @0x7fffffffb55c: 3
  (gdb) next
  Got: 3
  21	  printf ("Returning: %d\n", i);
  (gdb) p $foo.val
  $5 = (int &) @0x7fffffffb55c: 0
  (gdb) p x.val
  $6 = (int &) @0x7fffffffb55c: 0
  (gdb)

So we get the behaviour we might expect: the pointer value underlying
the reference is preserved, but the value pointed to is not.
Interestingly the choice was made to not automatically dereference the
C++ references, so they are displayed in a semi-pointer fashion, the
type prefix and the '@' symbol being what tells them apart from
regular pointers.
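
The same effect can be reproduced without GDB at all.  The sketch
below is only an analogy (a C++ copy of the struct rather than GDB
copying bytes out of the inferior), and the helper name is made up,
but it shows the copy following the original after a write:

```cpp
struct xxx
{
  int &val;
};

// Copy X, loosely like 'set $foo = x', then write through the original.
// Returns the value the copy sees afterwards.
int
value_seen_by_copy_after_write (int initial, int updated)
{
  int i = initial;
  xxx x = { i };
  xxx copy = x;       // Copies the reference binding, not a snapshot of i.
  x.val = updated;    // Same effect as the 'x.val = 0' in func above.
  return copy.val;    // Reads i through the copied binding.
}
```

Calling value_seen_by_copy_after_write (3, 0) returns 0, matching the
$5 result in the session above.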

>
> > > Supporting this would require GDB values to be able to carry around
> > > multiple non-contiguous regions of memory as content in some way,
> > > which sounds like a pretty huge change to a core part of GDB.
> > >
> > > So, I wondered if there was some other solution that wouldn't require
> > > such a huge change.
> > >
> > > What if values with a dynamic location were thought of like pointers with
> > > automatic dereferencing?  Given this C structure:
> > >
> > >   struct foo_t {
> > >     int *val;
> > >   };
> > >
> > >   struct foo_t my_foo;
> > >
> > > Then in GDB:
> > >
> > >   (gdb) set $a = my_foo
> > >
> > > We would expect GDB to capture the pointer value in '$a', but not the
> > > value pointed at by the pointer.  So maybe it's not that unreasonable
> > > to think that given a dynamically typed field GDB will capture the
> > > address of the content, but not the actual content itself.
> > >
> > > That's what this patch does.
>
> I admit I don't really understand quite how this is all happening,
> and how you're trying to deal with the issue.

I'm not sure which bit you don't understand, as in the next paragraph
you give an accurate description of what I'm proposing...

> It's possible that the compromise you suggest (treat dynamic components
> the same as pointers) might be the most reasonable way out, but I think
> it'll invite confusion on the users' side, and probably bug reports.
> At the very least, I think we should warn users when we do this, so
> as to be sure to set expectations right, on the spot.

Adding a warning would be reasonably simple, we can start with (in
value.c:set_internalvar):

  if (is_dynamic_type (value_type (new_data.value)))
    warning ("some warning text here...");

There are two problems.  The first is easy enough to solve: if the top
level value being captured is dynamic, then we do capture the _actual_
value; it's only when a sub-component is dynamic that we have
problems.  The above check also triggers when only the top-level value
is dynamic, so it warns in too many places.

As a concrete example, given this Fortran type:

  type :: some_type
     integer, allocatable :: array_one (:,:)
     integer :: a_field
     integer, allocatable :: array_two (:,:)
  end type some_type

  type(some_type) :: some_var

Then in GDB:

  (gdb) set $foo = some_var

We capture the contents of the some_type struct, including the
pointers to the dynamic objects array_one and array_two.  But if
instead we do:

  (gdb) set $bar = some_var%array_one

Now we capture the full contents of array_one, there's no further
dynamic type resolution required.  Changing 'some_var%array_one' will
not change the value of $bar, but the change would be seen in $foo.
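
One possible shape for a narrower check is sketched below.  This is a
toy model, not GDB code: 'toy_type' and 'has_dynamic_component' are
hypothetical names, and a real version would have to walk GDB's own
type representation instead.  The idea is simply to ignore the
top-level type's own dynamic-ness and only look at its components.

```cpp
#include <vector>

// Hypothetical toy model of a type; this is not GDB's struct type.
struct toy_type
{
  bool dynamic = false;            // This type itself needs resolving.
  std::vector<toy_type> fields;    // Sub-components, if any.
};

// Return true if any field of TYPE, at any nesting depth, is dynamic.
// TYPE's own flag is deliberately ignored: a dynamic top-level value is
// fully captured, so it should not trigger the warning.
bool
has_dynamic_component (const toy_type &type)
{
  for (const toy_type &field : type.fields)
    if (field.dynamic || has_dynamic_component (field))
      return true;
  return false;
}
```

With this model the '$foo' case (a struct with allocatable fields)
would warn, while the '$bar' case (the array on its own) would not.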

The harder problem is, what warning do we print?  I initially went
with:

  components of dynamically typed values are not currently captured within internal variables

Despite being a bit long, it's not immediately clear that a user will
know what 'dynamically typed values' means.  Maybe we end up needing a
language specific warning, so for Fortran:

  the values of allocatable fields are not currently captured within internal variables

thoughts or suggestions are welcome...

>
> Have you looked at how we handle components which are references?
> I wonder how well we handle those...

As above we treat them as pointers, but guard against possible
confusion by displaying them as pointers.

I would not like to change Fortran from displaying dynamic types as
their actual value (and instead just display a pointer) as that seems
like a really bad change just to work around a limitation with
internal variables.

What I think is super interesting is how this all interacts with
pretty-printers.  So, if I start with this test program:

  #include <vector>

  struct xxx
  {
    std::vector<int> lst;
  };

  static void
  update (xxx &x)
  {
    x.lst.clear ();
    x.lst.push_back (4);
    x.lst.push_back (5);
    x.lst.push_back (6);
  }

  int
  main ()
  {
    xxx x = { { 1, 2, 3 } };
    update (x);
    return 0;
  }

Then this is my GDB session (making use of C++ pretty-printers):

  Temporary breakpoint 1, main () at lst.cc:20
  20	  xxx x = { { 1, 2, 3 } };
  (gdb) n
  21	  update (x);
  (gdb) set $foo=x
  (gdb) p $foo
  $1 = {
    lst = std::vector of length 3, capacity 3 = {1, 2, 3}
  }
  (gdb) n
  22	  return 0;
  (gdb) p $foo
  $2 = {
    lst = std::vector of length 3, capacity 3 = {4, 5, 6}
  }

Notice that the contents of the std::vector changed in the variable
$foo.  This I think is the closest to the Fortran case.  For Fortran
GDB itself is providing the pretty-printing (it prints the dynamic
value rather than just displaying a pointer), and like with the
std::vector case above, the actual value is not captured, but just
printed.
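
Stripped of the library details, the situation looks roughly like the
sketch below.  'toy_vec' is a made-up stand-in for a container that
only holds a pointer and a length, 'capture_bytes' mimics copying the
struct's bytes into an internal variable, and 'sum_as_printed' stands
in for a pretty-printer that follows the pointer at print time.  None
of these names exist in GDB or libstdc++.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a container: only a pointer and a length
// live inside the struct, the elements live elsewhere.
struct toy_vec
{
  int *data;
  std::size_t len;
};

// Byte-for-byte capture of the struct itself.  The elements are not
// copied, only the pointer to them.
toy_vec
capture_bytes (const toy_vec &v)
{
  toy_vec copy;
  std::memcpy (&copy, &v, sizeof copy);
  return copy;
}

// What a pretty-printer effectively does: dereference at print time.
int
sum_as_printed (const toy_vec &v)
{
  int total = 0;
  for (std::size_t i = 0; i < v.len; ++i)
    total += v.data[i];
  return total;
}
```

If the elements are overwritten in place after the capture, printing
the captured copy shows the new elements, just as '$foo' did above.
Had the buffer been reallocated instead, the captured pointer would be
stale.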

I wonder if this problem should just be solved (at least in the
short/medium term) by improving the documentation for internal
variables?

Thanks,
Andrew