Bug 19893

Summary: Synthetic pointers created from C++ references are broken
Product: gdb Reporter: Martin Galvan <martingalvan>
Component: c++Assignee: Martin Galvan <martingalvan>
Status: RESOLVED FIXED    
Severity: normal CC: danielgutson, jan, luis.machado, martingalvan, pedro, tromey
Priority: P2    
Version: HEAD   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Example program that triggers the bug (compile with gcc -O3 and try to print 'arg')

Description Martin Galvan 2016-03-31 13:37:38 UTC
Created attachment 9147 [details]
Example program that triggers the bug (compile with gcc -O3 and try to print 'arg')

Hi everyone,

After compiling a program which uses C++ references with gcc 4.8.4 or later, some optimizations may convert the references into synthetic pointers. This seems to cause all sorts of breakage when trying to use them in gdb expressions. 

Suppose we have something like:

int var = 42;
int& ref = var;

Case 1: Trying to print the address of a ref causes gdb to crash:

(gdb) print &ref
/build/buildd/gdb-7.7.1/gdb/dwarf2loc.c:1624: internal-error: Should not be able to create a lazy value with an enclosing type
A problem internal to GDB has been detected,
further debugging may prove unreliable.

The expected result here would be the address of 'var'.

Case 2: Trying to print the value of a ref simply shows the <synthetic pointer> tag:

(gdb) print ref
$1 = (int &) <synthetic pointer>

The expected result would be something like:

$1 = (int &) <synthetic pointer>: 42

Case 3: Trying to do arithmetic on the value of a ref shows the following error:

(gdb) print ref + 0
Cannot access memory at address 0x0

The expected result would be to print '42' (the value of 'var').

Case 4: Trying to assign some value to the ref shows the same error:

(gdb) set (ref = 1)
Cannot access memory at address 0x0

There are probably other cases like these lying around.

I'm currently working on fixing this issue. I had originally sent a patch for case 1 (https://sourceware.org/ml/gdb-patches/2015-06/msg00278.html), but after discussing it with Pedro I decided to dwell a bit further and see the what's the real source of all the breakage.

I think the fundamental problem here is that gdb seems to always assume that references are implemented as pointers by the compiler, and thus the 'location' attribute of a value representing a ref always points to the underlying pointer. This is a mistake, since the C++ standard doesn't specify this. When references are effectively treated like aliases (which isn't just an optimization thing but a perfectly normal thing to do for any compiler), breakage arises.

From what I saw I think there may be two ways to fix this:

1) Looking for all the cases where synthetic references are a problem, and fixing them one by one. I think case 1 could be fixed by conditionally calling something like coerce_ref inside value_addr, though coerce_ref doesn't seem to work as expected (as evidenced by case 4). I haven't dwelled in the code for all the cases, but hopefully gdb uses something like coerce_ref for most of them, so hopefully the fix shouldn't touch a lot of code.
2) Acknowledging that references aren't pointers, and refactor all the code so they're treated as a special entity of some sort. IMHO this would be the right thing to do, but it would probably take way longer.

In any case, I don't think gcc should be touched. It's up to the compiler to decide how to treat references (and whether to mark them with DW_OP_GNU_implicit_pointer when optimized away), but the debugger should be able to handle them transparently in all cases.

What do you guys think? I'm probably missing something since I've come back to work on this only recently, so any feedback is more than appreciated.
Comment 1 Martin Galvan 2016-03-31 21:35:50 UTC
I've been looking a bit more into this and I think the crash from case 1 comes from a different issue. It seems that value_addr should also set arg2's enclosing_type to arg1's (as a pointer instead of a ref). I'm working on that patch (and a corresponding unit test) now; I think this can go in now before I keep working on fixing the other cases.
Comment 2 Martin Galvan 2016-04-05 20:24:17 UTC
I've sent a patch for fixing the crash when doing 'print &ref', here:

https://sourceware.org/ml/gdb-patches/2016-04/msg00012.html

As for the other cases, I've traced the root cause to pieced_value_funcs.coerce_ref being NULL after commit a471c5941e127823b95893176c7c9301c4b4cb32. I'm thinking of implementing this function for synthetic references only, so that calls to coerce_ref_if_computed will stop returning NULL. I'm open to suggestions or feedback of any kind though.
Comment 3 Martin Galvan 2016-04-08 18:41:14 UTC
Hi everyone,

I've fixed cases 2 and 3 by implementing pieced_value_funcs.coerce_ref as a wrapper for indirect_pieced_value that works only for synthetic references. I changed the check in indirect_pieced_value so that it works with both TYPE_CODE_PTR and TYPE_CODE_REF, since gdb always assumes they're both pointers.

While that seems to work, there's an issue: right now pieced_value_funcs.coerce_ref takes a const value which represents the reference, while indirect_pieced_value's argument isn't const. Dwelling deeper I saw that this is correct: indirect_pieced_value needs to fetch the value's contents (i.e. the pointer's value), which ends up calling value_fetch_lazy.
However, this breaks my approach unless I de-const the value all the way up from pieced_value_funcs.coerce_ref (which is ugly, especially when done on printing functions).

How could I go about doing this differently? Is there something I'm missing here?
Comment 4 Martin Galvan 2016-04-19 21:13:20 UTC
Some progress: I've written a couple of unit tests that use a synthetic reference to a physical variable. This allows to test assignment through the synthetic variable, as well as using the default operator '&'. The first test works with a regular int, while the other uses an array. I'll be working on testing synthetic references to structs, unions and function pointers in the following days.

However, this all works atop pieced_value_funcs.coerce_ref, which has the const-ness issue I mentioned in my previous comment. Any feedback on how to tackle that would be great.
Comment 5 Pedro Alves 2016-04-27 14:11:13 UTC
It's usually easier to give feedback if you first send out your own idea -- we could then build from there.  Off hand, I don't have a definite suggestion, other than saying that something has to give.  Some questions to try to maybe lead somewhere:

- does indirect_pieced_value param need to be non-const? 

- can coerce_ref's param be non-const?  Do you need the whole of 
  indirect_pieced_value

- do you need to whole of indirect_pieced_value?  Could you perhaps factor out
  a subset of indirect_pieced_value that would also work with const values, and 
  use it in both places?
Comment 6 Martin Galvan 2016-04-27 22:18:29 UTC
(In reply to Pedro Alves from comment #5)
> It's usually easier to give feedback if you first send out your own idea --
> we could then build from there. 

If you mean posting what I've done so far, sure. I'll attach a diff with my changes, without the unit tests for the sake of brevity.

> - does indirect_pieced_value param need to be non-const? 

As it's written right now, yes. indirect_pieced_value calls value_contents, which in turn calls value_contents_writeable, which in our case ends up calling value_fetch_lazy.

> - can coerce_ref's param be non-const?

That's precisely my question :) From what I saw, lval_funcs.coerce_ref is called only by coerce_ref_if_computed, whose param is const. That function is called by e.g. generic_val_print_ref, which comes from printing code. IMHO it would be quite ugly to call a printing function and having side effects that somehow alter whatever you want to print.

> - do you need to whole of indirect_pieced_value?  Could you perhaps factor
> out
>   a subset of indirect_pieced_value that would also work with const values,
> and 
>   use it in both places?

Good question. The function that breaks const-ness (value_contents) is called to get a byte offset with which to call dwarf2_evaluate_loc_desc_full. If I understood correctly, at this point we have a dwarf2_locexpr_baton with data on the DIE our synthetic pointer's DW_AT_location points to. I didn't look much into how dwarf2_evaluate_loc_desc_full works, but I'm guessing that byte offset refers to the offset our pointed-to value is within some struct.

All this means we'll probably gonna have to call value_contents (and thus break const-ness) for our use case.

In any case, yesterday I realized I haven't yet tested what happens with synthetic pointers to classes that have RTTI. I'm gonna add a unit test for those later.
Comment 7 Pedro Alves 2016-04-27 22:30:57 UTC
> If you mean posting what I've done so far, sure. I'll attach a diff with my
> changes, without the unit tests for the sake of brevity.

I meant some sort of analysis, and any idea you might have of potential fix, as silly as you could have thought such a fix might have sounded.

Seeing the patch couldn't hurt, maybe someone has an idea.  I'd say push it to a users/ branch, it's trivial to do (faster than asking if people want to
see it ;-)), much easier to look at then an attach here, and doesn't have to be stripped down that way.

> All this means we'll probably gonna have to call value_contents (and thus
> break const-ness) for our use case.

Perhaps we should be making a copy of the value before calling indirect_pieced_value ?  How do other coerce_ref implementations handle this?
Comment 8 Martin Galvan 2016-04-28 13:30:23 UTC
(In reply to Pedro Alves from comment #7)
> I meant some sort of analysis, and any idea you might have of potential fix,
> as silly as you could have thought such a fix might have sounded.

I thought my previous comments were sort of that, actually :)

> Seeing the patch couldn't hurt, maybe someone has an idea.  I'd say push it
> to a users/ branch, it's trivial to do (faster than asking if people want to
> see it ;-)), much easier to look at then an attach here, and doesn't have to
> be stripped down that way.

Oh, I see. Should I push the new branch to origin, or keep it on a separate (e.g. GitHub) repo?

> Perhaps we should be making a copy of the value before calling
> indirect_pieced_value ?  How do other coerce_ref implementations handle this?

Incidentally, value_copy also takes a non-const argument because it calls value_contents_all_raw on the value it's passed.

AFAIK the only struct lval_funcs that actually implements coerce_ref is entry_data_value_funcs. It uses the computed location's "closure", which for entry_data is itself a value. For pieced_value, the closure is a struct piece_closure, which for synthetic pointers has only one piece. That 'piece' in turn has the DIE/offset info required to fetch the referenced value from memory. I'm using indirect_pieced_value precisely because it knows how to interpret that info and fetch the value.
Comment 9 Pedro Alves 2016-04-28 13:51:59 UTC
(In reply to Martin Galvan from comment #8)
> Oh, I see. Should I push the new branch to origin, or keep it on a separate
> (e.g. GitHub) repo?

If by your remote named "origin" is sourceware.org, then yes, that's fine.  Just be sure to name the branch "users/$yourname/$whatever".

> > Perhaps we should be making a copy of the value before calling
> > indirect_pieced_value ?  How do other coerce_ref implementations handle this?
> 
> Incidentally, value_copy also takes a non-const argument because it calls
> value_contents_all_raw on the value it's passed.

  if (!value_lazy (val))
    {
      memcpy (value_contents_all_raw (val), value_contents_all_raw (arg),
	      TYPE_LENGTH (value_enclosing_type (arg)));

    }

The only thing value_contents_all_raw does is make sure there's a value
contents buffer allocated.  But since we know "val" isn't lazy,
we know we already have a contents buffer.  So that could well be changed to read:

  if (!value_lazy (val))
    {
      gdb_assert (arg->contents != NULL);
      memcpy (value_contents_all_raw (val), arg->contents,
	      TYPE_LENGTH (value_enclosing_type (arg)));

    }

> 
> AFAIK the only struct lval_funcs that actually implements coerce_ref is
> entry_data_value_funcs. It uses the computed location's "closure", which for
> entry_data is itself a value. For pieced_value, the closure is a struct
> piece_closure, which for synthetic pointers has only one piece. That 'piece'
> in turn has the DIE/offset info required to fetch the referenced value from
> memory.  I'm using indirect_pieced_value precisely because it knows how to
> interpret that info and fetch the value.

So can't we split indirect_pieced_value in two, around the extract_signed_integer / value_contents call ?

That is, make the second half of indirect_pieced_value be a helper function that takes a byte_offset and a piece (etc. as needed) as parameters, and then have indirect_pieced_value use value_contents to get at the offset, and have coerce_ref use value_content_for_printing_const to get at the offset.  Since synthetic pointers only have one piece, it should be simple to get at it without the complicated loop at the top of indirect_pieced_value, I'd assume.
Comment 10 Martin Galvan 2016-04-28 14:15:30 UTC
(In reply to Pedro Alves from comment #9)
> The only thing value_contents_all_raw does is make sure there's a value
> contents buffer allocated.  But since we know "val" isn't lazy,
> we know we already have a contents buffer.  So that could well be changed to
> read:
> 
>   if (!value_lazy (val))
>     {
>       gdb_assert (arg->contents != NULL);
>       memcpy (value_contents_all_raw (val), arg->contents,
> 	      TYPE_LENGTH (value_enclosing_type (arg)));
> 
>     }

Understood.

> That is, make the second half of indirect_pieced_value be a helper function
> that takes a byte_offset and a piece (etc. as needed) as parameters, and
> then have indirect_pieced_value use value_contents to get at the offset, and
> have coerce_ref use value_content_for_printing_const to get at the offset. 

I see! Yeah, that sounds good. We're always assuming the value isn't lazy anymore though; perhaps an assert at the top of our coerce_ref would help.

In any case, I'm a bit curious about why are we even calling value_contents to get the offset. For synthetic pointers that offset comes from the closure's only piece, yet indirect_pieced_value adds that to the result of value_contents. I'll try out what you suggested first, then I'll see if I can drop the contents fetching altogether for our coerce_ref.
Comment 11 Sourceware Commits 2016-05-31 18:58:19 UTC
The master branch has been updated by Martin Galvan <martingalvan@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3326303bf5ae4c92f2fbbff387ce231a16c1c8bf

commit 3326303bf5ae4c92f2fbbff387ce231a16c1c8bf
Author: Martin Galvan <martin.galvan@tallertechnologies.com>
Date:   Tue May 31 15:54:01 2016 -0300

    [PR gdb/19893] Fix handling of synthetic C++ references
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=19893
    
    I've traced the main source of the problem to pieced_value_funcs.coerce_ref not being
    implemented. Since gdb always assumes references are implemented as pointers, this
    causes it to think that it's dealing with a NULL pointer, thus breaking any operations
    involving synthetic references.
    
    What I did here was implementing pieced_value_funcs.coerce_ref using some of the synthetic
    pointer handling code from indirect_pieced_value, as Pedro suggested. I also made a few
    adjustments to the reference printing code so that it correctly shows either the address
    of the referenced value or (if it's non-addressable) the "<synthetic pointer>" string.
    
    I also wrote some unit tests based on Dwarf::assemble; these took a while to make
    because in most cases I needed a synthetic reference to a physical variable. Additionally,
    I started working on a unit test for classes that have a vtable, but ran into a few issues
    so that'll probably go in a future patch. One thing that should definitely be fixed is that
    proc function_range (called for MACRO_AT_func) will always try to compile/link using gcc
    with the default options instead of g++, thus breaking C++ compilations that require e.g. libstdc++.
    
    gdb/ChangeLog:
    
    	* dwarf2loc.c (coerce_pieced_ref, indirect_synthetic_pointer,
    	fetch_const_value_from_synthetic_pointer): New functions.
    	(indirect_pieced_value): Move lower half to indirect_synthetic_pointer.
    	(pieced_value_funcs): Implement coerce_ref.
    	* valops.c (value_addr): Call coerce_ref for synthetic references.
    	* valprint.c (valprint_check_validity): Return true for synthetic
    	references.  Also, don't show "<synthetic pointer>" if they reference
    	addressable values.
    	(generic_val_print_ref): Handle synthetic references.  Also move some
    	code to print_ref_address.
    	(print_ref_address, get_value_addr_contents): New functions.
    
    gdb/testsuite/ChangeLog:
    
    	* gdb.dwarf2/implref.exp: Rename to...
    	* gdb.dwarf2/implref-const.exp: ...this.  Also add more test statements.
    	* gdb.dwarf2/implref-array.c: New file.
    	* gdb.dwarf2/implref-array.exp: Likewise.
    	* gdb.dwarf2/implref-global.c: Likewise.
    	* gdb.dwarf2/implref-global.exp: Likewise.
    	* gdb.dwarf2/implref-struct.c: Likewise.
    	* gdb.dwarf2/implref-struct.exp: Likewise.
Comment 12 Martin Galvan 2016-05-31 19:03:24 UTC
Fixed in commit 3326303bf5ae4c92f2fbbff387ce231a16c1c8bf.