Summary: | Synthetic pointers created from C++ references are broken | ||
---|---|---|---|
Product: | gdb | Reporter: | Martin Galvan <martingalvan> |
Component: | c++ | Assignee: | Martin Galvan <martingalvan> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | danielgutson, jan, luis.machado, martingalvan, pedro, tromey |
Priority: | P2 | ||
Version: | HEAD | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: | Example program that triggers the bug (compile with gcc -O3 and try to print 'arg') |
Description
Martin Galvan
2016-03-31 13:37:38 UTC
I've been looking a bit more into this and I think the crash from case 1 comes from a different issue. It seems that value_addr should also set arg2's enclosing_type to arg1's (as a pointer instead of a ref). I'm working on that patch (and a corresponding unit test) now; I think this can go in now before I keep working on fixing the other cases. I've sent a patch for fixing the crash when doing 'print &ref', here: https://sourceware.org/ml/gdb-patches/2016-04/msg00012.html As for the other cases, I've traced the root cause to pieced_value_funcs.coerce_ref being NULL after commit a471c5941e127823b95893176c7c9301c4b4cb32. I'm thinking of implementing this function for synthetic references only, so that calls to coerce_ref_if_computed will stop returning NULL. I'm open to suggestions or feedback of any kind though. Hi everyone, I've fixed cases 2 and 3 by implementing pieced_value_funcs.coerce_ref as a wrapper for indirect_pieced_value that works only for synthetic references. I changed the check in indirect_pieced_value so that it works with both TYPE_CODE_PTR and TYPE_CODE_REF, since gdb always assumes they're both pointers. While that seems to work, there's an issue: right now pieced_value_funcs.coerce_ref takes a const value which represents the reference, while indirect_pieced_value's argument isn't const. Dwelling deeper I saw that this is correct: indirect_pieced_value needs to fetch the value's contents (i.e. the pointer's value), which ends up calling value_fetch_lazy. However, this breaks my approach unless I de-const the value all the way up from pieced_value_funcs.coerce_ref (which is ugly, especially when done on printing functions). How could I go about doing this differently? Is there something I'm missing here? Some progress: I've written a couple of unit tests that use a synthetic reference to a physical variable. This allows to test assignment through the synthetic variable, as well as using the default operator '&'. The first test works with a regular int, while the other uses an array. I'll be working on testing synthetic references to structs, unions and function pointers in the following days. However, this all works atop pieced_value_funcs.coerce_ref, which has the const-ness issue I mentioned in my previous comment. Any feedback on how to tackle that would be great. It's usually easier to give feedback if you first send out your own idea -- we could then build from there. Off hand, I don't have a definite suggestion, other than saying that something has to give. Some questions to try to maybe lead somewhere: - does indirect_pieced_value param need to be non-const? - can coerce_ref's param be non-const? Do you need the whole of indirect_pieced_value - do you need to whole of indirect_pieced_value? Could you perhaps factor out a subset of indirect_pieced_value that would also work with const values, and use it in both places? (In reply to Pedro Alves from comment #5) > It's usually easier to give feedback if you first send out your own idea -- > we could then build from there. If you mean posting what I've done so far, sure. I'll attach a diff with my changes, without the unit tests for the sake of brevity. > - does indirect_pieced_value param need to be non-const? As it's written right now, yes. indirect_pieced_value calls value_contents, which in turn calls value_contents_writeable, which in our case ends up calling value_fetch_lazy. > - can coerce_ref's param be non-const? That's precisely my question :) From what I saw, lval_funcs.coerce_ref is called only by coerce_ref_if_computed, whose param is const. That function is called by e.g. generic_val_print_ref, which comes from printing code. IMHO it would be quite ugly to call a printing function and having side effects that somehow alter whatever you want to print. > - do you need to whole of indirect_pieced_value? Could you perhaps factor > out > a subset of indirect_pieced_value that would also work with const values, > and > use it in both places? Good question. The function that breaks const-ness (value_contents) is called to get a byte offset with which to call dwarf2_evaluate_loc_desc_full. If I understood correctly, at this point we have a dwarf2_locexpr_baton with data on the DIE our synthetic pointer's DW_AT_location points to. I didn't look much into how dwarf2_evaluate_loc_desc_full works, but I'm guessing that byte offset refers to the offset our pointed-to value is within some struct. All this means we'll probably gonna have to call value_contents (and thus break const-ness) for our use case. In any case, yesterday I realized I haven't yet tested what happens with synthetic pointers to classes that have RTTI. I'm gonna add a unit test for those later. > If you mean posting what I've done so far, sure. I'll attach a diff with my > changes, without the unit tests for the sake of brevity. I meant some sort of analysis, and any idea you might have of potential fix, as silly as you could have thought such a fix might have sounded. Seeing the patch couldn't hurt, maybe someone has an idea. I'd say push it to a users/ branch, it's trivial to do (faster than asking if people want to see it ;-)), much easier to look at then an attach here, and doesn't have to be stripped down that way. > All this means we'll probably gonna have to call value_contents (and thus > break const-ness) for our use case. Perhaps we should be making a copy of the value before calling indirect_pieced_value ? How do other coerce_ref implementations handle this? (In reply to Pedro Alves from comment #7) > I meant some sort of analysis, and any idea you might have of potential fix, > as silly as you could have thought such a fix might have sounded. I thought my previous comments were sort of that, actually :) > Seeing the patch couldn't hurt, maybe someone has an idea. I'd say push it > to a users/ branch, it's trivial to do (faster than asking if people want to > see it ;-)), much easier to look at then an attach here, and doesn't have to > be stripped down that way. Oh, I see. Should I push the new branch to origin, or keep it on a separate (e.g. GitHub) repo? > Perhaps we should be making a copy of the value before calling > indirect_pieced_value ? How do other coerce_ref implementations handle this? Incidentally, value_copy also takes a non-const argument because it calls value_contents_all_raw on the value it's passed. AFAIK the only struct lval_funcs that actually implements coerce_ref is entry_data_value_funcs. It uses the computed location's "closure", which for entry_data is itself a value. For pieced_value, the closure is a struct piece_closure, which for synthetic pointers has only one piece. That 'piece' in turn has the DIE/offset info required to fetch the referenced value from memory. I'm using indirect_pieced_value precisely because it knows how to interpret that info and fetch the value. (In reply to Martin Galvan from comment #8) > Oh, I see. Should I push the new branch to origin, or keep it on a separate > (e.g. GitHub) repo? If by your remote named "origin" is sourceware.org, then yes, that's fine. Just be sure to name the branch "users/$yourname/$whatever". > > Perhaps we should be making a copy of the value before calling > > indirect_pieced_value ? How do other coerce_ref implementations handle this? > > Incidentally, value_copy also takes a non-const argument because it calls > value_contents_all_raw on the value it's passed. if (!value_lazy (val)) { memcpy (value_contents_all_raw (val), value_contents_all_raw (arg), TYPE_LENGTH (value_enclosing_type (arg))); } The only thing value_contents_all_raw does is make sure there's a value contents buffer allocated. But since we know "val" isn't lazy, we know we already have a contents buffer. So that could well be changed to read: if (!value_lazy (val)) { gdb_assert (arg->contents != NULL); memcpy (value_contents_all_raw (val), arg->contents, TYPE_LENGTH (value_enclosing_type (arg))); } > > AFAIK the only struct lval_funcs that actually implements coerce_ref is > entry_data_value_funcs. It uses the computed location's "closure", which for > entry_data is itself a value. For pieced_value, the closure is a struct > piece_closure, which for synthetic pointers has only one piece. That 'piece' > in turn has the DIE/offset info required to fetch the referenced value from > memory. I'm using indirect_pieced_value precisely because it knows how to > interpret that info and fetch the value. So can't we split indirect_pieced_value in two, around the extract_signed_integer / value_contents call ? That is, make the second half of indirect_pieced_value be a helper function that takes a byte_offset and a piece (etc. as needed) as parameters, and then have indirect_pieced_value use value_contents to get at the offset, and have coerce_ref use value_content_for_printing_const to get at the offset. Since synthetic pointers only have one piece, it should be simple to get at it without the complicated loop at the top of indirect_pieced_value, I'd assume. (In reply to Pedro Alves from comment #9) > The only thing value_contents_all_raw does is make sure there's a value > contents buffer allocated. But since we know "val" isn't lazy, > we know we already have a contents buffer. So that could well be changed to > read: > > if (!value_lazy (val)) > { > gdb_assert (arg->contents != NULL); > memcpy (value_contents_all_raw (val), arg->contents, > TYPE_LENGTH (value_enclosing_type (arg))); > > } Understood. > That is, make the second half of indirect_pieced_value be a helper function > that takes a byte_offset and a piece (etc. as needed) as parameters, and > then have indirect_pieced_value use value_contents to get at the offset, and > have coerce_ref use value_content_for_printing_const to get at the offset. I see! Yeah, that sounds good. We're always assuming the value isn't lazy anymore though; perhaps an assert at the top of our coerce_ref would help. In any case, I'm a bit curious about why are we even calling value_contents to get the offset. For synthetic pointers that offset comes from the closure's only piece, yet indirect_pieced_value adds that to the result of value_contents. I'll try out what you suggested first, then I'll see if I can drop the contents fetching altogether for our coerce_ref. The master branch has been updated by Martin Galvan <martingalvan@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3326303bf5ae4c92f2fbbff387ce231a16c1c8bf commit 3326303bf5ae4c92f2fbbff387ce231a16c1c8bf Author: Martin Galvan <martin.galvan@tallertechnologies.com> Date: Tue May 31 15:54:01 2016 -0300 [PR gdb/19893] Fix handling of synthetic C++ references https://sourceware.org/bugzilla/show_bug.cgi?id=19893 I've traced the main source of the problem to pieced_value_funcs.coerce_ref not being implemented. Since gdb always assumes references are implemented as pointers, this causes it to think that it's dealing with a NULL pointer, thus breaking any operations involving synthetic references. What I did here was implementing pieced_value_funcs.coerce_ref using some of the synthetic pointer handling code from indirect_pieced_value, as Pedro suggested. I also made a few adjustments to the reference printing code so that it correctly shows either the address of the referenced value or (if it's non-addressable) the "<synthetic pointer>" string. I also wrote some unit tests based on Dwarf::assemble; these took a while to make because in most cases I needed a synthetic reference to a physical variable. Additionally, I started working on a unit test for classes that have a vtable, but ran into a few issues so that'll probably go in a future patch. One thing that should definitely be fixed is that proc function_range (called for MACRO_AT_func) will always try to compile/link using gcc with the default options instead of g++, thus breaking C++ compilations that require e.g. libstdc++. gdb/ChangeLog: * dwarf2loc.c (coerce_pieced_ref, indirect_synthetic_pointer, fetch_const_value_from_synthetic_pointer): New functions. (indirect_pieced_value): Move lower half to indirect_synthetic_pointer. (pieced_value_funcs): Implement coerce_ref. * valops.c (value_addr): Call coerce_ref for synthetic references. * valprint.c (valprint_check_validity): Return true for synthetic references. Also, don't show "<synthetic pointer>" if they reference addressable values. (generic_val_print_ref): Handle synthetic references. Also move some code to print_ref_address. (print_ref_address, get_value_addr_contents): New functions. gdb/testsuite/ChangeLog: * gdb.dwarf2/implref.exp: Rename to... * gdb.dwarf2/implref-const.exp: ...this. Also add more test statements. * gdb.dwarf2/implref-array.c: New file. * gdb.dwarf2/implref-array.exp: Likewise. * gdb.dwarf2/implref-global.c: Likewise. * gdb.dwarf2/implref-global.exp: Likewise. * gdb.dwarf2/implref-struct.c: Likewise. * gdb.dwarf2/implref-struct.exp: Likewise. Fixed in commit 3326303bf5ae4c92f2fbbff387ce231a16c1c8bf. |