[PATCH V4] Fix issues with reading rnglists, especially from dwo files, for DWARF v5
Caroline Tice
cmtice@google.com
Thu Jul 9 15:48:44 GMT 2020
I think your description of how ranges work is correct. I've made all
your requested changes (I think). Attached is the latest patch for
review. :-)
-- Caroline
cmtice@google.com
gdb/ChangeLog:
2020-07-09 Caroline Tice <cmtice@google.com>
* dwarf2/read.c (RNGLIST_HEADER_SIZE32) New constant definition.
(RNGLIST_HEADER_SIZE64): New constant definition.
(struct dwop_section_names): Add rnglists_dwo.
(dwop_section_names): Add .debug_rnglists.dwo, .zdebug_rnglists.dwo.
(struct dwo_sections): Add rnglists field.
(read_attribut_reprocess): Add tag parameter.
(dwarf2_ranges_read): Add tag parameter.
(cu_debug_rnglists_section): New function (decl & definition).
(dwarf2_locate_dwo_sections): Add code to read rnglists_dwo section.
(dwarf2_rnglists_process): Add a dwarf_tag parameter, for the kind of
die whose range is being checked; add code to read the rnglist section
from the dwo file rather than from the main objfile, if appropriate.
Add cases for DW_RLE_base_addressx,
DW_RLE_startx_length, DW_RLE_startx_endx. Also, update to only add
the base address to DW_RLE_offset_pairs (not to all ranges), moving
test inside if-condition and updating complaint message.
(dwarf2_ranges_process): Add dwarf tag parameter and pass it to
dwarf2_rnglists_process.
(dwarf2_ranges_read): Add dwarf tag parameter and pass it to
dwarf2_ranges_process.
(dwarf2_get_pc_bounds): Check for DW_FORM_rnglistx when setting
need_ranges_base and update comment appropriately. Also pass die tag
to dwarf2_ranges_read.
(dwarf2_record_block_ranges): Check for DW_FORM_rnglistx when setting
need_ranges_base and update comment appropriately. Also pass die tag
to dwarf2_ranges_read.
(read_full_die_1): Add code to read DW_AT_rnglists_base and assign to
cu->ranges_base. Also pass die tag to read_attribute_reprocess.
(partial_die_info::read): Check for DW_FORM_rnglistx when setting
need_ranges_base and update comment appropriately. Also pass die tag
to read_attribute_reprocess and dwarf2_ranges_read.
(read_loclist_header): Rename function to read_loclists_rnglists_header,
and update function comment appropriately.
(read_loclist_index): Call read_loclists_rnglists_header instead of
read_loclist_header.
(read_rnglist_index): New function.
(read_attribute_reprocess): Add tag parameter. Add code for
DW_FORM_rnglistx, passing tag to read_rnglist_index.
(read_attribute_value): Mark DW_FORM_rnglistx with need_reprocess.
gdb/testsuite/ChangeLog:
2020-07-09 Caroline Tice <cmtice@google.com>
* gdb.dwarf2/dw5-rnglist-test.cc: New file.
* gdb.dwarf2/dw5-rnglist-test.exp: New file.
On Fri, Jul 3, 2020 at 10:11 PM Simon Marchi <simark@simark.ca> wrote:
>
> On 2020-07-03 6:47 p.m., Caroline Tice wrote:
> > "
> >
> > I have made most of your requested changes, which are in the attached
> > patch. I disagree with one of your comments, and I had a question
> > about another:
> >
> > First the disagreement. You said:
> >
> >>> @@ -18684,7 +18764,8 @@ partial_die_info::read (const struct die_reader_specs *reader,
> >>> /* It would be nice to reuse dwarf2_get_pc_bounds here,
> >>> but that requires a full DIE, so instead we just
> >>> reimplement it. */
> >>> - int need_ranges_base = tag != DW_TAG_compile_unit;
> >>> + int need_ranges_base = (tag != DW_TAG_compile_unit
> >>> + && attr.form != DW_FORM_rnglistx);
> >>
> >> It looks like this fix is unrelated from the rest of the patch, which deals
> >> about reading rnglists from dwo files. It's better to have on fix per patch.
> >> So I think this fix would deserve its own patch, with a commit message that
> >> clearly explains the issue and the fix.
> >>
> > This is actually a required fix for dealing with rnglists. Without
> > it, when the DW_AT_ranges uses the form
> > DW_FORM_rnglistx, the wrong value is calculated and the entire rnglist
> > is ignored (if you turn on complaints this will happen). So without
> > this fix, I can't test any of the other code for fixing rnglists, with
> > or without .dwo files.
>
> Ah ok, I must have understood it wrong then. I find all of this really confusing.
>
> I'll summarize what I think I understand, please let me know if I say anything wrong.
> We are dealing with two similar ways of dealing with address ranges, one is the
> pre-standard GNU version here, which was using DW_AT_GNU_ranges_base:
>
> https://gcc.gnu.org/wiki/DebugFission
>
> and the version in DWARF 5.
>
> In the pre-standard version, only one range section was present, and it was in the
> main linked file. DW_AT_ranges attributes in the dwo, the form DW_FORM_sec_offset,
> are actually pointing at the section in the main file, at offset
>
> CU's DW_AT_GNU_ranges_base value (found in the skeleton, in the main file) + DIE's DW_AT_ranges
>
> If there is a DW_AT_ranges attribute in the skeleton CU, in the main linked file,
> then it is absolute, it is _not_ added to the DW_AT_GNU_ranges_base value. Hence
> the `needs_range_base` thingy. The `ranges_offset` variable in this function is
> the direct offset to the range list to read.
>
> In DWARF 5, we have a ranges section in the main linked file and one in the dwo. DW_AT_ranges
> attributes refer to the ranges section of the file they are in. The DW_AT_rnglists_base
> points to the beginning of the range table for that CU. The actual value of DW_AT_ranges
> attribute (when using the DW_FORM_rnglistx form) is an index in the offset table. However,
> due to the reprocessing you added in this patch, the attribute value is now in fact the
> offset with `DW_AT_rnglists_base` already added. And _that's_ the reason why
> DW_FORM_rnglistx attributes should not have the range base applied.
>
> I think the comment above in the code is misleading and needs to be updated, this one:
>
> /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton.
> We take advantage of the fact that DW_AT_ranges does not appear
> in DW_TAG_compile_unit of DWO files. */
>
> In commit 18a8505e38fc ("Dwarf 5: Handle debug_str_offsets and indexed attributes
> that have base offsets."), the it was changed like so:
>
> - /* DW_AT_ranges_base does not apply to DIEs from the DWO skeleton.
> + /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton.
>
> I think it was a wrong change, because it mixes two different things. DW_AT_rnglists_base
> is a DWARF 5 thing, and the range base not applying to the values from the skeleton is a
> pre-standard thing. I think it should say something like:
>
> /* DW_AT_GNU_ranges_base does not apply to DIEs from the DWO skeleton.
> We take advantage of the fact that DW_AT_ranges does not appear
> in DW_TAG_compile_unit of DWO files.
>
> Attributes of form DW_AT_rnglistx have already had their value changed
> by read_rnglist_index and already include DW_AT_rnglists_base, so don't
> add the ranges base either. */
>
> >
> >> I'm also not sure I understand it: when the form is DW_FORM_rnglistx, don't
> >> we want to sometimes apply the base coming from DW_AT_rnglists_base?
> >
> > It all depends on what form is being used. If the DW_AT_ranges is
> > using DW_FORM_sec_offset (which is what
> > GCC generates), then you do need to add the base value.But if you
> > are using the DW_FORM_rnglisx (which is what clang generates), the
> > base value has already been considered/incorporated into the attribute
> > so adding it in (again) causes the value to be incorrect (again, turn
> > on complaints and see what happens if you try to add the base into
> > something with DW_FORM_rnglistx). From the DWARF5 spec: "[A] rnglist
> > is either represented as: An index into the .debug_rnglists section
> > (DW_FORM_rnglistx). The ...operand identifies an offset location
> > relative to the base of that section...[or] An offset into the
> > .debug_rnglists section (DW_FORM_sec_offset). The operand consists of
> > a byte offset from the beginning of the .debug_rnglists section."
> > Note the DW_FORM_rnglistx is already relative to the base.
>
> Ok. What was not clear to me in the first pass was that read_rnglist_index
> already has modified the attribute value to include DW_AT_rnglists_base.
>
> Now, about values of class rnglist and form DW_FORM_sec_offset in DWARF 5.
> The spec says this about them:
>
> - An offset into the .debug_rnglists section (DW_FORM_sec_offset).
> The operand consists of a byte offset from the beginning of the
> .debug_rnglists section. It is relocatable in a relocatable object file,
> and relocated in an executable or shared object file. In the 32-bit
> DWARF format, this offset is a 4-byte unsigned value; in the 64-bit
> DWARF format, it is an 8-byte unsigned value (see Section 7.4 on
> page 196).
>
> Since it says that the value is already relocated in an executable, does it
> mean we should not add the base for these, in DWARF 5?
>
> > Now, my question: I'm not quite clear as to whether you really want
> > me to change the code in
> > cu_debug_rnglists_section or not. I wrote the code mostly by copying
> > the code for loclists as closely as possible, and making whatever
> > changes to that were required to make it correct for rnglists. Is it
> > better to have the code more consistent with the loclists code?If
> > you really want me to make the changes there you suggest, then I will;
>
> I think the situation is a bit different than for `cu_debug_loc_section`:
> when you have a dwo, I don't think there is the possibility of having an
> attribute referring to a location list in the skeleton. So if you have a
> dwo, you'll always want the loc section from the dwo. So there, the pattern
>
> if (there is a dwo)
> return the dwo section;
>
> return the main file section;
>
> makes more sense. But in the case of range lists, we have attributes
> in both the skeleton and the dwo that require range lists, and the right
> section to return depends on where the attribute is. That's why, when I
> see this pattern:
>
> if (there is a dwo)
> return the dwo section;
>
> return the main file section;
>
> that pattern, I find it looks wrong, because the right section to return
> does not depend (only) on whether there is dwo. As we saw, it only works
> "by chance" because this is only called once for the skeleton (when
> reading DW_AT_ranges) and the `cu->dwo_unit` pointer happens not to be set
> up yet. It is also one refactor away from breaking, if for some reason we
> reorder some code and the `cu->dwo_unit` pointer gets initialized earlier.
>
> > I just wanted confirmation that you really want that change (i.e.
> > adding the dwarf tag and checking vs. DW_TAG_compile_unit, & calling
> > it from dwarf2_rnglists_process).
>
> I think that checking DW_TAG_compile_unit is still a bit of a hack - ideally,
> the caller could just tell us if the DIE comes from a dwo or not. But at
> least it's logical and based on some DWARF knowledge. It would certainly
> require a comment to explain it, because it wouldn't be obvious.
>
> > @@ -1379,11 +1382,16 @@ static void read_call_site_scope (struct die_info *die, struct dwarf2_cu *cu);
> > static void read_variable (struct die_info *die, struct dwarf2_cu *cu);
> >
> > static int dwarf2_ranges_read (unsigned, CORE_ADDR *, CORE_ADDR *,
> > - struct dwarf2_cu *, dwarf2_psymtab *);
> > + struct dwarf2_cu *, dwarf2_psymtab *,
> > + dwarf_tag);
>
> This declaration is unneeded and can simply be removed. Could you push an
> obvious patch that does this?.
>
>
> > @@ -13917,8 +13969,20 @@ dwarf2_rnglists_process (unsigned offset, struct dwarf2_cu *cu,
> > if (range_beginning == range_end)
> > continue;
> >
> > - range_beginning += *base;
> > - range_end += *base;
> > + /* Only DW_RLE_offset_pair needs the base address added. */
> > + if (rlet == DW_RLE_offset_pair)
> > + {
> > + if (!base.has_value ())
> > + {
> > + /* We have no valid base address for the ranges
> > + data. */
> > + complaint (_("Invalid .debug_rnglists data (no base address)"));
>
> The comment fits on a single line. But it can probably be changed to be more
> specific to DW_RLE_offset_pair. Maybe the complaint could be more specific too,
> and mention DW_RLE_offset_pair. It would be valid to have .debug_rnglists without
> a base address (I think?), but here it's invalid because there is not base address
> _and_ we have encountered a DW_RLE_offset_pair.
>
> > @@ -19094,6 +19167,57 @@ read_loclist_index (struct dwarf2_cu *cu, ULONGEST loclist_index)
> > return bfd_get_64 (abfd, info_ptr) + loclist_base;
> > }
> >
> > +/* Given a DW_FORM_rnglistx value rnglist_index, fetch the offset from the
> > + array of offsets in the .debug_rnglists section. */
> > +static CORE_ADDR
> > +read_rnglist_index (struct dwarf2_cu *cu, ULONGEST rnglist_index)
> > +{
> > + struct dwarf2_per_objfile *dwarf2_per_objfile = cu->per_objfile;
> > + struct objfile *objfile = dwarf2_per_objfile->objfile;
> > + bfd *abfd = objfile->obfd;
> > + ULONGEST rnglist_header_size =
> > + (cu->header.initial_length_size == 4 ? LOCLIST_HEADER_SIZE32
> > + : LOCLIST_HEADER_SIZE64);
>
> Should this really refer to loclist header sizes? If they are the same
> sizes as range list table headers, I suppose it's just a coincidence.
>
> > + ULONGEST rnglist_base =
> > + (cu->dwo_unit) ? rnglist_header_size : cu->ranges_base;
>
> cu->dwo_unit != nullptr
>
> > + ULONGEST start_offset =
> > + rnglist_base + rnglist_index * cu->header.offset_size;
> > +
> > + /* Get rnglists section. */
> > + struct dwarf2_section_info *section = cu_debug_rnglists_section (cu);
> > + if (section == nullptr)
> > + error (_("Cannot find .debug_rnglists section [in module %s]"),
> > + objfile_name(objfile));
>
> Space before paren.
>
> > +
> > + /* Read the rnglists section content. */
> > + section->read (objfile);
> > + if (section->buffer == NULL)
>
> In new code, I suggest using nullptr instead of NULL, just for consistency.
>
> > + error (_("DW_FORM_rnglistx used without .debug_rnglists section "
> > + "[in module %s]"),
> > + objfile_name (objfile));
> > +
> > + /* Verify the rnglist header is valid (same as loclist header). */
> > + struct loclist_header header;
> > + read_loclist_header (&header, section);
>
> Ok well, if we are going to take advantage that they are the same, the name
> of the function and types should reflect it. `read_loclist_header` should
> become `read_loclists_rnglists_header` (using the plural to match the section
> names).
>
> > + if (rnglist_index >= header.offset_entry_count)
> > + error (_("DW_FORM_rnglistx index pointing outside of "
> > + ".debug_rnglists offset array [in module %s]"),
> > + objfile_name(objfile));
>
> Space before paren.
>
> > +
> > + /* Validate that the offset is within the section's range. */
> > + if (start_offset >= section->size)
> > + error (_("DW_FORM_rnglistx pointing outside of "
> > + ".debug_rnglists section [in module %s]"),
> > + objfile_name(objfile));
> > +
> > + const gdb_byte *info_ptr = section->buffer + start_offset;
> > +
> > + if (cu->header.offset_size == 4)
> > + return read_4_bytes (abfd, info_ptr) + rnglist_base;
> > + else
> > + return read_8_bytes (abfd, info_ptr) + rnglist_base;
>
> An idea for a subsequent cleanup, we could have a `read_offset` function
> that does
>
> if (cu->header.offset_size == 4)
> return read_4_bytes (abfd, info_ptr);
> else
> return read_8_bytes (abfd, info_ptr);
>
> There are a few spots that could use it.
>
> Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v4-0001-Update-GDB-to-fix-issues-with-handling-DWARF-v5-r.patch
Type: application/octet-stream
Size: 26969 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20200709/17867d60/attachment-0001.obj>
More information about the Gdb-patches
mailing list