[PATCH V4] Fix issues with reading rnglists, especially from dwo files, for DWARF v5

Caroline Tice cmtice@google.com
Thu Jul 9 15:48:44 GMT 2020


I think your description of how ranges work is correct.  I've made all
your requested changes (I think).  Attached is the latest patch for
review. :-)

-- Caroline
cmtice@google.com

gdb/ChangeLog:

2020-07-09  Caroline Tice  <cmtice@google.com>

       * dwarf2/read.c (RNGLIST_HEADER_SIZE32) New constant definition.
        (RNGLIST_HEADER_SIZE64): New constant definition.
        (struct dwop_section_names): Add rnglists_dwo.
        (dwop_section_names): Add .debug_rnglists.dwo, .zdebug_rnglists.dwo.
        (struct dwo_sections): Add rnglists field.
        (read_attribut_reprocess): Add tag parameter.
        (dwarf2_ranges_read): Add tag parameter.
        (cu_debug_rnglists_section): New function (decl & definition).
        (dwarf2_locate_dwo_sections): Add code to read rnglists_dwo section.
        (dwarf2_rnglists_process): Add a dwarf_tag parameter, for the kind of
        die whose range is being checked; add code to read the rnglist section
        from the dwo file rather than from the main objfile, if appropriate.
        Add cases for DW_RLE_base_addressx,
        DW_RLE_startx_length, DW_RLE_startx_endx.  Also, update to only add
        the base address to DW_RLE_offset_pairs (not to all ranges), moving
        test inside if-condition and updating complaint message.
        (dwarf2_ranges_process): Add dwarf tag parameter and pass it to
        dwarf2_rnglists_process.
        (dwarf2_ranges_read): Add dwarf tag parameter and pass it to
        dwarf2_ranges_process.
        (dwarf2_get_pc_bounds): Check for DW_FORM_rnglistx when setting
        need_ranges_base and update comment appropriately.  Also pass die tag
        to dwarf2_ranges_read.
        (dwarf2_record_block_ranges): Check for DW_FORM_rnglistx when setting
        need_ranges_base and update comment appropriately.  Also pass die tag
        to dwarf2_ranges_read.
        (read_full_die_1): Add code to read DW_AT_rnglists_base and assign to
        cu->ranges_base.  Also pass die tag to read_attribute_reprocess.
        (partial_die_info::read): Check for DW_FORM_rnglistx when setting
        need_ranges_base and update comment appropriately.  Also pass die tag
        to read_attribute_reprocess and dwarf2_ranges_read.
        (read_loclist_header): Rename function to read_loclists_rnglists_header,
        and update function comment appropriately.
        (read_loclist_index): Call read_loclists_rnglists_header instead of
        read_loclist_header.
        (read_rnglist_index): New function.
        (read_attribute_reprocess):  Add tag parameter. Add code for
        DW_FORM_rnglistx, passing tag to read_rnglist_index.
        (read_attribute_value): Mark DW_FORM_rnglistx with need_reprocess.

gdb/testsuite/ChangeLog:

2020-07-09  Caroline Tice  <cmtice@google.com>

        * gdb.dwarf2/dw5-rnglist-test.cc: New file.
        * gdb.dwarf2/dw5-rnglist-test.exp: New file.

On Fri, Jul 3, 2020 at 10:11 PM Simon Marchi <simark@simark.ca> wrote:
>
> On 2020-07-03 6:47 p.m., Caroline Tice wrote:
> > "
> >
> > I have made most of your requested changes, which are in the attached
> > patch.  I disagree with one of your comments, and I had a question
> > about another:
> >
> > First the disagreement.  You said:
> >
> >>> @@ -18684,7 +18764,8 @@ partial_die_info::read (const struct die_reader_specs *reader,
> >>>           /* It would be nice to reuse dwarf2_get_pc_bounds here,
> >>>              but that requires a full DIE, so instead we just
> >>>              reimplement it.  */
> >>> -         int need_ranges_base = tag != DW_TAG_compile_unit;
> >>> +         int need_ranges_base = (tag != DW_TAG_compile_unit
> >>> +                                 && attr.form != DW_FORM_rnglistx);
> >>
> >> It looks like this fix is unrelated from the rest of the patch, which deals
> >> about reading rnglists from dwo files.  It's better to have on fix per patch.
> >> So I think this fix would deserve its own patch, with a commit message that
> >> clearly explains the issue and the fix.
> >>
> > This is actually a required fix for dealing with rnglists.  Without
> > it, when the DW_AT_ranges uses the form
> > DW_FORM_rnglistx, the wrong value is calculated and the entire rnglist
> > is ignored (if you turn on complaints this will happen).  So without
> > this fix, I can't test any of the other code for fixing rnglists, with
> > or without .dwo files.
>
> Ah ok, I must have understood it wrong then.  I find all of this really confusing.
>
> I'll summarize what I think I understand, please let me know if I say anything wrong.
> We are dealing with two similar ways of dealing with address ranges, one is the
> pre-standard GNU version here, which was using DW_AT_GNU_ranges_base:
>
>   https://gcc.gnu.org/wiki/DebugFission
>
> and the version in DWARF 5.
>
> In the pre-standard version, only one range section was present, and it was in the
> main linked file.  DW_AT_ranges attributes in the dwo, the form DW_FORM_sec_offset,
> are actually pointing at the section in the main file, at offset
>
>   CU's DW_AT_GNU_ranges_base value (found in the skeleton, in the main file) + DIE's DW_AT_ranges
>
> If there is a DW_AT_ranges attribute in the skeleton CU, in the main linked file,
> then it is absolute, it is _not_ added to the DW_AT_GNU_ranges_base value.  Hence
> the `needs_range_base` thingy.  The `ranges_offset` variable in this function is
> the direct offset to the range list to read.
>
> In DWARF 5, we have a ranges section in the main linked file and one in the dwo.  DW_AT_ranges
> attributes refer to the ranges section of the file they are in.  The DW_AT_rnglists_base
> points to the beginning of the range table for that CU.  The actual value of DW_AT_ranges
> attribute (when using the DW_FORM_rnglistx form) is an index in the offset table.  However,
> due to the reprocessing you added in this patch, the attribute value is now in fact the
> offset with `DW_AT_rnglists_base` already added.  And _that's_ the reason why
> DW_FORM_rnglistx attributes should not have the range base applied.
>
> I think the comment above in the code is misleading and needs to be updated, this one:
>
>           /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton.
>              We take advantage of the fact that DW_AT_ranges does not appear
>              in DW_TAG_compile_unit of DWO files.  */
>
> In commit 18a8505e38fc ("Dwarf 5: Handle debug_str_offsets and indexed attributes
> that have base offsets."), the it was changed like so:
>
> -         /* DW_AT_ranges_base does not apply to DIEs from the DWO skeleton.
> +         /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton.
>
> I think it was a wrong change, because it mixes two different things.  DW_AT_rnglists_base
> is a DWARF 5 thing, and the range base not applying to the values from the skeleton is a
> pre-standard thing.  I think it should say something like:
>
>           /* DW_AT_GNU_ranges_base does not apply to DIEs from the DWO skeleton.
>              We take advantage of the fact that DW_AT_ranges does not appear
>              in DW_TAG_compile_unit of DWO files.
>
>              Attributes of form DW_AT_rnglistx have already had their value changed
>              by read_rnglist_index and already include DW_AT_rnglists_base, so don't
>              add the ranges base either.  */
>
> >
> >> I'm also not sure I understand it: when the form is DW_FORM_rnglistx, don't
> >> we want to sometimes apply the base coming from DW_AT_rnglists_base?
> >
> > It all depends on what form is being used. If the DW_AT_ranges is
> > using DW_FORM_sec_offset (which is what
> > GCC generates), then you do need to add the base value.But if you
> > are using the DW_FORM_rnglisx (which is what clang generates), the
> > base value has already been considered/incorporated into the attribute
> > so adding it in (again) causes the value to be incorrect (again, turn
> > on complaints and see what happens if you try to add the base into
> > something with DW_FORM_rnglistx).  From the DWARF5 spec: "[A] rnglist
> > is either represented as: An index into the .debug_rnglists section
> > (DW_FORM_rnglistx). The ...operand identifies an offset location
> > relative to the base of that section...[or] An offset into the
> > .debug_rnglists section (DW_FORM_sec_offset).  The operand consists of
> > a byte offset from the beginning of the .debug_rnglists section."
> > Note the DW_FORM_rnglistx is already relative to the base.
>
> Ok.  What was not clear to me in the first pass was that read_rnglist_index
> already has modified the attribute value to include DW_AT_rnglists_base.
>
> Now, about values of class rnglist and form DW_FORM_sec_offset in DWARF 5.
> The spec says this about them:
>
>   - An offset into the .debug_rnglists section (DW_FORM_sec_offset).
>   The operand consists of a byte offset from the beginning of the
>   .debug_rnglists section. It is relocatable in a relocatable object file,
>   and relocated in an executable or shared object file. In the 32-bit
>   DWARF format, this offset is a 4-byte unsigned value; in the 64-bit
>   DWARF format, it is an 8-byte unsigned value (see Section 7.4 on
>   page 196).
>
> Since it says that the value is already relocated in an executable, does it
> mean we should not add the base for these, in DWARF 5?
>
> > Now, my question:  I'm not quite clear as to whether you really want
> > me to change the code in
> > cu_debug_rnglists_section or not.   I wrote the code mostly by copying
> > the code for loclists as closely as possible, and making whatever
> > changes to that were required to make it correct for rnglists.  Is it
> > better to have the code more consistent with the loclists code?If
> > you really want me to make the changes there you suggest, then I will;
>
> I think the situation is a bit different than for `cu_debug_loc_section`:
> when you have a dwo, I don't think there is the possibility of having an
> attribute referring to a location list in the skeleton.  So if you have a
> dwo, you'll always want the loc section from the dwo.  So there, the pattern
>
>   if (there is a dwo)
>     return the dwo section;
>
>   return the main file section;
>
> makes more sense.  But in the case of range lists, we have attributes
> in both the skeleton and the dwo that require range lists, and the right
> section to return depends on where the attribute is.  That's why, when I
> see this pattern:
>
>   if (there is a dwo)
>     return the dwo section;
>
>   return the main file section;
>
> that pattern, I find it looks wrong, because the right section to return
> does not depend (only) on whether there is dwo.  As we saw, it only works
> "by chance" because this is only called once for the skeleton (when
> reading DW_AT_ranges) and the `cu->dwo_unit` pointer happens not to be set
> up yet.  It is also one refactor away from breaking, if for some reason we
> reorder some code and the `cu->dwo_unit` pointer gets initialized earlier.
>
> > I just wanted confirmation that you really want that change (i.e.
> > adding the dwarf tag and checking vs. DW_TAG_compile_unit, & calling
> > it from dwarf2_rnglists_process).
>
> I think that checking DW_TAG_compile_unit is still a bit of a hack - ideally,
> the caller could just tell us if the DIE comes from a dwo or not.  But at
> least it's logical and based on some DWARF knowledge.  It would certainly
> require a comment to explain it, because it wouldn't be obvious.
>
> > @@ -1379,11 +1382,16 @@ static void read_call_site_scope (struct die_info *die, struct dwarf2_cu *cu);
> >  static void read_variable (struct die_info *die, struct dwarf2_cu *cu);
> >
> >  static int dwarf2_ranges_read (unsigned, CORE_ADDR *, CORE_ADDR *,
> > -                            struct dwarf2_cu *, dwarf2_psymtab *);
> > +                            struct dwarf2_cu *, dwarf2_psymtab *,
> > +                            dwarf_tag);
>
> This declaration is unneeded and can simply be removed.  Could you push an
> obvious patch that does this?.
>
>
> > @@ -13917,8 +13969,20 @@ dwarf2_rnglists_process (unsigned offset, struct dwarf2_cu *cu,
> >        if (range_beginning == range_end)
> >       continue;
> >
> > -      range_beginning += *base;
> > -      range_end += *base;
> > +      /* Only DW_RLE_offset_pair needs the base address added.  */
> > +      if (rlet == DW_RLE_offset_pair)
> > +     {
> > +       if (!base.has_value ())
> > +         {
> > +           /* We have no valid base address for the ranges
> > +              data.  */
> > +           complaint (_("Invalid .debug_rnglists data (no base address)"));
>
> The comment fits on a single line.  But it can probably be changed to be more
> specific to DW_RLE_offset_pair.  Maybe the complaint could be more specific too,
> and mention DW_RLE_offset_pair.  It would be valid to have .debug_rnglists without
> a base address (I think?), but here it's invalid because there is not base address
> _and_ we have encountered a DW_RLE_offset_pair.
>
> > @@ -19094,6 +19167,57 @@ read_loclist_index (struct dwarf2_cu *cu, ULONGEST loclist_index)
> >      return bfd_get_64 (abfd, info_ptr) + loclist_base;
> >  }
> >
> > +/* Given a DW_FORM_rnglistx value rnglist_index, fetch the offset from the
> > +   array of offsets in the .debug_rnglists section.  */
> > +static CORE_ADDR
> > +read_rnglist_index (struct dwarf2_cu *cu, ULONGEST rnglist_index)
> > +{
> > +  struct dwarf2_per_objfile *dwarf2_per_objfile = cu->per_objfile;
> > +  struct objfile *objfile = dwarf2_per_objfile->objfile;
> > +  bfd *abfd = objfile->obfd;
> > +  ULONGEST rnglist_header_size =
> > +    (cu->header.initial_length_size == 4 ? LOCLIST_HEADER_SIZE32
> > +     : LOCLIST_HEADER_SIZE64);
>
> Should this really refer to loclist header sizes?  If they are the same
> sizes as range list table headers, I suppose it's just a coincidence.
>
> > +  ULONGEST rnglist_base =
> > +      (cu->dwo_unit) ? rnglist_header_size : cu->ranges_base;
>
> cu->dwo_unit != nullptr
>
> > +  ULONGEST start_offset =
> > +    rnglist_base + rnglist_index * cu->header.offset_size;
> > +
> > +  /* Get rnglists section.  */
> > +  struct dwarf2_section_info *section = cu_debug_rnglists_section (cu);
> > +  if (section == nullptr)
> > +    error (_("Cannot find .debug_rnglists section [in module %s]"),
> > +        objfile_name(objfile));
>
> Space before paren.
>
> > +
> > +  /* Read the rnglists section content.  */
> > +  section->read (objfile);
> > +  if (section->buffer == NULL)
>
> In new code, I suggest using nullptr instead of NULL, just for consistency.
>
> > +    error (_("DW_FORM_rnglistx used without .debug_rnglists section "
> > +          "[in module %s]"),
> > +       objfile_name (objfile));
> > +
> > +  /* Verify the rnglist header is valid (same as loclist header).  */
> > +  struct loclist_header header;
> > +  read_loclist_header (&header, section);
>
> Ok well, if we are going to take advantage that they are the same, the name
> of the function and types should reflect it.  `read_loclist_header` should
> become `read_loclists_rnglists_header` (using the plural to match the section
> names).
>
> > +  if (rnglist_index >= header.offset_entry_count)
> > +    error (_("DW_FORM_rnglistx index pointing outside of "
> > +          ".debug_rnglists offset array [in module %s]"),
> > +         objfile_name(objfile));
>
> Space before paren.
>
> > +
> > +  /* Validate that the offset is within the section's range.  */
> > +  if (start_offset >= section->size)
> > +    error (_("DW_FORM_rnglistx pointing outside of "
> > +             ".debug_rnglists section [in module %s]"),
> > +          objfile_name(objfile));
> > +
> > +  const gdb_byte *info_ptr = section->buffer + start_offset;
> > +
> > +  if (cu->header.offset_size == 4)
> > +    return read_4_bytes (abfd, info_ptr) + rnglist_base;
> > +  else
> > +    return read_8_bytes (abfd, info_ptr) + rnglist_base;
>
> An idea for a subsequent cleanup, we could have a `read_offset` function
> that does
>
>   if (cu->header.offset_size == 4)
>     return read_4_bytes (abfd, info_ptr);
>   else
>     return read_8_bytes (abfd, info_ptr);
>
> There are a few spots that could use it.
>
> Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v4-0001-Update-GDB-to-fix-issues-with-handling-DWARF-v5-r.patch
Type: application/octet-stream
Size: 26969 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20200709/17867d60/attachment-0001.obj>


More information about the Gdb-patches mailing list