[PATCH v3 2/3] Fix range end handling of inlined subroutines

Thu Sep 9 10:42:23 GMT 2021

On 9/8/21 10:53 PM, Bruno Larsen wrote:
> On 9/5/21 4:15 PM, Bernd Edlinger wrote:
>> Currently there is a problem when debugging
>> optimized code when the inferior stops at an inline
>> sub-range end PC.  It is unclear if that location
>> is from the inline function or from the calling
>> function.  Therefore the call stack is often
>> wrong.
>>
>> This patch detects the "weak" line table entries
>> which are likely part of the previous inline block,
>> and if we have such a location, it assumes the
>> location belongs to the previous block.
>>
>> Additionally it may happen that the infrun machinery
>> steps from one inline range to another inline range
>> of the same inline function.  That can look like
>> jumping back and forth from the calling program
>> to the inline function, while really the inline
>> function just jumps from a hot to a cold section
>> of the code, i.e. error handling.
>>
>> Additionally it may happen that one inline sub-range
>> is empty or the inline is completely empty.  But
>> filtering that information away is not the right
>> solution, since although there is no actual code
>> from the inline, it is still possible that variables
>> from an inline function can be inspected here.
>>
>> Conceptually this patch uses a heuristic to work around
>> a deficiency in the dwarf-4 and dwarf-5 rnglists structure.
>> There should be a location view number for each inline
>> sub-range begin PC and end PC, similar to the DW_AT_GNU_entry_view
>> which is the location view for the inline entry point.
>>
>> gdb:
>> 2021-01-01  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>
>>     * block.c (blockvector_for_pc_sect): For weak line table values,
>>     use the inline frame at PC - 1 if it is deeper.
>>     * dwarf2/read.c (dwarf2_rnglists_process,
>>     dwarf2_ranges_process): Don't ignore empty ranges here.
>>     (dwarf2_ranges_read): Ignore empty ranges here.
>>     (dwarf2_get_pc_bounds): Allow empty inlined subroutines.
>>     (partial_die_info::read): Likewise.
>>     (dwarf_finish_line): Add new parameter end_sequence.
>>     (lnp_state_machine::m_last_file, lnp_state_machine::m_last_address,
>>     lnp_state_machine::m_stmt_at_address): Remove data items.
>>     (lnp_state_machine::record_line): Do not filter line entries here,
>>     except for line == 0.  Pass end_sequence to dwarf_finish_line.
>>     * infcmd.c (prepare_one_step): Ignore empty inline ranges.
>>     * infrun.c (process_event_stop_test): Don't stop in !is_stmt lines
>>     when stepping into inlines.  Don't stop at the call site again, when
>>     stepping within inlined subroutines with multiple ranges.
>>     Don't refresh the step info the when executing !is_stmt lines of
>>     of a different inlined subroutine.
>>     * symtab.c (find_pc_sect_line): Handle is_weak.
>>
>> gdb/testsuite:
>> 2021-01-01  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>
>>     * gdb.base/empty-inline.exp: New test.
>>     * gdb.base/empty-inline.c: New test.
>>     * gdb.cp/empty-inline.exp: New test.
>>     * gdb.cp/empty-inline.cc: New test.
>>     * gdb.cp/step-and-next-inline.exp: Remove KFAILs, enhance test.
>>     * gdb.cp/step-and-next-inline.cc: Work around failure with
>>     gcc-9.3.0 from ubuntu 20.04.
>> ---
>>   gdb/block.c                                   |  15 ++-
>>   gdb/dwarf2/read.c                             |  84 +++----------
>>   gdb/infcmd.c                                  |   3 +-
>>   gdb/infrun.c                                  |  33 ++++-
>>   gdb/symtab.c                                  |  18 +--
>>   gdb/testsuite/gdb.base/empty-inline.c         |  43 +++++++
>>   gdb/testsuite/gdb.base/empty-inline.exp       |  50 ++++++++
>>   gdb/testsuite/gdb.cp/empty-inline.cc          |  33 +++++
>>   gdb/testsuite/gdb.cp/empty-inline.exp         |  55 ++++++++
>>   gdb/testsuite/gdb.cp/step-and-next-inline.cc  |   6 +
>>   gdb/testsuite/gdb.cp/step-and-next-inline.exp | 174 ++++++++++----------------
>>   11 files changed, 329 insertions(+), 185 deletions(-)
>>   create mode 100644 gdb/testsuite/gdb.base/empty-inline.c
>>   create mode 100644 gdb/testsuite/gdb.base/empty-inline.exp
>>   create mode 100644 gdb/testsuite/gdb.cp/empty-inline.cc
>>   create mode 100644 gdb/testsuite/gdb.cp/empty-inline.exp
>>
> 
> 
> I also have a few questions about this patch. they are as follows:
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 5385a3a..358266a 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -13838,10 +13838,6 @@ class process_die_scope
>        return false;
>      }
>  
> -      /* Empty range entries have no effect.  */
> -      if (range_beginning == range_end)
> -    continue;
> -
>        /* Only DW_RLE_offset_pair needs the base address added.  */
>        if (rlet == DW_RLE_offset_pair)
>      {
> @@ -13957,10 +13953,6 @@ class process_die_scope
>        return 0;
>      }
>  
> -      /* Empty range entries have no effect.  */
> -      if (range_beginning == range_end)
> -    continue;
> -
>        range_beginning += *base;
>        range_end += *base;
>  
> @@ -14001,6 +13993,10 @@ class process_die_scope
>    retval = dwarf2_ranges_process (offset, cu, tag,
>      [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)
>      {
> +      /* Empty range entries have no effect.  */
> +      if (range_beginning == range_end)
> +    return;
> +
>        if (ranges_pst != NULL)
>      {
>        CORE_ADDR lowpc;
> .......
> 
> Why have you removed the range check on the other functions(dwarf2_rnglists_process and dwarf2_ranges_process), and added one for dwarf2_ranges_read? I don't see how this change is related to the rest of the patch, or why the ifs can be removed on those parts
> 

Note: with this hunk reverted I see:

FAIL: gdb.cp/step-and-next-inline.exp: no_header: abort from inline 1 pass 3
FAIL: gdb.cp/step-and-next-inline.exp: use_header: abort from inline 1 pass 3

There are two invocations of dwarf2_ranges_process.
One here, and another one there:

      std::vector<blockrange> blockvec;
      dwarf2_ranges_process (ranges_offset, cu, die->tag,
        [&] (CORE_ADDR start, CORE_ADDR end)
        {
          start += baseaddr;
          end += baseaddr;
          start = gdbarch_adjust_dwarf2_addr (gdbarch, start);
          end = gdbarch_adjust_dwarf2_addr (gdbarch, end);
          cu->get_builder ()->record_block_range (block, start, end - 1);
          blockvec.emplace_back (start, end);
          if (entry_pc == start && blockvec.size () > 1)
            std::swap (blockvec[0], blockvec[blockvec.size () - 1]);
        });

I want the empty ranges to be recorded in cu->get_builder ()->record_block_range
and also the condition entry_pc == start && blockvec.size () > 1 is possibly
satisfied in an empty block range, although I have not been able to find a
simple test case for this, but I know for sure it may happen.

The overall idea of this patch is to move the decision what to do with empty ranges
to a later stage, when we have access to the line table, and know better if it can
be ignored or not.

>> still on the same file
>  void
> @@ -21097,37 +21083,17 @@ class lnp_state_machine
>    else if (m_op_index == 0 || end_sequence)
>      {
>        fe->included_p = true;
> -      if (m_record_lines_p)
> -    {
> -      /* When we switch files we insert an end maker in the first file,
> -         switch to the second file and add a new line entry.  The
> -         problem is that the end marker inserted in the first file will
> -         discard any previous line entries at the same address.  If the
> -         line entries in the first file are marked as is-stmt, while
> -         the new line in the second file is non-stmt, then this means
> -         the end marker will discard is-stmt lines so we can have a
> -         non-stmt line.  This means that there are less addresses at
> -         which the user can insert a breakpoint.
> -
> -         To improve this we track the last address in m_last_address,
> -         and whether we have seen an is-stmt at this address.  Then
> -         when switching files, if we have seen a stmt at the current
> -         address, and we are switching to create a non-stmt line, then
> -         discard the new line.  */
> -      bool file_changed
> -        = m_last_subfile != m_cu->get_builder ()->get_current_subfile ();
> -      bool ignore_this_line
> -       = ((file_changed && !end_sequence && m_last_address == m_address
> -           && !m_is_stmt && m_stmt_at_address)
> -          || (!end_sequence && m_line == 0));
> -
> -      if ((file_changed && !ignore_this_line) || end_sequence)
> +      if (m_record_lines_p && (end_sequence || m_line != 0))
> +    {
> +      if (m_last_subfile != m_cu->get_builder ()->get_current_subfile ()
> +          || end_sequence)
>          {
>            dwarf_finish_line (m_gdbarch, m_last_subfile, m_address,
> -                 m_currently_recording_lines ? m_cu : nullptr);
> +                 m_currently_recording_lines ? m_cu : nullptr,
> +                 end_sequence || m_is_stmt);
>          }
>  
> -      if (!end_sequence && !ignore_this_line)
> +      if (!end_sequence)
>          {
>            bool is_stmt = producer_is_codewarrior (m_cu) || m_is_stmt;
>  
> @@ -21146,15 +21112,6 @@ class lnp_state_machine
>          }
>      }
>      }
> -
> -  /* Track whether we have seen any m_is_stmt true at m_address in case we
> -     have multiple line table entries all at m_address.  */
> -  if (m_last_address != m_address)
> -    {
> -      m_stmt_at_address = false;
> -      m_last_address = m_address;
> -    }
> -  m_stmt_at_address |= m_is_stmt;
>  }
> 
> Why can the check for overriding an "is_stmt" be removed? Is this related to how inlined functions work? I am not very familiar with gdb code yet, so I can definitely be missing something here.
> 

Yes, this is related.
Previously this used to ignore some line table entries (see "ignore_this_line"),
and not to record those line entries at all.

But at this time we don't know yet if a subroutine range ends here or not.
So the decision what to do with those line table entries is moved to a later
point when we have access to the block structure.

> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index fa3f422..5e95d8a 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -3269,7 +3269,11 @@ struct symtab_and_line
>       save prev if it represents the end of a function (i.e. line number
>       0) instead of a real line.  */
>  
> -      if (prev && prev->line && (!best || prev->pc > best->pc))
> +      if (prev && prev->line && (!best || prev->pc > best->pc
> +                       || (prev->pc == best->pc
> +                       && (best->pc == pc
> +                           ? !best->is_stmt
> +                           : best->is_weak))))
>      {
>        best = prev;
>        best_symtab = iter_s;
> @@ -3287,7 +3291,7 @@ struct symtab_and_line
>            while (tmp > first && (tmp - 1)->pc == tmp->pc
>               && (tmp - 1)->line != 0 && !tmp->is_stmt)
>          --tmp;
> -          if (tmp->is_stmt)
> +          if (tmp->is_stmt && (tmp->pc == pc || !tmp->is_weak))
>          best = tmp;
>          }
>  
> @@ -3310,18 +3314,14 @@ struct symtab_and_line
>       We used to return alt->line - 1 here, but that could be
>       anywhere; if we don't have line number info for this PC,
>       don't make some up.  */
> -      val.pc = pc;
> -    }
> -  else if (best->line == 0)
> -    {
> -      /* If our best fit is in a range of PC's for which no line
> -     number info is available (line number is zero) then we didn't
> -     find any valid line information.  */
> +      if (notcurrent)
> +    pc++;
>        val.pc = pc;
>      }
>    else
>      {
>        val.is_stmt = best->is_stmt;
> +      val.is_weak = best->is_weak;
>        val.symtab = best_symtab;
>        val.line = best->line;
>        val.pc = best->pc;
> 
> This last hunk is also not obvious to me. Can we assume that if best->line == 0 is true, !best_symtab will be true?
> 

Yes. For the same reason why we don't have to check that best != NULL.
See here:

      if (prev && prev->line && (!best || prev->pc > best->pc
                                       || (prev->pc == best->pc
                                           && (best->pc == pc
                                               ? !best->is_stmt
                                               : best->is_weak))))
        {
          best = prev;
          best_symtab = iter_s;

so best_symtab is only set to something when at the same time best is set to
prev, but the condition is if (prev && prev->line && ...).

The only other place where bset is assigned is here:

          if (!best->is_stmt)
            {
              struct linetable_entry *tmp = best;
              while (tmp > first && (tmp - 1)->pc == tmp->pc
                     && (tmp - 1)->line != 0 && !tmp->is_stmt)
                --tmp;
              if (tmp->is_stmt && (tmp->pc == pc || !tmp->is_weak))
                best = tmp;

and again we check for (tmp - 1)->line != 0 before decrementing tmp.
So indeed we know that best->line != 0 if best_symtab != NULL.

Thanks
Bernd.