[PATCHv3] Fix range end handling of inlined subroutines
Andrew Burgess
andrew.burgess@embecosm.com
Wed Mar 11 22:02:32 GMT 2020
* Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-03-08 14:57:35 +0000]:
> On 2/22/20 7:38 AM, Bernd Edlinger wrote:
> > On 2/9/20 10:07 PM, Bernd Edlinger wrote:
> >> Hi,
> >>
> >> this is based on Andrew's patch here:
> >>
> >> https://sourceware.org/ml/gdb-patches/2020-02/msg00092.html
> >>
> >> This and adds a heuristic to fix the case where caller
> >> and callee reside in the same subfile, it uses
> >> the recorded is-stmt bits and locates end of
> >> range infos, including the previously ignored empty
> >> range, and adjusting is-stmt info at this
> >> same pc, when the last item is not-is-stmt, the
> >> previous line table entries are dubious and
> >> we reset the is-stmt bit there.
> >> This fixes the new test case in Andrew's patch.
> >>
> >> It understood, that this is just a heuristic
> >> approach, since we do not have exactly the data,
> >> which is needed to determine at which of the identical
> >> PCs in the line table the subroutine actually ends.
> >> So, this tries to be as conservative as possible,
> >> just changing what is absolutely necessary.
> >>
> >> This patch itself is preliminary, since the is-stmt patch
> >> needs to be rebased after the refactoring of
> >> dwarf2read.c from yesterday, so I will have to rebase
> >> this patch as well, but decided to wait for Andrew.
> >>
> >>
> >
> > So, this is an update to my previous patch above:
> > https://sourceware.org/ml/gdb-patches/2020-02/msg00262.html
> >
> > It improves performance on big data, by using binary
> > search to locate the bogus line table entries.
> > Otherwise it behaves identical to the previous version,
> > and is still waiting for Andrew's patch before it can
> > be applied.
> >
> >
>
> Rebased to match Andrew's updated patch of today.
Bernd,
Thanks for working on this. This looks like a great improvement, but
I have a could of questions / comment inline below.
Thanks,
Andrew
>
>
> Thanks
> Bernd.
> From 010f1774ec69d5cab48db9b7a5fb9de3d5860f97 Mon Sep 17 00:00:00 2001
> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
> Date: Sun, 9 Feb 2020 21:13:17 +0100
> Subject: [PATCH] Fix range end handling of inlined subroutines
>
> Since the is_stmt is now handled, it becomes
> possible to locate dubious is_stmt line entries
> at the end of an inlined function, even if the
> called inline function is in the same subfile.
>
> When there is a sequence of line entries at the
> same address where an inline range ends, and the
> last item has is_stmt = 0, we force all previous
> items to have is_stmt = 0 as well.
This describes _what_ we're doing...
>
> If the last line at that address has is_stmt = 1,
> there is no need to change anything, since a step
> over will always stop at that last line from the
> same address, which is fine, since it is outside
> the subroutine.
This describes what we're _not_ doing...
... it just feels like this description would benefit for a bit more
_why_. How does setting all the is_stmt fields to 0 help us? What
cases step/next should we expect to see improvement in?
I'm also worried that setting is_stmt to false will mean we loose the
ability to set breakpoints on some lines using file:line syntax. I
wonder if we should consider coming up with a solution that both
preserves the ability to set breakpoints, while also giving the
benefits for step/next.
>
> This is about the best we can do at the moment,
> unless location view information are added to the
> block ranges debug info structure, and location
> views are implemented in gdb in general.
>
> gdb:
> 2020-03-08 Bernd Edlinger <bernd.edlinger@hotmail.de>
>
> * buildsym.c (buildsym_compunit::record_inline_range_end,
> patch_inline_end_pos): New helper functions.
> (buildsym_compunit::end_symtab_with_blockvector): Patch line table.
> (buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.
> * buildsym.h (buildsym_compunit::record_inline_range_end): Declare.
> (buildsym_compunit::m_inline_end_vector,
> buildsym_compunit::m_inline_end_vector_length,
> buildsym_compunit::m_inline_end_vector_nitems): New data items.
> * dwarf2/read.c (dwarf2_rnglists_process,
> dwarf2_ranges_process): Don't ignore empty ranges here.
> (dwarf2_ranges_read): Ignore empty ranges here.
> (dwarf2_record_block_ranges): Pass end of range PC to
> record_inline_range_end for inline functions.
>
> gdb/testsuite:
> 2020-03-08 Bernd Edlinger <bernd.edlinger@hotmail.de>
>
> * gdb.cp/step-and-next-inline.exp: Adjust test.
> ---
> gdb/buildsym.c | 84 +++++++++++++++++++++++++++
> gdb/buildsym.h | 11 ++++
> gdb/dwarf2/read.c | 22 ++++---
> gdb/testsuite/gdb.cp/step-and-next-inline.exp | 17 ------
> 4 files changed, 108 insertions(+), 26 deletions(-)
>
> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> index 24aeba8..f02b7ce 100644
> --- a/gdb/buildsym.c
> +++ b/gdb/buildsym.c
> @@ -113,6 +113,8 @@ struct pending_block
> next1 = next->next;
> xfree ((void *) next);
> }
> +
> + xfree (m_inline_end_vector);
> }
>
> struct macro_table *
> @@ -732,6 +734,84 @@ struct blockvector *
> }
>
>
> +/* Record a PC where a inlined subroutine ends. */
> +
> +void
> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)
> +{
> + if (m_inline_end_vector == nullptr)
> + {
> + m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;
> + m_inline_end_vector = (CORE_ADDR *)
> + xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);
> + m_inline_end_vector_nitems = 0;
> + }
> + else if (m_inline_end_vector_nitems == m_inline_end_vector_length)
> + {
> + m_inline_end_vector_length *= 2;
> + m_inline_end_vector = (CORE_ADDR *)
> + xrealloc ((char *) m_inline_end_vector,
> + sizeof (CORE_ADDR) * m_inline_end_vector_length);
> + }
> +
> + m_inline_end_vector[m_inline_end_vector_nitems++] = end;
> +}
> +
> +
> +/* Patch the is_stmt bits at the given inline end address.
> + The line table has to be already sorted. */
> +
> +static void
> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)
> +{
> + int a = 2, b = table->nitems - 1;
> + struct linetable_entry *items = table->item;
> +
> + /* We need at least two items with pc = end in the table.
> + The lowest usable items are at pos 0 and 1, the highest
> + usable items are at pos b - 2 and b - 1. */
> + if (a > b || end < items[1].pc || end > items[b - 2].pc)
> + return;
> +
> + /* Look for the first item with pc > end in the range [a,b].
> + The previous element has pc = end or there is no match.
> + We set a = 2, since we need at least two consecutive elements
> + with pc = end to do anything useful.
> + We set b = nitems - 1, since we are not interested in the last
> + element which should be an end of sequence marker with line = 0
> + and is_stmt = 1. */
> + while (a < b)
> + {
> + int c = (a + b) / 2;
> +
> + if (end < items[c].pc)
> + b = c;
> + else
> + a = c + 1;
> + }
> +
> + a--;
> + if (items[a].pc != end || items[a].is_stmt)
> + return;
> +
> + /* When there is a sequence of line entries at the same address
> + where an inline range ends, and the last item has is_stmt = 0,
> + we force all previous items to have is_stmt = 0 as well.
> + Setting breakpoints at those addresses is currently not
> + supported, since it is unclear if the previous addresses are
> + part of the subroutine or the calling program. */
> + while (a-- > 0)
I know this is a personal preference, but I really dislike having
these inc/dec adjustments within expressions like this. I feel it's
too easy to overlook, and I always end up having to think twice just
to make sure I've parsed the intention correctly. I'd like to make a
plea to have the adjustment moved into the body of the loop please.
> + {
> + /* We stop at the first line entry with a different address,
> + or when we see an end of sequence marker. */
I think there's a whitespace error at the start of this line.
> + if (items[a].pc != end || items[a].line == 0)
> + break;
> +
> + items[a].is_stmt = 0;
> + }
> +}
> +
> +
> /* Subroutine of end_symtab to simplify it. Look for a subfile that
> matches the main source file's basename. If there is only one, and
> if the main source file doesn't have any symbol or line number
> @@ -965,6 +1045,10 @@ struct compunit_symtab *
> subfile->line_vector->item
> + subfile->line_vector->nitems,
> lte_is_less_than);
> +
> + for (int i = 0; i < m_inline_end_vector_nitems; i++)
> + patch_inline_end_pos (subfile->line_vector,
> + m_inline_end_vector[i]);
> }
>
> /* Allocate a symbol table if necessary. */
> diff --git a/gdb/buildsym.h b/gdb/buildsym.h
> index c768a4c..2845789 100644
> --- a/gdb/buildsym.h
> +++ b/gdb/buildsym.h
> @@ -190,6 +190,8 @@ struct buildsym_compunit
> void record_line (struct subfile *subfile, int line, CORE_ADDR pc,
> bool is_stmt);
>
> + void record_inline_range_end (CORE_ADDR end);
> +
> struct compunit_symtab *get_compunit_symtab ()
> {
> return m_compunit_symtab;
> @@ -397,6 +399,15 @@ struct buildsym_compunit
>
> /* Pending symbols that are local to the lexical context. */
> struct pending *m_local_symbols = nullptr;
> +
> + /* Pending inline end range addresses. */
> + CORE_ADDR * m_inline_end_vector = nullptr;
> +
> + /* Number of allocated inline end range addresses. */
> + int m_inline_end_vector_length = 0;
> +
> + /* Number of pending inline end range addresses. */
> + int m_inline_end_vector_nitems = 0;
> };
I think we should probably just use a std::vector<CORE_ADDR> for
m_inline_end_vector - unless I'm missing a good reason. I think this
would simplify some of the code in buildsym.c.
>
>
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 1706b96..6be008c 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -13609,10 +13609,6 @@ class process_die_scope
> return false;
> }
>
> - /* Empty range entries have no effect. */
> - if (range_beginning == range_end)
> - continue;
> -
> range_beginning += base;
> range_end += base;
>
> @@ -13723,10 +13719,6 @@ class process_die_scope
> return 0;
> }
>
> - /* Empty range entries have no effect. */
> - if (range_beginning == range_end)
> - continue;
> -
> range_beginning += base;
> range_end += base;
>
> @@ -13766,6 +13758,10 @@ class process_die_scope
> retval = dwarf2_ranges_process (offset, cu,
> [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)
> {
> + /* Empty range entries have no effect. */
> + if (range_beginning == range_end)
> + return;
> +
> if (ranges_pst != NULL)
> {
> CORE_ADDR lowpc;
> @@ -14003,6 +13999,7 @@ class process_die_scope
> struct gdbarch *gdbarch = get_objfile_arch (objfile);
> struct attribute *attr;
> struct attribute *attr_high;
> + bool inlined_subroutine = (die->tag == DW_TAG_inlined_subroutine);
>
> attr_high = dwarf2_attr (die, DW_AT_high_pc, cu);
> if (attr_high)
> @@ -14018,7 +14015,10 @@ class process_die_scope
>
> low = gdbarch_adjust_dwarf2_addr (gdbarch, low + baseaddr);
> high = gdbarch_adjust_dwarf2_addr (gdbarch, high + baseaddr);
> - cu->get_builder ()->record_block_range (block, low, high - 1);
> + if (inlined_subroutine)
> + cu->get_builder ()->record_inline_range_end (high);
> + if (low < high)
> + cu->get_builder ()->record_block_range (block, low, high - 1);
> }
> }
>
> @@ -14043,6 +14043,10 @@ class process_die_scope
> end += baseaddr;
> start = gdbarch_adjust_dwarf2_addr (gdbarch, start);
> end = gdbarch_adjust_dwarf2_addr (gdbarch, end);
> + if (inlined_subroutine)
> + cu->get_builder ()->record_inline_range_end (end);
> + if (start == end)
> + return;
> cu->get_builder ()->record_block_range (block, start, end - 1);
> blockvec.emplace_back (start, end);
> });
> diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> index acec48b..84c0901 100644
> --- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> +++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> @@ -48,37 +48,20 @@ proc do_test { use_header } {
> gdb_test "step" ".*" "step into get_alias_set"
> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
> "not in inline 1"
> - # It's possible that this first failure (when not using a header
> - # file) is GCC's fault, though the remaining failures would best
> - # be fixed by adding location views support (though it could be
> - # that some easier heuristic could be figured out). Still, it is
> - # not certain that the first failure wouldn't also be fixed by
> - # having location view support, so for now it is tagged as such.
> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
> gdb_test "next" ".*TREE_TYPE.*" "next step 1"
> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
> "not in inline 2"
> gdb_test "next" ".*TREE_TYPE.*" "next step 2"
> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
> "not in inline 3"
> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
> gdb_test "next" ".*TREE_TYPE.*" "next step 3"
> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
> "not in inline 4"
> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
> gdb_test "next" "return 0.*" "next step 4"
> gdb_test "bt" \
> "\\s*\\#0\\s+(main|get_alias_set)\[^\r\]*${srcfile}:.*" \
> "not in inline 5"
>
> - if {!$use_header} {
> - # With the debug from GCC 10.x (and earlier) GDB is currently
> - # unable to successfully complete the following tests when we
> - # are not using a header file.
> - kfail symtab/25507 "stepping tests"
> - return
> - }
> -
> clean_restart ${executable}
>
> if ![runto_main] {
> --
> 1.9.1
>
More information about the Gdb-patches
mailing list