[PATCHv5] Fix range end handling of inlined subroutines
Bernd Edlinger
bernd.edlinger@hotmail.de
Sat Apr 4 23:59:03 GMT 2020
On 4/5/20 12:07 AM, Andrew Burgess wrote:
> * Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-04-04 21:50:21 +0200]:
>
>> Hi,
>>
>> this is an updated version of my patch that was originally only intended
>> to fix the issues with inline functions in the same file. That needed
>> re-basing anyway, because of a merge conflict.
>>
>> I removed the code that does the special handling of end sequence
>> markers in record_line now, since it seems to cause more problems than
>> it solves.
>>
>> I believe it will fix the regression that Louis pointed out,
>
> Do you have a link or any extra information for this? I'd like to
> understand what the actual regression was. I noticed there didn't
> seem to be any extra tests added over previous versions of the patch,
> is this regression covered by an existing test?
>
Yup. Please have a look at
https://marc.info/?l=gdb-patches&m=158601805302656&w=2
Here we see the beginning of main i assume:
[0x000000b4] Extended opcode 2: set Address to 0x73c
[0x000000bf] Advance Line by 75 to 76
[0x000000c2] Copy
And somewere else we see the end of another function:
[0x00000129] Extended opcode 2: set Address to 0x73c
[0x00000134] Advance Line by 1 to 73
[0x00000136] Copy
[0x00000137] Extended opcode 1: End of Sequence
So by the stable sorting of the line table,
we should have
line 76 is-stmt
line 73 !is_stmt
line 0 is_stmt
which is the state, what is in master where the test fail for Luis
but the step-over-test case works.
with my update patch of today
we should have
line 76 is-stmt
lint 73 is-stmt
line 0 is-stmt
which also fails for Luis, and has 6 additional failures.
but the step-over-test case works.
So I start to think there is no way how I can leave line 73
in the line table, I must delete it, when there is an explicit
end of sequence. But not when there is no explicit end.
Previously we had
line 76 is-stmt < from main
line 0 is-stmt < end of another function
which works, but is also not very nice.
I would like to remove the lines at a true end of sequence,
but not remove lines at a fake end of sequence that we emit
when the cu changes, the idea, that I have right now is that
is pass line = -1 to the record_line and do not touch any
lines at that PC, change that to line = 0 internally, record
that. so the line table looks like it was always, since
I cannot tell what breaks when that end sequence is not there.
Bernd.
>> and
>> should fix the regression that Andrew wanted to fix with his
>> patch:
>
>>
>> [PATCH 2/2] gdb: Preserve is-stmt lines when switch between files
>> https://marc.info/?l=gdb-patches&m=158595247916817&w=2
>>
>> So I hope that will not be necessary after this.
>
> It would be nice if you picked up the extra tests from that patch and
> included them here if this is a replacement solution.
>
>>
>> There is a theoretic issue with line numbers at the end
>> of a function, these could coincidentally have the same
>> PC as the following function. That might need more work
>> to solve that, in the moment I have not yet looked at
>> tracking that down.
>>
>>
>> Thanks
>> Bernd.
>
>> From 330eadf4b42e44bfa82c30a6bda21393fa4a54c8 Mon Sep 17 00:00:00 2001
>> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
>> Date: Sun, 9 Feb 2020 21:13:17 +0100
>> Subject: [PATCH] Fix range end handling of inlined subroutines
>>
>> Since the is_stmt is now handled, it becomes
>> possible to locate dubious is_stmt line entries
>> at the end of an inlined function, even if the
>> called inline function is in the same subfile.
>>
>> When there is a sequence of line entries at the
>> same address where an inline range ends, and the
>> last item has is_stmt = 0, we force all previous
>> items to have is_stmt = 0 as well.
>>
>> If the last line at that address has is_stmt = 1,
>> there is no need to change anything, since a step
>> over will always stop at that last line from the
>> same address, which is fine, since it is outside
>> the subroutine.
>>
>> With this change we loose the ability to set
>> breakpoints on some lines using file:line syntax,
>> but the data is not completely lost, as the
>> line table is still holding those lines, just
>> with the is_stmt flag reset.
>
> This is a trade off that I don't think we should breeze over quite so
> easily. Many, many users make use of an IDE to drive GDB, and in
> those cases they mostly use file:line syntax exclusively, so any loss
> in this area is something we should consider seriously.
>
>>
>> This is necessary as breakpoints on these lines
>> are problematic, as the call stack is often
>> wrong.
>
> Maybe. Yes I absolutely agree that in the test case, when we stop at
> these line numbers, the call stack is wrong. But, as GDB developers,
> we can have a slightly different take on things. In this particular
> case the debug information we're supplied is just plain wrong. This
> is documented in this GCC bug:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474
>
> ( Bernd, I know you've already seen that, I include the link for anyone else
> following this thread. )
>
> The lines that are part of the inline subroutine, in this test are
> marked as is-stmt _and_ are outside the range of the inline
> subroutine. So the call stack GDB presents is not wrong w.r.t the
> debug information we're given, it's the debug information that is
> wrong in this case.
>
> Now, there are many places where GDB tries to work around broken debug
> information, but I think that in this case we give up too much. I
> would prefer to see if GCC can fix their DWARF creation first before
> we take something like this.
>
>> From the block info they appear to be
>> located in the caller, but they are actually meant
>> to be part of the subroutine, therefore usually one
>> frame is missing from the callstack when the
>> execution stops there.
>
> The lines shouldn't be marked as is-stmt in the location they are, or
> GCC should extend the range slightly.
>
>>
>> This is about the best we can do at the moment,
>> unless location view information are added to the
>> block ranges debug info structure, and location
>> views are implemented in gdb in general.
>
> Please do correct me if I'm wrong, but I don't think GCC can yet
> produce view information for range endings, right?
>
> I agree that the best solution here would be to have location views
> for range start (GCC can do this) and range end (I think this is
> missing), then scopes can merge over within a single address.
>
>>
>> gdb:
>> 2020-04-04 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>
>> * buildsym.c (buildsym_compunit::record_line): Remove line deletion
>> at end sequence marker.
>> (buildsym_compunit::record_inline_range_end,
>> patch_inline_end_pos): New helper functions.
>> (buildsym_compunit::end_symtab_with_blockvector): Patch line table.
>> (buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.
>> * buildsym.h (buildsym_compunit::record_inline_range_end): Declare.
>> (buildsym_compunit::m_inline_end_vector,
>> buildsym_compunit::m_inline_end_vector_length,
>> buildsym_compunit::m_inline_end_vector_nitems): New data items.
>> * dwarf2/read.c (dwarf2_rnglists_process,
>> dwarf2_ranges_process): Don't ignore empty ranges here.
>> (dwarf2_ranges_read): Ignore empty ranges here.
>> (dwarf2_record_block_ranges): Pass end of range PC to
>> record_inline_range_end for inline functions.
>>
>> gdb/testsuite:
>> 2020-04-04 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>
>> * gdb.cp/step-and-next-inline.exp: Adjust test.
>> ---
>> gdb/buildsym.c | 115 ++++++++++++++++++++------
>> gdb/buildsym.h | 11 +++
>> gdb/dwarf2/read.c | 22 +++--
>> gdb/testsuite/gdb.cp/step-and-next-inline.exp | 17 ----
>> 4 files changed, 114 insertions(+), 51 deletions(-)
>>
>> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
>> index fe07103..e6e7437 100644
>> --- a/gdb/buildsym.c
>> +++ b/gdb/buildsym.c
>> @@ -113,6 +113,8 @@ struct pending_block
>> next1 = next->next;
>> xfree ((void *) next);
>> }
>> +
>> + xfree (m_inline_end_vector);
>> }
>>
>> struct macro_table *
>> @@ -691,31 +693,6 @@ struct blockvector *
>> * sizeof (struct linetable_entry))));
>> }
>>
>> - /* The end of sequence marker is special. We need to reset the
>> - is_stmt flag on previous lines at the same PC, otherwise these
>> - lines may cause problems since they might be at the same address
>> - as the following function. For instance suppose a function calls
>> - abort there is no reason to emit a ret after that point (no joke).
>> - So the label may be at the same address where the following
>> - function begins. A similar problem appears if a label is at the
>> - same address where an inline function ends we cannot reliably tell
>> - if this is considered part of the inline function or the calling
>> - program or even the next inline function, so stack traces may
>> - give surprising results. Expect gdb.cp/step-and-next-inline.exp
>> - to fail if these lines are not modified here. */
>> - if (line == 0 && subfile->line_vector->nitems > 0)
>> - {
>> - e = subfile->line_vector->item + subfile->line_vector->nitems;
>> - do
>> - {
>> - e--;
>> - if (e->pc != pc || e->line == 0)
>> - break;
>> - e->is_stmt = 0;
>> - }
>> - while (e > subfile->line_vector->item);
>> - }
>> -
>
> I'm slightly nervous with this proposed change. Code like this
> existed before we started messing with is-stmt support, and I suspect
> it probably dates from a time before testing was as good as it is
> now. I assume nothing regresses with this removed, but it would be
> nice to try a few experiments I think with different combinations of
> two files, empty line ranges, and having is-stmt before and after the
> transition between files, just to confirm that we really don't need
> this code any more. I suspect this is the "line numbers at the end of
> a function" issue you mentioned in your introductory text.
>
> That said, I see how removing this provides an alternative solution to
> the patch I posted.
>
>> e = subfile->line_vector->item + subfile->line_vector->nitems++;
>> e->line = line;
>> e->is_stmt = is_stmt ? 1 : 0;
>> @@ -723,6 +700,90 @@ struct blockvector *
>> }
>>
>>
>> +/* Record a PC where a inlined subroutine ends. */
>> +
>> +void
>> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)
>> +{
>> + /* The performance of this function is very important,
>> + it shall be O(n*log(n)) therefore we do not use std::vector
>> + here since some compilers, e.g. visual studio, do not
>> + guarantee that for vector::push_back. */
>
> I think we're going to need more of a group discussion on this.
> Simply saying std::vector is too slow doesn't I'm afraid convince me.
> There seem to be lots of other places in GDB where both performance is
> critical, and we use ::push_back.
>
> Is your claim that we should move away from std::vector in all these
> cases? Is this case somehow special?
>
> Obviously I've made this point before, and you're sticking to your
> position, which is fine, but I can't approve this patch without seeing
> a more compelling argument against std::vector. That said, if some
> other maintainer is convinced, they're welcome to approve it. Either
> way, the comment would be better placed at the declaration of the
> variable, rather than here I think.
>
>> + if (m_inline_end_vector == nullptr)
>> + {
>> + m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;
>> + m_inline_end_vector = (CORE_ADDR *)
>> + xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);
>> + m_inline_end_vector_nitems = 0;
>> + }
>> + else if (m_inline_end_vector_nitems == m_inline_end_vector_length)
>> + {
>> + m_inline_end_vector_length *= 2;
>> + m_inline_end_vector = (CORE_ADDR *)
>> + xrealloc ((char *) m_inline_end_vector,
>> + sizeof (CORE_ADDR) * m_inline_end_vector_length);
>> + }
>> +
>> + m_inline_end_vector[m_inline_end_vector_nitems++] = end;
>> +}
>> +
>> +
>> +/* Patch the is_stmt bits at the given inline end address.
>> + The line table has to be already sorted. */
>> +
>> +static void
>> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)
>> +{
>> + int a = 2, b = table->nitems - 1;
>> + struct linetable_entry *items = table->item;
>> +
>> + /* We need at least two items with pc = end in the table.
>> + The lowest usable items are at pos 0 and 1, the highest
>> + usable items are at pos b - 2 and b - 1. */
>> + if (a > b || end < items[1].pc || end > items[b - 2].pc)
>> + return;
>> +
>> + /* Look for the first item with pc > end in the range [a,b].
>> + The previous element has pc = end or there is no match.
>> + We set a = 2, since we need at least two consecutive elements
>> + with pc = end to do anything useful.
>> + We set b = nitems - 1, since we are not interested in the last
>> + element which should be an end of sequence marker with line = 0
>> + and is_stmt = 1. */
>> + while (a < b)
>> + {
>> + int c = (a + b) / 2;
>> +
>> + if (end < items[c].pc)
>> + b = c;
>> + else
>> + a = c + 1;
>> + }
>> +
>> + a--;
>> + if (items[a].pc != end || items[a].is_stmt)
>> + return;
>> +
>> + /* When there is a sequence of line entries at the same address
>> + where an inline range ends, and the last item has is_stmt = 0,
>> + we force all previous items to have is_stmt = 0 as well.
>> + Setting breakpoints at those addresses is currently not
>> + supported, since it is unclear if the previous addresses are
>> + part of the subroutine or the calling program. */
>> + do
>> + {
>> + /* We stop at the first line entry with a different address,
>> + or when we see an end of sequence marker. */
>> + a--;
>> + if (items[a].pc != end || items[a].line == 0)
>> + break;
>> +
>> + items[a].is_stmt = 0;
>> + }
>> + while (a > 0);
>> +}
>> +
>> +
>> /* Subroutine of end_symtab to simplify it. Look for a subfile that
>> matches the main source file's basename. If there is only one, and
>> if the main source file doesn't have any symbol or line number
>> @@ -956,6 +1017,10 @@ struct compunit_symtab *
>> subfile->line_vector->item
>> + subfile->line_vector->nitems,
>> lte_is_less_than);
>> +
>> + for (int i = 0; i < m_inline_end_vector_nitems; i++)
>> + patch_inline_end_pos (subfile->line_vector,
>> + m_inline_end_vector[i]);
>> }
>>
>> /* Allocate a symbol table if necessary. */
>> diff --git a/gdb/buildsym.h b/gdb/buildsym.h
>> index c768a4c..2845789 100644
>> --- a/gdb/buildsym.h
>> +++ b/gdb/buildsym.h
>> @@ -190,6 +190,8 @@ struct buildsym_compunit
>> void record_line (struct subfile *subfile, int line, CORE_ADDR pc,
>> bool is_stmt);
>>
>> + void record_inline_range_end (CORE_ADDR end);
>> +
>> struct compunit_symtab *get_compunit_symtab ()
>> {
>> return m_compunit_symtab;
>> @@ -397,6 +399,15 @@ struct buildsym_compunit
>>
>> /* Pending symbols that are local to the lexical context. */
>> struct pending *m_local_symbols = nullptr;
>> +
>> + /* Pending inline end range addresses. */
>> + CORE_ADDR * m_inline_end_vector = nullptr;
>> +
>> + /* Number of allocated inline end range addresses. */
>> + int m_inline_end_vector_length = 0;
>> +
>> + /* Number of pending inline end range addresses. */
>> + int m_inline_end_vector_nitems = 0;
>> };
>>
>>
>> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
>> index bcc3116..321de93 100644
>> --- a/gdb/dwarf2/read.c
>> +++ b/gdb/dwarf2/read.c
>> @@ -13527,10 +13527,6 @@ class process_die_scope
>> return false;
>> }
>>
>> - /* Empty range entries have no effect. */
>> - if (range_beginning == range_end)
>> - continue;
>
> I don't think we should be doing this. This is defined quite clearly
> in the DWARF spec as being an empty range. No code is associated with
> this range. As such, it really shouldn't have an impact on how we
> interpret the rest of the DWARF.
>
> Again, I think you're trying too hard to work around GCC's broken
> DWARF output.
>
>> -
>> range_beginning += *base;
>> range_end += *base;
>>
>> @@ -13638,10 +13634,6 @@ class process_die_scope
>> return 0;
>> }
>>
>> - /* Empty range entries have no effect. */
>> - if (range_beginning == range_end)
>> - continue;
>
> Same.
>
>> -
>> range_beginning += *base;
>> range_end += *base;
>>
>> @@ -13681,6 +13673,10 @@ class process_die_scope
>> retval = dwarf2_ranges_process (offset, cu,
>> [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)
>> {
>> + /* Empty range entries have no effect. */
>> + if (range_beginning == range_end)
>> + return;
>
> Same.
>
>> +
>> if (ranges_pst != NULL)
>> {
>> CORE_ADDR lowpc;
>> @@ -13918,6 +13914,7 @@ class process_die_scope
>> struct gdbarch *gdbarch = get_objfile_arch (objfile);
>> struct attribute *attr;
>> struct attribute *attr_high;
>> + bool inlined_subroutine = (die->tag == DW_TAG_inlined_subroutine);
>>
>> attr_high = dwarf2_attr (die, DW_AT_high_pc, cu);
>> if (attr_high)
>> @@ -13933,7 +13930,10 @@ class process_die_scope
>>
>> low = gdbarch_adjust_dwarf2_addr (gdbarch, low + baseaddr);
>> high = gdbarch_adjust_dwarf2_addr (gdbarch, high + baseaddr);
>> - cu->get_builder ()->record_block_range (block, low, high - 1);
>> + if (inlined_subroutine)
>> + cu->get_builder ()->record_inline_range_end (high);
>> + if (low < high)
>> + cu->get_builder ()->record_block_range (block, low, high - 1);
>> }
>> }
>>
>> @@ -13958,6 +13958,10 @@ class process_die_scope
>> end += baseaddr;
>> start = gdbarch_adjust_dwarf2_addr (gdbarch, start);
>> end = gdbarch_adjust_dwarf2_addr (gdbarch, end);
>> + if (inlined_subroutine)
>> + cu->get_builder ()->record_inline_range_end (end);
>> + if (start == end)
>> + return;
>> cu->get_builder ()->record_block_range (block, start, end - 1);
>> blockvec.emplace_back (start, end);
>> });
>> diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
>> index 3733fa7..e3a5793 100644
>> --- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
>> +++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
>> @@ -52,37 +52,20 @@ proc do_test { use_header } {
>> gdb_test "step" ".*" "step into get_alias_set"
>> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
>> "not in inline 1"
>> - # It's possible that this first failure (when not using a header
>> - # file) is GCC's fault, though the remaining failures would best
>> - # be fixed by adding location views support (though it could be
>> - # that some easier heuristic could be figured out). Still, it is
>> - # not certain that the first failure wouldn't also be fixed by
>> - # having location view support, so for now it is tagged as such.
>> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
>> gdb_test "next" ".*TREE_TYPE.*" "next step 1"
>> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
>> "not in inline 2"
>> gdb_test "next" ".*TREE_TYPE.*" "next step 2"
>> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
>> "not in inline 3"
>> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
>> gdb_test "next" ".*TREE_TYPE.*" "next step 3"
>> gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
>> "not in inline 4"
>> - if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
>> gdb_test "next" "return 0.*" "next step 4"
>> gdb_test "bt" \
>> "\\s*\\#0\\s+(main|get_alias_set)\[^\r\]*${srcfile}:.*" \
>> "not in inline 5"
>>
>> - if {!$use_header} {
>> - # With the debug from GCC 10.x (and earlier) GDB is currently
>> - # unable to successfully complete the following tests when we
>> - # are not using a header file.
>> - kfail symtab/25507 "stepping tests"
>> - return
>> - }
>> -
>
> In summary:
>
> If we can convince ourselves that the removal of the special END
> handling block really is OK, then I'm happy for _that_ part of this
> patch to replace my proposed patch. Does this part have to be bundled
> with the second part?
>
> I don't think the rest of the patch should be merged at the moment. I
> think we should wait to see if GCC can fix their broken DWARF
> production. If this issue is resolved, then possibly we could
> maintain this, or some variation of this guarded with a producer
> check, so only known bad versions of GCC are modified. But I don't
> think we should assume that all producers get their range information
> wrong.
>
> Thanks,
> Andrew
>
More information about the Gdb-patches
mailing list