[PATCH] gdb: improve line number lookup around inline functions
Andrew Burgess
aburgess@redhat.com
Wed Dec 4 15:10:07 GMT 2024
Bernd,
Thanks for reviewing this patch. I'm about to post a 2 patch series
which includes an updated version of this patch. I'll link from the new
post to this series, but I wanted to follow up on your review feedback
here so you knew what I've changed, and what I haven't.
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> Hi Andrew,
>
> thanks for working on this.
>
> On 7/26/24 17:32, Andrew Burgess wrote:
>> I wrote this patch while reviewing the following patches:
>>
>> https://inbox.sourceware.org/gdb-patches/AS8P193MB1285C58F6F09502252CEC16FE4DF2@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM
>> https://inbox.sourceware.org/gdb-patches/AS8P193MB12855708DFF59A5309F5B19EE4DF2@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM
>>
>> These two patches fix, as far as I can tell, 3 different issues
>> related to debugging of optimised code around inline functions.
>>
>> One of the issues fixed is having GDB display the correct line number
>> (and corresponding source line) when a frame other than #0 contains an
>> inline function, in some cases this can go wrong.
>>
>> I think that there's a simple fix for this issue that doesn't require
>> all the infrastructure introduced in the two patches linked above, and
>> this is that fix.
>>
>> Rather than use the tests from the above patches, which cover all 3 of
>> the issues that the above patches address, this commit has tests that
>> focus only on the one issue being fixed here.
>>
>> All feedback welcome,
>>
>> Thanks,
>> Andrew
>>
>> ---
>>
>> This commit aims to fix an issue where GDB would report the wrong line
>> for frames other than #0 if a previous frame had just left an inline
>> function.
>>
>> Consider this example which is compiled at -Og:
>>
>> volatile int global = 0;
>>
>> static inline int bar (void) { return 1; }
>>
>> static void foo (int count)
>> { global += count; }
>>
>> int main (void)
>> {
>> foo (bar ());
>> return 0;
>> }
>>
>
> Which gcc version were you using?
>
> Initially I had not been able to reproduce the reported issue
> with this test case. Since I used gcc-12 + gcc-14.
>
> Then I realized that there was something wrong with the debug
> info in gcc-9 + gcc-10 but that was fixed in gcc-11 and following.
>
> The problem is that the inline function is completely optimized
> away, so there is no DW_TAG_inlined_subroutine in the debug info
> but the line numbers from the subroutine are still there, and make
> the problems, however the problem is not solved by your patch.
> Consider this command:
>
> $ gcc -g -Og -fcf-protection=none test.c
> $ gdb a.out
> (gdb) b main
> Breakpoint 1 at 0x113a: file test.c, line 3.
> (gdb) r
> Breakpoint 1, main () at test.c:3
> 3 static inline int bar (void) { return 1; }
> (gdb) s
> foo (count=count@entry=1) at test.c:6
> 6 { global += count; }
> (gdb) bt
> #0 foo (count=count@entry=1) at test.c:6
> #1 0x0000555555555144 in main () at test.c:10
>
>
> So, I have built gdb with your patch, and as you can see, the line table
> is completely bogus, and can not work properly. But why does this
> no longer happen with gcc-11 and following?
>
> Well, the reason is that I fixed this bug already some years ago:
>
> commit 2e6562043c48c0ae6bc9823d438685269eb11aab (HEAD, refs/bisect/new)
> Author: Bernd Edlinger <bernd.edlinger@hotmail.de>
> Date: Mon Dec 7 12:00:00 2020 +0100
>
> Remove misleading debug line entries
>
> This removes gimple_debug_begin_stmts without block info which remain
> after a gimple block originating from an inline function is unused.
>
> The line numbers from these stmts are from the inline function,
> but since the inline function is completely optimized away,
> there will be no DW_TAG_inlined_subroutine so the debugger has
> no callstack available at this point, and therefore those
> line table entries are not helpful to the user.
>
> 2020-12-10 Bernd Edlinger <bernd.edlinger@hotmail.de>
>
> * cfgexpand.c (expand_gimple_basic_block): Remove special handling
> of debug_inline_entries without block info.
> * tree-inline.c (remap_gimple_stmt): Drop debug_nonbind_markers when
> the call statement has no block info.
> (copy_debug_stmt): Remove debug_nonbind_markers when inlining
> and the block info is mapped to NULL.
> * tree-ssa-live.c (clear_unused_block_pointer): Remove
> debug_nonbind_markers originating from removed inline functions.
>
> This is from the gcc git repository, this commit landed in gcc-11 but I
> never considered it appropriate for a back-port to earlier stable branches.
> So with gcc built from this revision onwards, the line table does simply
> no longer contain the line 3 in above test example.
Thanks for digging into this so thoroughly. You are correct that I was
likely doing insufficient testing originally, and using a gcc 9.x
compiler only.
I've got my act in gear, and I'm now testing with 8.0, 8.4.0, 9.3.1,
9.5.0, 10.5.0, 11.5.0, 12.2.0, 13.3.0, and 14.2.0.
This patch does still fix the issue it claims to, even with gcc 14.2.0
(and any post gcc-11 release), we just need to nudge the compiler to
generate the problem code. The change I needed was actually in the test
added in your optimised code branch, gdb.base/empty-inline.c.
In this test the inline function is:
int __attribute__((noinline, noclone))
#ifdef __CET__
__attribute__((nocf_check))
#endif
test1 (int x)
{
asm ("");
return x+1; /* line 31 */
}
I started from here and removed things while the problem still existed,
but due to my limited compiler choice, I ended up deleting the
'asm ("")'. If I keep this then the problem can be reproduced on later
versions of gcc.
> So in comparison to my patch I found this attempt to be more radical,
> and still incomplete as this little experiment shows.
I _think_ you are talking about two different things here. The fix
itself you find radical, but the testing you find incomplete. My next
iteration leaves the fix unchanged, I believe that the argument for that
change is still sound. But the testing has been improved.
> Admittedly my
> patch focuses only on inline-functions that do have block-ranges, and therefore
> would not change the result with a debug-info that is simply inconsistent
> as in this example.
I'm not sure exactly what you are getting at here, but my position is
that even this original version included a C code example (though I
admit that this example depended on using an older gcc version), but the
revised patch I am about to post is updated with a better C code
example. Nothing in the fix presented here depends on inconsistent
DWARF that can only be generated via the DWARF assembler.
That said, there is a discussion to be had about how accurate DWARF
assembler based tests need to be in reflecting actual DWARF which has
been seen in the wild. Sometimes it is OK to focus only on those parts
of the DWARF which are strictly necessary. But other times we do need
to flesh out more of the DWARF. Within reason I'm happy to add more if
you feel doing so is important for an accurate test.
> Regarding the test case there I noticed that it does something that gcc never
> does, namely to not emit a weak line table entry at the end of the inline
> function.
>
>> 7 10 0x0000000000401117 0x0000000000401117 Y
>> 8 3 0x0000000000401117 0x0000000000401117 Y
>> 9 10 0x0000000000401117 0x0000000000401117
>> 10 11 0x0000000000401121 0x0000000000401121 Y
>
> so at the end of the inline function's block-range we always have the
> weak line number from the calling function, that is what find_pc_sect_line
> returns with my patch, but the test case does not emit any line table entries
> at the end of the inline range, therefore the test case is failed with
> my patch, which I think is a bug in the test itself, I can make it
> passed if apply this change:
I believe this is addressed in the updated patch.
> So the test case would be okay with this change applied, but please
> do not fix the find_pc_sect_line in this way, as it would have an effect
> not only on inline functions,
I don't see this as a problem. I don't believe the existing algorithm
is correct, and I think it should be fixed. I don't have any examples
where this change impacts non-inline functions though. Do you have an
example for a case which might be negatively impacted by this change?
> and it would of course conflict with my
> patch,
The good news is that there is no conflict. I pushed a sourceware
branch users/aburgess/gdb-opt-code-debug where your work is merged on
top of this patch and there was no conflict, and all of your tests are
still passing.
> while the issue was resolved by my earlier gcc-patch already 4
> years ago.
The specific _test_ that I included was resolved by changes to gcc 4
years ago. But by using the test from your series the issue is still
present, we just need to work harder to trigger it now.
Thanks,
Andrew
More information about the Gdb-patches
mailing list