Summary: | -exec-next fails in mingw (infrun.c:2794: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed) | ||
---|---|---|---|
Product: | gdb | Reporter: | Dmitry Neverov <dmitry.neverov> |
Component: | gdb | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | brobecker, simon.marchi, ssbssa, tromey |
Priority: | P2 | ||
Version: | HEAD | ||
Target Milestone: | 15.2 | ||
Host: | Target: | ||
Build: | Last reconfirmed: | 2024-06-05 00:00:00 | |
Attachments: |
gdb-13.2 stepping log (works fine)
gdb-14.2 stepping log (crashes) |
Description
Dmitry Neverov
2024-05-10 15:50:54 UTC
Gdb runs on x86_64-w64-mingw32 and attaches to a remote aarch64 target.

If you just start gdb with your executable, and do 'info line TP_5_3_2Character.cpp:106', do you get the same warnings?
And would it be possible to share that executable (or some other simple reproducer)?

When I run 'info line' after loading the binary, there is no warning in 14.2 and master:
(gdb) info line TP_5_3_2Character.cpp:106
Line 106 of "../../Samples/Games/TP/Source/TP_5_3_2\TP_5_3_2Character.cpp"
starts at address 0xfac6dac <_ZN18ATP_5_3_2Character4MoveERK17FInputActionValue+268>
and ends at 0xfac6dc8 <_ZN18ATP_5_3_2Character4MoveERK17FInputActionValue+296>.
(gdb)
> And would it be possible to share that executable (or some other simple reproducer)?
The executable where I get the error is larger than 2 GB, so I'm not sure I can share it. I have been trying to reproduce it on a smaller program, with no success so far. Maybe I can run more commands on the binary and report the results?
You could run with "set debug infrun on" and attach the logs. Also, a backtrace at the point of the crash would be useful.
> If I run 'info line' before -exec-next, in 13.2 it outputs:
>
> (gdb) info line
> Line 106 of "../../Samples/Games/TP/Source/TP_5_3_2\TP_5_3_2Character.cpp"
> starts at address 0x17ad3dac <ATP_5_3_2Character::Move(FInputActionValue
> const&)+268> and ends at 0x17ad3dc8
> <ATP_5_3_2Character::Move(FInputActionValue const&)+296>.
>
> In gdb 14.x and HEAD:
>
> (gdb) info line
> warning: (Internal error: pc 0x800cffe in read in CU, but not in symtab.)
> warning: (Error: pc 0x800cffe in address map, but not in symtab.)
> Line 106 of "../../Samples/Games/TP/Source/TP_5_3_2\TP_5_3_2Character.cpp"
> starts at address 0x17ad3dac <ATP_5_3_2Character::Move(FInputActionValue
> const&)+268> and ends at 0x800cffe.
It's hard to tell if it's related to the -exec-next or not. You could perhaps file a separate bug for this one to get more visibility, and if we realize they are related we can link the two.
Since it looks like a clear regression, you could try to bisect to see which commit introduced the bug. After that, it's easier to poke the author of the commit to see if they can have a look.
>
> In HEAD -exec-next fails a little bit differently:
>
> >47-exec-next --thread 2
> <^done
> <(gdb)
> <47^running
> <*running,thread-id="all"
> <47^error,msg="Protocol error: QThreadEvents (thread-events) conflicting
> enabled responses."
> <(gdb)
This sounds like another bug, related to:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=65c459abebf70bd5a64dcee11d4d7d4a8498465f
I think it would be worth filing a separate bug for this one, with some details about how you reproduce: what is the server (gdbserver or other), what version, logs with "set debug remote 1" enabled.

Created attachment 15529 [details]
gdb-13.2 stepping log (works fine)
Created attachment 15530 [details]
gdb-14.2 stepping log (crashes)
> You could run with "set debug infrun on" and attach the logs. Also, a backtrace
> at the point of the crash would be useful.
Attached. The backtrace on crash in 14.2 is:
resume_1 infrun.c:2794
resume infrun.c:2810
keep_going_pass_signal infrun.c:8557
keep_going infrun.c:8576
process_event_stop_test infrun.c:7752
handle_signal_stop infrun.c:6886
handle_inferior_event infrun.c:6114
fetch_inferior_event infrun.c:4466
inferior_event_handler inf-loop.c:42
remote_async_serial_handler remote.c:14859
run_async_handler_and_reschedule ser-base.c:138
fd_event ser-base.c:189
handle_file_event event-loop.cc:573
gdb_wait_for_event event-loop.cc:716
gdb_do_one_event event-loop.cc:264
start_event_loop main.c:407
captured_command_loop main.c:471
captured_main main.c:1324
gdb_main main.c:1343
main gdb.c:39
Judging by bisect, it was introduced by
commit 1acc9dca423f78e44553928f0de839b618c13766
Author: Tom Tromey <tom@tromey.com>
Date:   Tue Mar 7 17:37:45 2023 -0700
    Change linetables to be objfile-independent

> I think it would be worth filing a separate bug for this one
Filed https://sourceware.org/bugzilla/show_bug.cgi?id=31801

> warning: (Internal error: pc 0x800cffe in read in CU, but not in symtab.)
> warning: (Error: pc 0x800cffe in address map, but not in symtab.)
This is definitely a red flag fwiw.
Normally it means the DWARF indexer is out of sync
with the full reader somehow.
I wonder if this could possibly be tripping over this:
https://inbox.sourceware.org/gdb-patches/20240217-dwarf-race-relocate-v1-7-d3d2d908c1e8@tromey.com/
Just a wild guess, since that's the only issue I've run across in this area lately.
Otherwise I guess we'll need some way to reproduce & then debug gdb.

> Line 106 of "../../Samples/Games/TP/Source/TP_5_3_2\TP_5_3_2Character.cpp" starts at address 0x17ad3dac <ATP_5_3_2Character::Move(FInputActionValue const&)+268> and ends at 0x800cffe.
One thing I notice here is that the start address is
relocated but the end address is not.
That seems extremely peculiar.
I wonder if you could set a breakpoint in find_line_pc_range
and see what's going wrong here.
Looking at find_pc_sect_line I don't really see how
it could happen.
I've added breakpoints in find_line_pc_range, but they are not triggered. I've debugged find_pc_sect_line and noticed 2 things.
1. The line table contains entries with address 0xFFFFFFFFFFFFFFFE (-2). lnp_state_machine::check_line_address() checks address -1 and mentions https://reviews.llvm.org/D81784 in a8caed5d7faa639a1e6769eba551d15d8ddd9510. It looks like lld used value -2 for pre-DWARF-v5:
https://github.com/llvm/llvm-project/commit/e618ccbf431f6730edb6d1467a127c3a52fd57f7#diff-7d58449b03500d25cfeb298e5b0591bba14e8fbcf5bfb899d20dfb8007f38854
It doesn't do that any more:
https://github.com/llvm/llvm-project/commit/004be4037e1e9c6092323c5c9268acb3ecf9176c
Maybe lnp_state_machine::check_line_address should check -2 as well?
2. The linetable_entry::operator<() is not called in symtab.c:3215 (1acc9dca423f78e44553928f0de839b618c13766). It looks like changing the line to
if (best && *item < *last && item->raw_pc () > best->raw_pc ()
    && (best_end == 0 || best_end > item->pc (objfile)))
  best_end = item->pc (objfile);
fixes the crash. I guess the item < last comparison doesn't use the linetable_entry::operator<() intentionally.
I think the assert started to fail because before 1acc9dca423f78e44553928f0de839b618c13766 best_end was compared to what is now called a raw_pc. For entries with address -2, best_end > item->pc was false. Now the comparison is with item->pc (objfile), and for entries with address -2, item->pc (objfile) wraps to a value below best_end, and best_end is updated.
I wonder if it is expected that the best_end can come from an item with a line and a file different than the line and the file in best?

(In reply to Dmitry Neverov from comment #13)
> Maybe lnp_state_machine::check_line_address should check -2 as well?
Can you try it and see if it helps?
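The wrap-around described in comment #13 can be checked against the addresses that 'info line' printed earlier in this report. The following is only an illustrative sketch, not GDB code: core_addr, relocate, passes_best_end_test and check_best_end_model are hypothetical names, core_addr is assumed to be a 64-bit unsigned type like CORE_ADDR on this host, and 0xfac6dac (printed right after loading the binary) is assumed to be the unrelocated counterpart of 0x17ad3dac.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for GDB's CORE_ADDR; assumed here to be a 64-bit unsigned integer.
using core_addr = std::uint64_t;

// Relocation modelled as adding the load offset. Unsigned arithmetic wraps
// modulo 2^64, so a near-maximum "address" lands just below the offset.
constexpr core_addr relocate (core_addr unrelocated, core_addr offset)
{
  return unrelocated + offset;
}

// Simplified model of the last clause of the condition quoted in the thread:
// (best_end == 0 || best_end > item_pc).
constexpr bool passes_best_end_test (core_addr best_end, core_addr item_pc)
{
  return best_end == 0 || best_end > item_pc;
}

// Values taken from the 'info line' output in this report.
constexpr core_addr unrelocated_start = 0xfac6dac;  // assumed unrelocated
constexpr core_addr relocated_start = 0x17ad3dac;   // while attached
constexpr core_addr expected_end = 0x17ad3dc8;      // end reported by 13.2
constexpr core_addr load_offset = relocated_start - unrelocated_start;

// The lld tombstone value -2 seen in the line table.
constexpr core_addr tombstone_minus_2 = static_cast<core_addr> (-2);

// The offset implied by the two start addresses.
static_assert (load_offset == 0x800d000, "offset between the two outputs");

// Relocating -2 gives exactly the bogus end address from the warnings.
static_assert (relocate (tombstone_minus_2, load_offset) == 0x800cffe,
               "-2 + offset wraps to the reported end address");

// Runtime restatement of the before/after behaviour described above.
inline void check_best_end_model ()
{
  // Compared as a raw value, -2 is huge, so best_end is left alone.
  assert (!passes_best_end_test (expected_end, tombstone_minus_2));
  // Compared after relocation, it is small, so best_end would be replaced.
  assert (passes_best_end_test (expected_end,
                                relocate (tombstone_minus_2, load_offset)));
}
```

Under those assumptions the load offset is 0x800d000, and -2 plus that offset is exactly 0x800cffe, which would explain why the start address looks relocated while the end address does not.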
I get no crash if I change the condition in lnp_state_machine::check_line_address to
if ((address == 0 && address < unrelocated_lowpc)
    || address == (CORE_ADDR) -1
    || address == (CORE_ADDR) -2)

I think it would be worthwhile to submit a patch for this, then. The code could have a comment mentioning the clang/lld work here to explain the rationale for that -2.

The master branch has been updated by Dmitrii Neverov <neverov@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e814012b2b108743e21b7ef2799310a0f4e0a86d
commit e814012b2b108743e21b7ef2799310a0f4e0a86d
Author: Dmitry Neverov <dmitry.neverov@jetbrains.com>
Date:   Sat Jun 8 10:41:31 2024 +0200
    Recognize -2 as a tombstone value in .debug_line
    Commit a8caed5d7faa639a1e6769eba551d15d8ddd9510 handled the tombstone value -1 used by lld (https://reviews.llvm.org/D81784). The referenced lld commit also uses the tombstone value -2 for pre-DWARF-v5 (https://github.com/llvm/llvm-project/commit/e618ccbf431f6730edb6d1467a127c3a52fd57f7).
    If not handled, -2 breaks the pc step range calculation and triggers the assertion:
    gdb/infrun.c:2794: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
    This commit adds -2 tombstone value and handles it in the same way as -1.
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31727
    Approved-By: Tom Tromey <tom@tromey.com>

Set the target milestone to 15.2 for this one, as it hides an issue that is caused by a certain stub which has a bug in it where it deviates from the protocol specifications (described in https://sourceware.org/bugzilla/show_bug.cgi?id=31801 ). Due to the _apparent_ regression aspect, we marked that other PR for release 15.2, but in fact there is no fix to be done in that other PR, so marking this one for 15.2 instead.
Not critical for 15.2, but if we can manage to get it in, it will help some users.
Thus, next steps:
* Have a Global Maintainer either approve or reject the backport of this patch to the gdb-15-branch;
* If the backport is approved, cherry-pick the patch on gdb-15-branch, and then close;
  If the backport is rejected, then change the target milestone to 16.1, and then close.

The gdb-15-branch branch has been updated by Dmitrii Neverov <neverov@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9542d1b3a05477c415d870b7b652cdde75a5c8ea
commit 9542d1b3a05477c415d870b7b652cdde75a5c8ea
Author: Dmitry Neverov <dmitry.neverov@jetbrains.com>
Date:   Sat Jun 8 10:41:31 2024 +0200
    Recognize -2 as a tombstone value in .debug_line
    Commit a8caed5d7faa639a1e6769eba551d15d8ddd9510 handled the tombstone value -1 used by lld (https://reviews.llvm.org/D81784). The referenced lld commit also uses the tombstone value -2 for pre-DWARF-v5 (https://github.com/llvm/llvm-project/commit/e618ccbf431f6730edb6d1467a127c3a52fd57f7).
    If not handled, -2 breaks the pc step range calculation and triggers the assertion:
    gdb/infrun.c:2794: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
    This commit adds -2 tombstone value and handles it in the same way as -1.
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31727
    Cherry-picked from e814012b2b108743e21b7ef2799310a0f4e0a86d
    Approved-By: Tom Tromey <tom@tromey.com>

closing, now that the patch has been backported
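For readers who want to see the tombstone test in isolation, here is a minimal standalone sketch of the condition quoted in the thread. It is not the code from the commit: keep_line_address and core_addr are hypothetical names, core_addr is assumed to be a 64-bit unsigned stand-in for CORE_ADDR, and the real lnp_state_machine::check_line_address is a member function with other responsibilities that are not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for GDB's CORE_ADDR; assumed here to be a 64-bit unsigned integer.
using core_addr = std::uint64_t;

// Hypothetical free-function version of the condition from the thread.
// Returns true if a line-table address looks usable, and false if it looks
// like the mark of a section that the linker removed:
//   - address 0 when the CU's low pc is above 0,
//   - the tombstone value -1,
//   - the tombstone value -2 that lld used for pre-DWARF-v5 output.
constexpr bool keep_line_address (core_addr address,
                                  core_addr unrelocated_lowpc)
{
  const bool removed_by_linker
    = ((address == 0 && address < unrelocated_lowpc)
       || address == static_cast<core_addr> (-1)
       || address == static_cast<core_addr> (-2));
  return !removed_by_linker;
}

// Compile-time checks; the low pc value 0x1000 is made up for illustration.
static_assert (keep_line_address (0xfac6dac, 0x1000), "ordinary address");
static_assert (!keep_line_address (0, 0x1000), "zero below low pc");
static_assert (keep_line_address (0, 0), "zero is fine when low pc is zero");
static_assert (!keep_line_address (static_cast<core_addr> (-1), 0x1000),
               "tombstone -1");
static_assert (!keep_line_address (static_cast<core_addr> (-2), 0x1000),
               "tombstone -2");
```

Rejecting such entries before they reach the line table means they never take part in the best_end computation, which matches the report that the crash goes away with this condition.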