Our customer encountered an internal GDB failure:

0x08003384 in _idle_thread (p=0x0) at ./ChibiOS_20.3.3/os/rt/src/chsys.c:72
../../gdb-10.1/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) [answered Y; input not from terminal]

This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

../../gdb-10.1/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) [answered Y; input not from terminal]
Hi Elmot,

Would it be possible to provide a reproducer? It is otherwise a bit difficult to investigate.

Simon
Hi Simon,

We've already asked the customer to provide more details here. Hopefully they will do that. At the moment I can only say that the host machine is Linux, the target is a bare-metal 32-bit ARM Cortex-M4, the gdbserver is OpenOCD, and the application contains some C++ code plus the ChibiOS RTOS.
Hi,
I'm the person who encountered the issue.
This is the core dump file that was created. The file is too large to be directly attached.

https://www.dropbox.com/s/qjym1bxckm480x5/core.gdb.1000.2ce4d3ac1c104df6992f66f6a01ebcbe.410030.1618625947000000000000.lz4?dl=0

I'm not familiar with the process of reporting a bug.
Please tell me if there's more I can do.

gcc-arm-none-eabi version is 9.3.1
(In reply to Ethan Zhang from comment #3)
> Hi,
> I'm the person who encountered the issue.
> This is the core dump file that was created. The file is too large to be
> directly attached.
>
> https://www.dropbox.com/s/qjym1bxckm480x5/core.gdb.1000.2ce4d3ac1c104df6992f66f6a01ebcbe.410030.1618625947000000000000.lz4?dl=0
>
> I'm not familiar with the process of reporting a bug.
> Please tell me if there's more I can do.
>
> gcc-arm-none-eabi version is 9.3.1

Hi,

A core dump without the corresponding binary is not of much use.

The ideal way is to provide a source file and some commands to compile that source file and reproduce the bug in GDB. This way, we can see for ourselves which steps lead up to the bug. If needed, the source file can make use of ChibiOS if that's the only way you can reproduce it.

If reproducing with a source file is not possible, then you could provide an already compiled binary with the instructions to get to the crash.

If that is not possible, then you could provide a backtrace of GDB at the point of the crash; that's better than nothing.

Simon
Note that this bug is probably already fixed on trunk, because that assert was removed by the "quick" simplification patches that landed today.

If a fix is needed for 10.x, the assert can probably just be removed; or if not, it can be replaced with a loop that walks upward until it finds a psymtab where user==null.
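To make the "walk upward" idea concrete, here is a minimal sketch. It is not GDB code: psymtab_model is a hypothetical stand-in that keeps only a user pointer, assumed here to point at the including psymtab and to be null for an outermost one. The helper name outermost is also made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for GDB's partial_symtab; only the field that
// matters for this discussion is modeled.
struct psymtab_model
{
  const char *filename;
  psymtab_model *user;   // includer; null for an outermost psymtab
};

// Walk upward through the user links until reaching a psymtab whose
// user is null, and return that outermost psymtab.
psymtab_model *
outermost (psymtab_model *pst)
{
  while (pst->user != nullptr)
    pst = pst->user;
  return pst;
}
```

With a chain such as nested -> imported -> cu, calling outermost on any of the three returns cu, so a caller that previously asserted user == NULL could instead continue with the returned psymtab.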
Currently it seems I can only reproduce it within CLion, the IDE I'm using.

The internal failure happens when I compile the source file using arm-none-eabi-gcc 9 and debug it with GDB 10 connected to OpenOCD. Every attempt to set a breakpoint using the IDE triggers the internal GDB failure.

With the exact same setup, directly setting a breakpoint from a GDB terminal does not cause the failure. Even when debugging within the IDE, setting a breakpoint from its own GDB terminal doesn't cause the internal failure.

I'm not sure where to go from here, as I don't know what's different between me setting a breakpoint from the command line and the IDE setting a breakpoint.
(In reply to Ethan Zhang from comment #6)
> Currently it seems I can only reproduce it within "CLion" the IDE I'm using.

In CLion, there should be a way to get the "MI traces" or something like that, the log of the communication between CLion and GDB. I've never used CLion, so I can't really help you there. But if you can get that, it would be helpful to reproduce the problem. It will allow replaying the same steps, but without CLion.

>
> The internal failure happens when I compile the source file using
> arm-none-eabi-gcc 9 and debug it with GDB 10 connected to openOCD. Every
> time I attempt to set a break point using the IDE triggers the internal GDB
> failure.
>
> With the exact same set up, directly setting a break point from a GDB
> terminal does not cause the failure. Even when debugging within the IDE,
> setting a break point from its own GDB terminal doesn't cause the internal
> failure.
>
> I'm not sure where to go from here as I don't know what's different between
> me setting a break point from command line and the IDE setting a break point.

Since the bug happens while expanding a symtab, perhaps this could reproduce the problem:

1. Load your binary in GDB
2. Use the command "maintenance expand-symtabs".
(In reply to Tom Tromey from comment #5)
> Note that this bug is probably already fixed on trunk,
> because that assert was removed by the "quick" simplification
> patches that landed today.
>
> If a fix is needed for 10.x, probably the assert can just
> be removed; or if not, it can be replaced with a loop
> that walks upward until it find a psymtab where user==null.

I don't know enough about this to make a call like that, so I trust you here.
There are two code paths that hit this. One is via completion, the other via linespecs and collect_symtabs_from_filename. I would guess something like "break file.c:7" could trigger this, for the appropriate filename. It's a bit hard to be certain; an MI trace would definitely show the problem.
(In reply to Tom Tromey from comment #9)
> There are two code paths that hit this. One is via completion,
> the other via linespecs and collect_symtabs_from_filename.
> I would guess something like "break file.c:7" could trigger this,
> for the appropriate filename. It's a bit hard to be certain;
> an MI trace would definitely show the problem.

Ok, so disregard my suggestion then :)
I have found it to be related to link time optimization and setting a break point with absolute path.

Compile a piece of source code with gcc9 using link time optimization, for example:
g++-9 -ggdb -flto test.cpp -o test

Load "test" into GDB 10, then set break point using absolute path:
break /path/to/test:(line number)

That should be enough to reproduce the issue.

And it seems to only appear when the program is compiled with gcc9 with lto, debug using gdb10 and have a break point set in absolute path.

The content of the source code doesn't seem to make a difference.
If the source code is compiled in gcc10 with lto, this issue doesn't appear.
If the break point is set through relative path, this issue doesn't appear.
I think I have that GDB MI transcript, but it contains some of Ethan's personal information, such as file paths.

Before I upload it here I need Ethan Zhang's explicit permission. That's my company's policy, sorry.

Ethan?
(In reply to Elmot from comment #12)
> I think I have that GDB MI transcript but there is some Ethan's personal
> information like file paths.
>
> Before I upload it here I need Ethan Zhang's explicit permission to upload
> the transcript here. That's my company approach, sorry.
>
> Ethan?

If it's just file paths, I'm ok with it.
Created attachment 13379 [details] GDB MI transcript
(In reply to Ethan Zhang from comment #11)
> I have found it to be related to link time optimization and setting a break
> point with absolute path.
>
> Compile a piece of source code with gcc9 using link time optimization, for
> example:
> g++-9 -ggdb -flto test.cpp -o test
>
> Load "test" into GDB 10, then set break point using absolute path:
> break /path/to/test:(line number)
>
> That should be enough to reproduce the issue.
>
> And it seems to only appear when the program is compiled with gcc9 with lto,
> debug using gdb10 and have a break point set in absolute path.
>
> The content of the source code doesn't seem to make a difference.
> If the source code is compiled in gcc10 with lto, this issue doesn't appear.
> If the break point is set through relative path, this issue doesn't appear.

Oh, that triggers the bug for me indeed! Here's my full setup:

- Ubuntu 20.04
- Package gcc-arm-none-eabi (arm-none-eabi-gcc (15:9-2019-q4-0ubuntu1) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599])
- Source file:

---8<---
extern "C" void _exit(int code)
{
  for (;;);
}

int main(void)
{
  return 0;
}
--->8---

- Compiler command: arm-none-eabi-g++ -ggdb -flto test.cpp -o test
- GDB command:

$ ./gdb -nx -q --data-directory=data-directory ~/test -ex "b /home/smarchi/test.cpp:7"
Reading symbols from /home/smarchi/test...
/home/smarchi/src/binutils-gdb/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

I'll attach the binary shortly.

And indeed, I don't see the problem with master.
Created attachment 13380 [details] Binary from comment 15
CLion ticket, just for reference https://youtrack.jetbrains.com/issue/CPP-24962
Simon, Tom,

since the problem is fixed in your master, is there any known patch that we can apply or revert to fix our custom-built GDB 10?

And when is that master fix going to reach a release?
(In reply to Elmot from comment #18)
> since the problem is fixed in your master, is there any known patch which we
> can apply or revert to fix our custom-built Gdb 10?

It's probably not too easy. The immediate patch is 536a40f3a8d, and there's a series leading up to that. But, there are probably other earlier patches that you would also need, like at least some parts of the series including 9b99dcc8dbc.

> And when that master fix is going to hit release?

These patches will only appear in 11.1.

Maybe we ought to try to fix this one in a more direct way for 10.x.
I have a fix but haven't attempted to write a test case.

diff --git a/gdb/psymtab.c b/gdb/psymtab.c
index 23eed6bc1c6..71b02caae51 100644
--- a/gdb/psymtab.c
+++ b/gdb/psymtab.c
@@ -127,9 +127,10 @@ partial_map_expand_apply (struct objfile *objfile,
 {
   struct compunit_symtab *last_made = objfile->compunit_symtabs;
 
-  /* Shared psymtabs should never be seen here.  Instead they should
-     be handled properly by the caller.  */
-  gdb_assert (pst->user == NULL);
+  /* We may see a shared psymtab here, but we want to expand the
+     outermost symtab.  */
+  while (pst->user != nullptr)
+    pst = pst->user;
 
   /* Don't visit already-expanded psymtabs.  */
   if (pst->readin_p (objfile))
https://sourceware.org/pipermail/gdb-patches/2021-April/178119.html
Hello,

Thanks Tom for having marked this as targeting 10.2.

Given that we are planning on creating the 10.2 release tomorrow, or Sunday at the latest, I think we have unfortunately run out of time. Looking at this PR, my recommendation would be to either:

- Work around the issue by building the program with different compilation options, while we work on releasing GDB 11, which will have the fix;
- Build GDB from a recent version from the master branch;
- Build a custom GDB using the gdb-10-branch after having included Tom's patch.

For info, once GDB 10.2 is out, my plan is to start the GDB 11 release process right after. From past experience, it usually takes a couple of months for things to stabilize before we create the first release. It could take longer, but it should remain a question of months.
The submit/pr-27743-psym-fix branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e7d77ce0c408e7019f9885b8be64c9cdb46dd312

commit e7d77ce0c408e7019f9885b8be64c9cdb46dd312
Author: Tom Tromey <tromey@adacore.com>
Date:   Fri Apr 23 11:28:48 2021 -0600

    Fix crash when expanding partial symtabs with DW_TAG_imported_unit

    PR gdb/27743 points out a gdb crash when expanding partial symtabs,
    where one of the compilation units uses DW_TAG_imported_unit.

    The bug is that partial_map_expand_apply expects only to be called for
    the outermost psymtab. However, filename searching doesn't (and
    probably shouldn't) guarantee this. The fix is to walk upward to find
    the outermost CU.

    A new test case is included. It is mostly copied from other test
    cases, which really sped up the effort.

    This bug does not occur on trunk. There,
    psym_map_symtabs_matching_filename is gone, replaced by
    psymbol_functions::expand_symtabs_matching. When this find a match,
    it calls psymtab_to_symtab, which does this same upward walk.

    Tested on x86-64 Fedora 32.

    I propose checking in this patch on the gdb-10 branch, and just the
    new test case on trunk.

    gdb/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>

            PR gdb/27743:
            * psymtab.c (partial_map_expand_apply): Expand outermost psymtab.

    gdb/testsuite/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>

            PR gdb/27743:
            * gdb.dwarf2/imported-unit-bp.exp: New file.
            * gdb.dwarf2/imported-unit-bp-main.c: New file.
            * gdb.dwarf2/imported-unit-bp-alt.c: New file.
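The situation described in this commit message can be modeled roughly as follows. This is a toy sketch, not GDB's implementation: the unit type, its fields, and both function names are hypothetical. It only illustrates the assumption that a search by file name can land on a unit that is included by another one, and that expansion should then apply to the outermost includer, once.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical type, far simpler than GDB's
// partial_symtab machinery.
struct unit
{
  std::string filename;
  unit *user = nullptr;    // includer; null for an outermost unit
  bool expanded = false;   // stands in for "already read in"
};

// Return every unit whose file name matches NAME.  Nothing here
// guarantees that a match is an outermost unit.
std::vector<unit *>
find_by_filename (const std::vector<unit *> &units, const std::string &name)
{
  std::vector<unit *> matches;
  for (unit *u : units)
    if (u->filename == name)
      matches.push_back (u);
  return matches;
}

// Expand the outermost unit that (transitively) includes U.  Returns
// true if an expansion happened, false if it was already expanded.
bool
expand_outermost (unit *u)
{
  while (u->user != nullptr)
    u = u->user;
  if (u->expanded)
    return false;
  u->expanded = true;
  return true;
}
```

In this model, a match whose user is non-null is exactly the case the old assertion rejected. Walking up first means the outermost unit is the one marked as expanded, and a second match leading to the same outermost unit is skipped.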
The gdb-10-branch branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e7d77ce0c408e7019f9885b8be64c9cdb46dd312

commit e7d77ce0c408e7019f9885b8be64c9cdb46dd312
Author: Tom Tromey <tromey@adacore.com>
Date:   Fri Apr 23 11:28:48 2021 -0600

    Fix crash when expanding partial symtabs with DW_TAG_imported_unit

    PR gdb/27743 points out a gdb crash when expanding partial symtabs,
    where one of the compilation units uses DW_TAG_imported_unit.

    The bug is that partial_map_expand_apply expects only to be called for
    the outermost psymtab. However, filename searching doesn't (and
    probably shouldn't) guarantee this. The fix is to walk upward to find
    the outermost CU.

    A new test case is included. It is mostly copied from other test
    cases, which really sped up the effort.

    This bug does not occur on trunk. There,
    psym_map_symtabs_matching_filename is gone, replaced by
    psymbol_functions::expand_symtabs_matching. When this find a match,
    it calls psymtab_to_symtab, which does this same upward walk.

    Tested on x86-64 Fedora 32.

    I propose checking in this patch on the gdb-10 branch, and just the
    new test case on trunk.

    gdb/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>

            PR gdb/27743:
            * psymtab.c (partial_map_expand_apply): Expand outermost psymtab.

    gdb/testsuite/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>

            PR gdb/27743:
            * gdb.dwarf2/imported-unit-bp.exp: New file.
            * gdb.dwarf2/imported-unit-bp-main.c: New file.
            * gdb.dwarf2/imported-unit-bp-alt.c: New file.
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e8b6c1da565c93f015d93a4f8554830118e9bd07

commit e8b6c1da565c93f015d93a4f8554830118e9bd07
Author: Tom Tromey <tromey@adacore.com>
Date:   Mon Apr 26 09:53:32 2021 -0600

    Add test case for gdb 10 crash

    PR gdb/27743 points out a gdb crash when expanding partial symtabs,
    where one of the compilation units uses DW_TAG_imported_unit.

    This crash happens for gdb 10, but not git trunk. This patch pulls
    over the new test case only.

    gdb/testsuite/ChangeLog
    2021-04-26  Tom Tromey  <tromey@adacore.com>

            PR gdb/27743:
            * gdb.dwarf2/imported-unit-bp.exp: New file.
            * gdb.dwarf2/imported-unit-bp-main.c: New file.
            * gdb.dwarf2/imported-unit-bp-alt.c: New file.
I've checked in the fix. It isn't in the 10.2 release, but I did put the fix on the gdb 10 branch. There probably won't be a 10.3, but it's there in case someone wants to do their own build from git, or pick the patch into their local repo. git trunk doesn't have the bug, but I've put the test case there. I've reset the target milestone to reflect that this isn't really in 10.2.