Bug 27743 - Internal error psymtab.c
Summary: Internal error psymtab.c
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: 10.1
Importance: P2 normal
Target Milestone: 11.1
Assignee: Tom Tromey
Reported: 2021-04-16 12:57 UTC by Elmot
Modified: 2021-04-26 15:55 UTC
CC List: 5 users

Attachments
GDB MI transcript (1.77 KB, text/plain), 2021-04-19 14:41 UTC, Elmot
Binary from comment 15 (11.88 KB, application/x-executable), 2021-04-19 14:46 UTC, Simon Marchi

Description Elmot 2021-04-16 12:57:39 UTC
Our customer encountered an internal GDB failure:

0x08003384 in _idle_thread (p=0x0) at ./ChibiOS_20.3.3/os/rt/src/chsys.c:72 
../../gdb-10.1/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session?
(y or n) [answered Y; input not from terminal]
This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

../../gdb-10.1/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB?(y or n) [answered Y; input not from terminal]
Comment 1 Simon Marchi 2021-04-16 14:09:14 UTC
Hi Elmot,

Would it be possible to provide a reproducer?  It is otherwise a bit difficult to investigate.

Simon
Comment 2 Elmot 2021-04-16 14:31:55 UTC
Hi, Simon, 

We've already asked the customer to provide more details here. Hopefully they will do that.

At the moment I can only say that the host machine is Linux, the target is a bare-metal 32-bit ARM Cortex-M4, the GDB server is OpenOCD, and the application contains some C++ code plus the ChibiOS RTOS.
Comment 3 Ethan Zhang 2021-04-17 03:04:41 UTC
Hi,
I'm the person who encountered the issue.
This is the core dump file that was created. The file is too large to be directly attached.

https://www.dropbox.com/s/qjym1bxckm480x5/core.gdb.1000.2ce4d3ac1c104df6992f66f6a01ebcbe.410030.1618625947000000000000.lz4?dl=0

I'm not familiar with the process of reporting a bug.
Please tell me if there's more I can do.

gcc-arm-none-eabi version is 9.3.1
Comment 4 Simon Marchi 2021-04-17 13:06:51 UTC
(In reply to Ethan Zhang from comment #3)
> Hi,
> I'm the person who encountered the issue.
> This is the core dump file that was created. The file is too large to be
> directly attached.
> 
> https://www.dropbox.com/s/qjym1bxckm480x5/core.gdb.1000.
> 2ce4d3ac1c104df6992f66f6a01ebcbe.410030.1618625947000000000000.lz4?dl=0
> 
> I'm not familiar with the process of reporting a bug.
> Please tell me if there's more I can do.
> 
> gcc-arm-none-eabi version is 9.3.1

Hi,

A core dump without the corresponding binary is not of much use.

The ideal way is to provide a source file and some commands to compile that source file and reproduce the bug in GDB.  This way, we can see for ourselves the steps that lead up to the bug.  If needed, the source file can make use of ChibiOS if that's the only way you can reproduce it.

If reproducing with a source file is not possible, then you could provide an already compiled binary with the instructions to get to the crash.

If that is not possible, then you could provide a backtrace of GDB at the point of the crash; that's better than nothing.

Simon
Comment 5 Tom Tromey 2021-04-17 15:46:28 UTC
Note that this bug is probably already fixed on trunk,
because that assert was removed by the "quick" simplification
patches that landed today.

If a fix is needed for 10.x, probably the assert can just
be removed; or if not, it can be replaced with a loop
that walks upward until it finds a psymtab where user==null.
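
A minimal sketch of that second option, for readers unfamiliar with the psymtab layout. It assumes a hypothetical `toy_psymtab` struct that models only the `user` link named in the failed assertion; the struct and the `outermost` helper are illustrative names, not GDB's real types or API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GDB's partial_symtab.  Only the "user"
// link mentioned in the assertion is modelled here.
struct toy_psymtab
{
  std::string filename;
  toy_psymtab *user = nullptr;  // the psymtab that includes this one, if any
};

// Walk upward through the "user" links until reaching a psymtab
// that has no user, and return it.
toy_psymtab *
outermost (toy_psymtab *pst)
{
  while (pst->user != nullptr)
    pst = pst->user;
  return pst;
}
```

Calling `outermost` on a psymtab that already has no user returns that same psymtab, so the loop is harmless in the case the assertion used to cover.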
Comment 6 Ethan Zhang 2021-04-18 04:56:33 UTC
Currently it seems I can only reproduce it within CLion, the IDE I'm using.

The internal failure happens when I compile the source file using arm-none-eabi-gcc 9 and debug it with GDB 10 connected to OpenOCD. Every attempt to set a breakpoint through the IDE triggers the internal GDB failure.

With the exact same setup, setting a breakpoint directly from a GDB terminal does not cause the failure. Even when debugging within the IDE, setting a breakpoint from its own GDB terminal doesn't cause the internal failure.

I'm not sure where to go from here, as I don't know what differs between setting a breakpoint from the command line and the IDE setting one.
Comment 7 Simon Marchi 2021-04-19 00:56:13 UTC
(In reply to Ethan Zhang from comment #6)
> Currently it seems I can only reproduce it within "CLion" the IDE I'm using.

In CLion, there should be a way to get the "MI traces" or something like that, the log of the communication between CLion and GDB.  I've never used CLion, so I can't really help you there.  But if you can get that, it would be helpful to reproduce the problem.  It will allow replaying the same steps, but without CLion.

> 
> The internal failure happens when I compile the source file using
> arm-none-eabi-gcc 9 and debug it with GDB 10 connected to openOCD. Every
> time I attempt to set a break point using the IDE triggers the internal GDB
> failure.
> 
> With the exact same set up, directly setting a break point from a GDB
> terminal does not cause the failure. Even when debugging within the IDE,
> setting a break point from its own GDB terminal doesn't cause the internal
> failure.
> 
> I'm not sure where to go from here as I don't know what's different between
> me setting a break point from command line and the IDE setting a break point.

Since the bug happens while expanding a symtab, perhaps this would reproduce the problem:

1. Load your binary in GDB
2. Use the command "maintenance expand-symtabs".
Comment 8 Simon Marchi 2021-04-19 00:56:37 UTC
(In reply to Tom Tromey from comment #5)
> Note that this bug is probably already fixed on trunk,
> because that assert was removed by the "quick" simplification
> patches that landed today.
> 
> If a fix is needed for 10.x, probably the assert can just
> be removed; or if not, it can be replaced with a loop
> that walks upward until it find a psymtab where user==null.

I don't know enough about this to make a call like that, so I trust you here.
Comment 9 Tom Tromey 2021-04-19 01:51:43 UTC
There are two code paths that hit this.  One is via completion,
the other via linespecs and collect_symtabs_from_filename.
I would guess something like "break file.c:7" could trigger this,
for the appropriate filename.  It's a bit hard to be certain;
an MI trace would definitely show the problem.
Comment 10 Simon Marchi 2021-04-19 01:54:43 UTC
(In reply to Tom Tromey from comment #9)
> There are two code paths that hit this.  One is via completion,
> the other via linespecs and collect_symtabs_from_filename.
> I would guess something like "break file.c:7" could trigger this,
> for the appropriate filename.  It's a bit hard to be certain;
> an MI trace would definitely show the problem.

Ok, so disregard my suggestion then :)
Comment 11 Ethan Zhang 2021-04-19 11:55:56 UTC
I have found it to be related to link-time optimization and setting a breakpoint with an absolute path.

Compile a piece of source code with gcc9 using link time optimization, for example:
    g++-9 -ggdb -flto test.cpp -o test

Load "test" into GDB 10, then set a breakpoint using the absolute path of the source file:
    break /path/to/test.cpp:(line number)

That should be enough to reproduce the issue.

It seems to appear only when the program is compiled with gcc 9 with LTO, debugged using GDB 10, and the breakpoint is set using an absolute path.

The content of the source code doesn't seem to make a difference.
If the source code is compiled with gcc 10 with LTO, this issue doesn't appear.
If the breakpoint is set using a relative path, this issue doesn't appear.
Comment 12 Elmot 2021-04-19 14:04:54 UTC
I think I have that GDB MI transcript, but it contains some of Ethan's personal information, such as file paths.

Before I upload it here, I need Ethan Zhang's explicit permission. That's my company's policy, sorry.

Ethan?
Comment 13 Ethan Zhang 2021-04-19 14:23:24 UTC
(In reply to Elmot from comment #12)
> I think I have that GDB MI transcript but there is some Ethan's personal
> information like file paths. 
> 
> Before I upload it here I need Ethan Zhang's explicit permission to upload
> the transcript here. That's my company approach, sorry.
> 
> Ethan?

If it's just file paths, I'm ok with it.
Comment 14 Elmot 2021-04-19 14:41:16 UTC
Created attachment 13379
GDB MI transcript
Comment 15 Simon Marchi 2021-04-19 14:45:59 UTC
(In reply to Ethan Zhang from comment #11)
> I have found it to be related to link time optimization and setting a break
> point with absolute path.
> 
> Compile a piece of source code with gcc9 using link time optimization, for
> example:
>     g++-9 -ggdb -flto test.cpp -o test
> 
> Load "test" into GDB 10, then set break point using absolute path:
>     break /path/to/test:(line number)
> 
> That should be enough to reproduce the issue.
> 
> And it seems to only appear when the program is compiled with gcc9 with lto,
> debug using gdb10 and have a break point set in absolute path.
> 
> The content of the source code doesn't seem to make a difference.
> If the source code is compiled in gcc10 with lto, this issue doesn't appear.
> If the break point is set through relative path, this issue doesn't appear.

Oh, that triggers the bug for me indeed!  Here's my full setup:

- Ubuntu 20.04
- Package gcc-arm-none-eabi (arm-none-eabi-gcc (15:9-2019-q4-0ubuntu1) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599])
- Source file:

---8<---
extern "C" void _exit(int code)
{
  for (;;);
}

int main(void) {
    return 0;
}
--->8---

- Compiler command: arm-none-eabi-g++ -ggdb -flto test.cpp -o test
- GDB command:

$ ./gdb -nx -q --data-directory=data-directory ~/test -ex "b /home/smarchi/test.cpp:7"
Reading symbols from /home/smarchi/test...
/home/smarchi/src/binutils-gdb/gdb/psymtab.c:132: internal-error: bool partial_map_expand_apply(objfile*, const char*, const char*, partial_symtab*, gdb::function_view<bool(symtab*)>): Assertion `pst->user == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) 

I'll attach the binary shortly.

And indeed, I don't see the problem with master.
Comment 16 Simon Marchi 2021-04-19 14:46:41 UTC
Created attachment 13380
Binary from comment 15
Comment 17 Elmot 2021-04-23 12:03:05 UTC
CLion ticket, just for reference: https://youtrack.jetbrains.com/issue/CPP-24962
Comment 18 Elmot 2021-04-23 12:54:06 UTC
Simon, Tom,

since the problem is fixed in your master, is there any known patch which we can apply or revert to fix our custom-built GDB 10?

And when is that master fix going to reach a release?
Comment 19 Tom Tromey 2021-04-23 15:27:01 UTC
(In reply to Elmot from comment #18)

> since the problem is fixed in your master, is there any known patch which we
> can apply or revert to fix our custom-built Gdb 10?

It's probably not too easy.
The immediate patch is 536a40f3a8d, and there's a series leading
up to that.
But, there are probably other earlier patches that you would also need,
like at least some parts of the series including 9b99dcc8dbc.

> And when that master fix is going to hit release?

These patches will only appear in 11.1.

Maybe we ought to try to fix this one in a more direct way for 10.x.
Comment 20 Tom Tromey 2021-04-23 15:34:42 UTC
I have a fix but haven't attempted to write a test case.

diff --git a/gdb/psymtab.c b/gdb/psymtab.c
index 23eed6bc1c6..71b02caae51 100644
--- a/gdb/psymtab.c
+++ b/gdb/psymtab.c
@@ -127,9 +127,10 @@ partial_map_expand_apply (struct objfile *objfile,
 {
   struct compunit_symtab *last_made = objfile->compunit_symtabs;
 
-  /* Shared psymtabs should never be seen here.  Instead they should
-     be handled properly by the caller.  */
-  gdb_assert (pst->user == NULL);
+  /* We may see a shared psymtab here, but we want to expand the
+     outermost symtab.  */
+  while (pst->user != nullptr)
+    pst = pst->user;
 
   /* Don't visit already-expanded psymtabs.  */
   if (pst->readin_p (objfile))
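
To illustrate why the loop is needed, here is a toy model of a filename search that can land on a shared psymtab. Everything in it is an assumption made for illustration: `toy_psymtab`, `expand_outermost`, and `expand_matching` are hypothetical names, and the `readin` flag only loosely mirrors the "already expanded" check visible in the diff. It is not GDB's real code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a partial symtab.
struct toy_psymtab
{
  std::string filename;
  toy_psymtab *user = nullptr;  // including psymtab, if this one is shared
  bool readin = false;          // true once the full symtab was expanded
};

// Expand the outermost psymtab reachable from PST.  Returns true only
// if this call performed the expansion.
bool
expand_outermost (toy_psymtab *pst)
{
  // A shared psymtab may be passed in; climb to its outermost user.
  while (pst->user != nullptr)
    pst = pst->user;

  // Don't expand the same psymtab twice.
  if (pst->readin)
    return false;

  pst->readin = true;
  return true;
}

// A filename search visits every psymtab whose name matches, whether
// or not it is shared.  Returns the number of expansions performed.
int
expand_matching (std::vector<toy_psymtab *> &all, const std::string &name)
{
  int expanded = 0;
  for (toy_psymtab *pst : all)
    if (pst->filename == name && expand_outermost (pst))
      ++expanded;
  return expanded;
}
```

In this model, a match on a shared psymtab expands its including psymtab instead of tripping an assertion, and a repeated search for the same name expands nothing further.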
Comment 22 Joel Brobecker 2021-04-23 18:49:16 UTC
Hello,

Thanks Tom for having marked this as targeting 10.2.

Given that we are planning on creating the 10.2 release tomorrow, or Sunday at the latest, I think we have unfortunately run out of time. Looking at this PR, my recommendation would be to either:
  - Work around the issue by building the program with different compilation options, while we work on releasing GDB 11, which will have the fix;
  - Build GDB from a recent version from the master branch;
  - Build a custom GDB using the gdb-10-branch after having included Tom's patch.

For info, once GDB 10.2 is out, my plan is to start the GDB 11 release process right after. From past experience, it usually takes a couple of months for things to stabilize before we create the first release. It could take longer, but it should remain a question of months.
Comment 23 Sourceware Commits 2021-04-26 15:39:58 UTC
The submit/pr-27743-psym-fix branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e7d77ce0c408e7019f9885b8be64c9cdb46dd312

commit e7d77ce0c408e7019f9885b8be64c9cdb46dd312
Author: Tom Tromey <tromey@adacore.com>
Date:   Fri Apr 23 11:28:48 2021 -0600

    Fix crash when expanding partial symtabs with DW_TAG_imported_unit
    
    PR gdb/27743 points out a gdb crash when expanding partial symtabs,
    where one of the compilation units uses DW_TAG_imported_unit.
    
    The bug is that partial_map_expand_apply expects only to be called for
    the outermost psymtab.  However, filename searching doesn't (and
    probably shouldn't) guarantee this.  The fix is to walk upward to find
    the outermost CU.
    
    A new test case is included.  It is mostly copied from other test
    cases, which really sped up the effort.
    
    This bug does not occur on trunk.  There,
    psym_map_symtabs_matching_filename is gone, replaced by
    psymbol_functions::expand_symtabs_matching.  When this find a match,
    it calls psymtab_to_symtab, which does this same upward walk.
    
    Tested on x86-64 Fedora 32.
    
    I propose checking in this patch on the gdb-10 branch, and just the
    new test case on trunk.
    
    gdb/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>
    
            PR gdb/27743:
            * psymtab.c (partial_map_expand_apply): Expand outermost psymtab.
    
    gdb/testsuite/ChangeLog
    2021-04-23  Tom Tromey  <tromey@adacore.com>
    
            PR gdb/27743:
            * gdb.dwarf2/imported-unit-bp.exp: New file.
            * gdb.dwarf2/imported-unit-bp-main.c: New file.
            * gdb.dwarf2/imported-unit-bp-alt.c: New file.
Comment 24 Sourceware Commits 2021-04-26 15:40:53 UTC
The gdb-10-branch branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e7d77ce0c408e7019f9885b8be64c9cdb46dd312

Comment 25 Sourceware Commits 2021-04-26 15:53:47 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e8b6c1da565c93f015d93a4f8554830118e9bd07

commit e8b6c1da565c93f015d93a4f8554830118e9bd07
Author: Tom Tromey <tromey@adacore.com>
Date:   Mon Apr 26 09:53:32 2021 -0600

    Add test case for gdb 10 crash
    
    PR gdb/27743 points out a gdb crash when expanding partial symtabs,
    where one of the compilation units uses DW_TAG_imported_unit.
    
    This crash happens for gdb 10, but not git trunk.  This patch pulls
    over the new test case only.
    
    gdb/testsuite/ChangeLog
    2021-04-26  Tom Tromey  <tromey@adacore.com>
    
            PR gdb/27743:
            * gdb.dwarf2/imported-unit-bp.exp: New file.
            * gdb.dwarf2/imported-unit-bp-main.c: New file.
            * gdb.dwarf2/imported-unit-bp-alt.c: New file.
Comment 26 Tom Tromey 2021-04-26 15:55:45 UTC
I've checked in the fix.

It isn't in the 10.2 release, but I did put the fix on the gdb 10 branch.
There probably won't be a 10.3, but it's there in case someone wants
to do their own build from git, or pick the patch into their local repo.

git trunk doesn't have the bug, but I've put the test case there.

I've reset the target milestone to reflect that this isn't really in 10.2.