Bug 28900 - GDB crash on loading symbols form openbios binary
Summary: GDB crash on loading symbols form openbios binary
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: 11.2
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-16 22:02 UTC by Glenn Washburn
Modified: 2022-04-29 19:54 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2022-02-17 00:00:00


Attachments
OpenBIOS qemu-ppc build with symbols (266.36 KB, application/x-xz)
2022-02-16 22:02 UTC, Glenn Washburn
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Glenn Washburn 2022-02-16 22:02:51 UTC
Created attachment 13985 [details]
OpenBIOS qemu-ppc build with symbols

GDB crashes when loading the symbols from the attached binary, which is the qemu-ppc build of openbios.

I'm using this toolchain from builtroot:
https://toolchains.bootlin.com/downloads/releases/toolchains/powerpc
64le-power8/tarballs/powerpc64le-power8--glibc--bleeding-edge-2021.11-2.tar.bz2

To reproduce the issue, run the following command:
./powerpc64le-power8--glibc--bleeding-edge-2021.11-2/bin/powerpc64le-buildroot-linux-gnu-gdb -ex 'set verbose on' -ex 'set debug symfile on' -ex 'set debug symtab-create 99' -ex "symbol-file $PWD/openbios-qemu.elf.nostrip"

The last few lines of output, which give a clue to the issue are:

Recording minsym:  mst_text                       0xfff33824     1  escc_init
Recording minsym:  mst_text                       0xfff2b950     1  adb_cmd
Recording minsym:  mst_text                       0xfff0938c     1  free
Recording minsym:  mst_bss                        0xfffdd150     6  FSYS_BUF
Installing 2364 minimal symbols of objfile /home/user/openbios-qemu.elf.nostrip.
Done reading minimal symbols.
sf->sym_relocate (openbios-qemu.elf.nostrip, 0x22607a8, 0x0) = 0x0
Aborted
Comment 1 Simon Marchi 2022-02-17 00:43:30 UTC
The crash happens at:

#0  __sanitizer::internal__exit (exitcode=1) at /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_linux.cpp:448
#1  0x00007ffff5e69b79 in __sanitizer::Die () at /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_termination.cpp:59
#2  0x00007ffff5e4c5b7 in __ubsan::__ubsan_handle_type_mismatch_v1_abort (Data=<optimized out>, Pointer=<optimized out>) at /build/gcc/src/gcc/libsanitizer/ubsan/ubsan_handlers.cpp:148
#3  0x000055555da9747b in read_dbx_symtab (reader=..., partial_symtabs=0x6120000c37c0, objfile=0x613000005600) at /home/simark/src/binutils-gdb/gdb/dbxread.c:1712
#4  0x000055555da8b510 in dbx_symfile_read (objfile=0x613000005600, symfile_flags=...) at /home/simark/src/binutils-gdb/gdb/dbxread.c:547
#5  0x000055555daa7b78 in elfstab_build_psymtabs (objfile=0x613000005600, stabsect=0x62100010d100, stabstroffset=1337500, stabstrsize=1921) at /home/simark/src/binutils-gdb/gdb/dbxread.c:3050
#6  0x000055555e028c13 in elf_symfile_read (objfile=0x613000005600, symfile_flags=...) at /home/simark/src/binutils-gdb/gdb/elfread.c:1249
#7  0x000055555f18091f in read_symbols (objfile=0x613000005600, add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:772
#8  0x000055555f182931 in syms_from_objfile_1 (objfile=0x613000005600, addrs=0x7fffffffc9d0, add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:968
#9  0x000055555f182bd4 in syms_from_objfile (objfile=0x613000005600, addrs=0x0, add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffff9a>) at /home/simark/src/binutils-gdb/gdb/symfile.c:985
#10 0x000055555f183c1e in symbol_file_add_with_addrs (abfd=0x6120000c3640, name=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", add_flags=..., addrs=0x0, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1088
#11 0x000055555f184b61 in symbol_file_add_from_bfd (abfd=0x6120000c3640, name=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffff8a>, addrs=0x0, flags=<error reading variable: Cannot access memory at address 0xffffffffffffff9a>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1168
#12 0x000055555f184d0d in symbol_file_add (name=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", add_flags=..., addrs=0x0, flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:1181
#13 0x000055555f1850b5 in symbol_file_add_main_1 (args=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", add_flags=..., flags=..., reloff=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1205
#14 0x000055555f184ebb in symbol_file_add_main (args=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:1196
#15 0x000055555e721950 in symbol_file_add_main_adapter (arg=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", from_tty=1) at /home/simark/src/binutils-gdb/gdb/main.c:550
#16 0x000055555e721728 in catch_command_errors (command=0x55555e7217be <symbol_file_add_main_adapter(char const*, int)>, arg=0x7fffffffe11f "/home/simark/Downloads/openbios-qemu.elf.nostrip", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:523
#17 0x000055555e725a61 in captured_main_1 (context=0x7fffffffdb90) at /home/simark/src/binutils-gdb/gdb/main.c:1231
#18 0x000055555e726caa in captured_main (data=0x7fffffffdb90) at /home/simark/src/binutils-gdb/gdb/main.c:1338
#19 0x000055555e726d8c in gdb_main (args=0x7fffffffdb90) at /home/simark/src/binutils-gdb/gdb/main.c:1363
#20 0x000055555cf37bc9 in main (argc=5, argv=0x7fffffffdd08) at /home/simark/src/binutils-gdb/gdb/gdb.c:32


This is in stabs reading code.  In your binary you have both stabs and DWARF debug info:

  [10] .debug_line       PROGBITS        00000000 0d833c 0175a1 00      0   0  1
  [11] .debug_info       PROGBITS        00000000 0ef8dd 03e872 00      0   0  1
  [12] .debug_abbrev     PROGBITS        00000000 12e14f 00c36c 00      0   0  1
  [13] .debug_aranges    PROGBITS        00000000 13a4c0 000e88 00      0   0  8
  [14] .debug_str        PROGBITS        00000000 13b348 00ae02 01  MS  0   0  1
  [15] .debug_ranges     PROGBITS        00000000 146150 000230 00      0   0  8
  [16] .stab             PROGBITS        00000000 146380 00051c 0c     17   0  4
  [17] .stabstr          STRTAB          00000000 14689c 000781 00      0   0  1

This usually means that some compilation units that went into this binary had DWARF, and some had stabs.  My advice would be to find what is producing stabs in your workflow and change it to DWARF, if possible.

GDB crashing on your binary is not good at all, but given that the stabs format has been deprecated for a long time, the chances of somebody looking into fixing this are slim.  This is actually a good argument for eventually getting rid of the stabs reader in GDB.  Keeping this bit-rotting code is not harmless, it prevents you from debugging your binary.  If we didn't have the stabs reading code, then you could debug your binary just fine, just not those compilation units that use stabs.
Comment 2 Glenn Washburn 2022-02-17 16:17:45 UTC
Thanks for looking into this Simon.

The offending code was copied from the linux kernel and the latest kernel still has this code that generates a stabs section[1]. So I'm guessing GDB would also crash when loading the kernel compiled for 32-bit PPC.

Do you have recommendations as a longer term solution to fix this issue in the kernel source? Just removing the .stabs directive seems like it would work around the GDB crash. However, then those debugging symbols don't get added. Should they get added in DWARF sections? I didn't see equivalent directive to add symbols for DWARF in the assembler.

[1] https://elixir.bootlin.com/linux/v5.17-rc4/source/arch/powerpc/include/asm/ppc_asm.h#L211
Comment 3 Simon Marchi 2022-02-17 16:26:46 UTC
PowerPC64 defines _GLOBAL like this:

#define _GLOBAL(name) \
	.align 2 ; \
	.type name,@function; \
	.globl name; \
name:

I guess this generates an ELF function/text symbol, which is enough to be able to put a breakpoint on it or print its address.  It's only PowerPC32 that does:

#define _GLOBAL(n)	\
	.stabs __stringify(n:F-1),N_FUN,0,0,n;\
	.globl n;	\
n:

Is there a reason that PowerPC32 can't do the same as PowerPC64, let the compiler generate an ELF function/text symbol for the label?
Comment 4 Simon Marchi 2022-02-17 16:28:18 UTC
And to be clear, if someone has the time and motivation to fix GDB, it would be very appreciated, I'm not saying that we should not fix GDB.
Comment 5 Andreas Schwab 2022-02-17 16:47:04 UTC
I guess it's just a leftover from the pre-ELF days.
Comment 6 Glenn Washburn 2022-02-17 18:44:23 UTC
(In reply to Simon Marchi from comment #3)
> PowerPC64 defines _GLOBAL like this:
> 
> #define _GLOBAL(name) \
> 	.align 2 ; \
> 	.type name,@function; \
> 	.globl name; \
> name:
> 
> I guess this generates an ELF function/text symbol, which is enough to be
> able to put a breakpoint on it or print its address.  It's only PowerPC32
> that does:
> 
> #define _GLOBAL(n)	\
> 	.stabs __stringify(n:F-1),N_FUN,0,0,n;\
> 	.globl n;	\
> n:
> 
> Is there a reason that PowerPC32 can't do the same as PowerPC64, let the
> compiler generate an ELF function/text symbol for the label?

I don't know the answer to that, as this isn't really my area (not very familiar with the architecture nor debugging formats). My suspicion is that there is no reason. When I remove the .stabs directive and load the binary into GDB, it sees the generated symbols using "info symbol" and "p &<symbol>". I just don't know if I'm losing debugging capability by doing this (eg. not able to call those functions).

It seems odd to me that whoever added this PPC 64-bit code to the kernel wouldn't have used common code if that was a reasonable thing to do. So maybe it not that simple.

And yes, of course a fix in GDB would be great. I also understand that there's probably low motivation as this is an ancient format, so a fix will not likely be forth coming soon. I also don't expect stabs support to be removed anytime soon, as there's always people on legacy software and they'll probably want to keep what's there.
Comment 7 Tom Tromey 2022-02-18 16:52:09 UTC
(In reply to Glenn Washburn from comment #6)

>  I also don't expect stabs support to be
> removed anytime soon, as there's always people on legacy software and
> they'll probably want to keep what's there.

I think the issue in gdb is that there just aren't really stabs
experts around any more.  I've worked on gdb "forever" (like > 10 years)
and never had to touch stabs.

So, while we'd probably accept a fix for this, if it seemed clean somehow
(I really don't know how I'd make that assessment given the minimal
knowledge I have...), most likely nobody is going to pick this up.

Also, worth noting, GCC is finally going to get rid of stabs output:
https://gcc.gnu.org/gcc-12/changes.html

So perhaps fixing the kernel is the best outcome here.
Comment 8 Simon Marchi 2022-02-18 16:57:26 UTC
> Also, worth noting, GCC is finally going to get rid of stabs output:
> https://gcc.gnu.org/gcc-12/changes.html

Ah good, to know!  I poked the bear by asking this question, glad they followed through.

https://www.mail-archive.com/gcc@gcc.gnu.org/msg94701.html
Comment 9 Andrew Burgess 2022-02-19 19:53:17 UTC
I posted these patches to address the crash in gdb:

  https://sourceware.org/pipermail/gdb-patches/2022-February/186014.html

I've made no attempt to fully read in the debug information, we just skip over the offending items.  But at least we don't crash...
Comment 10 Sourceware Commits 2022-02-21 11:49:58 UTC
The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=336125713fcf9b43960a57724fa39a319036ba38

commit 336125713fcf9b43960a57724fa39a319036ba38
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Sat Feb 19 13:09:34 2022 +0000

    gdb: avoid nullptr access in dbxread.c from read_dbx_symtab
    
    This fixes a GDB crash reported in bug pr/28900, related to reading in
    some stabs debug information.
    
    In this commit my goal is to stop GDB crashing.  I am not trying to
    ensure that GDB makes the best possible use of the available stabs
    debug information.  At this point I consider stabs a legacy debug
    format, with only limited support in GDB.
    
    So, the problem appears to be that, when reading in the stabs data, we
    need to find a N_SO entry, this is the entry that defines the start of
    a compilation unit (or at least the location of a corresponding source
    file).
    
    It is while handling an N_SO that GDB creates a psymtab to hold the
    incoming debug information (symbols, etc).
    
    The problem we hit in the bug is that we encounter some symbol
    information (an N_PC entry) outside of an N_SO entry - that is we find
    some symbol information that is not associated with any source file.
    
    We already have some protection for this case, look (in
    read_dbx_symtab) at the handling of N_PC entries of type 'F' and 'f',
    if we have no psymtab (the pst variable is nullptr) then we issue a
    complaint.  However, for whatever reason, in both 'f' and 'F'
    handling, there is one place where we assume that the pst
    variable (the psymtab) is not nullptr.  This is a mistake.
    
    In this commit, I guard these two locations (in 'f' and 'F' handling)
    so we no longer assume pst is not nullptr.
    
    While I was at it, I audited all the other uses of pst in
    read_dbx_symtab, and in every potentially dangerous case I added a
    nullptr check, and issue a suitable complaint if pst is found to be
    nullptr.
    
    It might well be true that we could/should do something smarter if we
    see a debug symbol outside of an N_SO entry, and if anyone wanted to
    do that work, they're welcome too.  But this commit is just about
    preventing the nullptr access, and the subsequent GDB crash.
    
    I don't have any tests for this change, I have no idea how to generate
    weird stabs data for testing.  The original binary from the bug report
    now loads just fine without GDB crashing.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28900
Comment 11 Andrew Burgess 2022-02-21 12:06:23 UTC
With commit 336125713fc GDB should no longer crash, but this is achieved by just ignoring the symbols stabs data that we don't currently know how to handle.

If you do 'set complaints 1000' before loading the offending binary into GDB, then you get to see GDB complain about all the debug symbols it can't handle.  The output looks like:

  During symbol reading: function `_savegpr_14' appears to be defined outside of all compilation units
  During symbol reading: function `_save32gpr_14' appears to be defined outside of all compilation units
  During symbol reading: function `_savegpr_15' appears to be defined outside of all compilation units
  ... snip ...
  During symbol reading: function `_restgpr_14' appears to be defined outside of all compilation units
  During symbol reading: function `_rest32gpr_14' appears to be defined outside of all compilation units
  During symbol reading: function `_restgpr_15' appears to be defined outside of all compilation units
  ... snip ...

Which look like register save/restore hooks, but I don't claim to really understand.  I don't believe that these symbols have and DWARF information available, so these functions will not show up in something like 'info functions', but there is a symbol for these functions in the symbol table, which GDB does read, so you can do things like 'x/10i _savegpr_14`, however, the symbol table entry has no type or size, so GDB doesn't understand the bounds of the function.

Given the original bug description is for GDB crashing, and this is now fixed.  I'd like to tentatively propose we close this bug.  If Glenn feels like there is still important debug information that isn't being loaded, this might be better represented in a new bug.  Given the lack of GDB developers interested in improving stabs support (right now), I suspect any new bug is not likely to get much attention.  Thoughts?
Comment 12 Tom Tromey 2022-04-29 19:54:53 UTC
> Given the original bug description is for GDB crashing, and this is now
> fixed.  I'd like to tentatively propose we close this bug.  If Glenn feels
> like there is still important debug information that isn't being loaded,
> this might be better represented in a new bug.  Given the lack of GDB
> developers interested in improving stabs support (right now), I suspect any
> new bug is not likely to get much attention.  Thoughts?

+1 to closing it from me.
Given that GCC is removing stabs, and that we don't have any stabs
maintainers or even experts on GDB any more, at some point I'm
going to propose removing the stabs code entirely.