Bug 27927 - gdb crash with OpenOCD
Summary: gdb crash with OpenOCD
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: remote
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-28 12:47 UTC by Jérôme Pouiller
Modified: 2023-02-09 21:45 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2021-06-02 00:00:00


Attachments
Traffic gdb <-> openocd (2.95 KB, application/vnd.tcpdump.pcap)
2021-05-28 12:47 UTC, Jérôme Pouiller
terminal-output.txt (804 bytes, text/plain)
2021-05-28 12:49 UTC, Jérôme Pouiller
Traffic OpenOCD->gdb (783 bytes, text/plain)
2021-06-02 07:26 UTC, Jérôme Pouiller

Description Jérôme Pouiller 2021-05-28 12:47:40 UTC
Created attachment 13472 [details]
Traffic gdb <-> openocd
Comment 1 Jérôme Pouiller 2021-05-28 12:49:17 UTC
Created attachment 13473 [details]
terminal-output.txt
Comment 2 Jérôme Pouiller 2021-05-28 12:52:50 UTC
I use "GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git" with OpenOCD to debug a Cortex-M4 target. At some point, OpenOCD reaches a state in which gdb crashes as soon as I run "target extended-remote localhost:3333".

A copy of such a session is attached, along with a capture of the traffic between gdb and OpenOCD. I also have a core dump, but it is too big for Bugzilla.
Comment 3 Simon Marchi 2021-05-28 13:40:51 UTC
Can you try with GDB master?  A few commits have touched the handling of remote thread state in the last year, so the current situation may be different (it might be fixed, or it might still fail, but with a different error).
Comment 4 Jérôme Pouiller 2021-06-01 08:14:27 UTC
I still have the problem with commit a2cf3633 (2021-06-01). I compiled gdb with:
   ./configure --prefix=/tmp/gdb --disable-shared --enable-static --target=arm-none-eabi

Apart from the line number, the error message is the same:

$ /tmp/gdb/bin/arm-none-eabi-gdb /tmp/foo3/build/debug/foo.out
GNU gdb (GDB) 11.0.50.20210601-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=arm-none-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
/home/jerome/.gdbinit:37: Error in sourced command file:
~/conf/_gdbinit-gef.py:57: Error in sourced command file:
Undefined command: "from".  Try "help".
Reading symbols from /tmp/foo3/build/debug/foo.out...
(gdb) target extended-remote localhost:3333
Remote debugging using localhost:3333
thread.c:1345: internal-error: void switch_to_thread(thread_info*): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

[1]    621480 abort (core dumped)  /tmp/gdb/bin/arm-none-eabi-gdb /tmp/foo3/build/debug/foo.out
$
Comment 5 Jérôme Pouiller 2021-06-01 08:21:05 UTC
Note that the problem disappears if I restart OpenOCD.
Comment 6 Jérôme Pouiller 2021-06-01 08:23:51 UTC
I have found an easier way to reproduce it: just run `target extended-remote localhost:3333` twice. It crashes every time, whatever the state of OpenOCD.
Comment 7 Simon Marchi 2021-06-01 13:44:32 UTC
Ok, thanks.  Is there a way for a random user like me to run OpenOCD to test the GDB stub, but without special hardware?  Like OpenOCD in front of some simulator?
Comment 9 Jérôme Pouiller 2021-06-02 07:06:13 UTC
Simon, from the pcap file, it should be possible to write a small Python script that reproduces the bug.
Comment 10 Jérôme Pouiller 2021-06-02 07:26:46 UTC
Created attachment 13480 [details]
Traffic OpenOCD->gdb
Comment 11 Jérôme Pouiller 2021-06-02 07:29:37 UTC
Steps to reproduce:

   - launch "nc -l 3333"
   - launch "gdb" without arguments
   - run "target extended-remote localhost:3333"
   - copy-paste the content of the attached file ("Traffic OpenOCD->gdb") into the terminal running "nc" (note: "nc -l 3333 < openocd-traffic.txt" does not work)
   - gdb crashes
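
The steps above depend on which netcat is installed. As a possible alternative, here is a minimal Python sketch that plays the same role as "nc": it accepts one connection and sends the recorded stub-side bytes. It does not parse or answer gdb's packets; it assumes, as in the recipe above, that blindly sending the recorded traffic is enough to trigger the failure. The function names are hypothetical, and so is any file name you save the attachment under.

```python
import socket


def open_listener(host="127.0.0.1", port=3333):
    """Return a listening TCP socket; port 0 asks the OS for a free port."""
    return socket.create_server((host, port))


def replay_once(listener, payload):
    """Accept one client, send the recorded bytes, then wait for it to leave."""
    conn, _peer = listener.accept()
    with conn:
        conn.sendall(payload)
        try:
            # Discard whatever the client sends until it disconnects.
            while conn.recv(4096):
                pass
        except OSError:
            # The client may disappear abruptly (for example if it aborts).
            pass
```

To use it, read the saved attachment in binary mode, pass its content as payload to replay_once(open_listener(), payload), and then run "target extended-remote localhost:3333" in gdb.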
Comment 12 Simon Marchi 2021-06-02 13:28:54 UTC
Thanks for the instructions, I can reproduce.

I guess we don't have the same netcat (mine is GNU netcat 0.7.1), as I needed to use "nc -l -p 3333"; without the -p, I think "3333" is interpreted as a hostname.

However, I was able to reproduce by redirecting the file (<) into netcat.

This is the output of "set debug remote 1":

(gdb) tar ext :3333
Remote debugging using :3333
[remote] start_remote: enter
  [remote] Sending packet: $qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;fork-events+;vfork-events+;exec-events+;vContSupported+;QThreadEvents+;no-resumed+;memory-tagging+;xmlRegisters=i386#77
  [remote] Received Ack
  [remote] Packet received: PacketSize=4000;qXfer:memory-map:read+;qXfer:features:read+;qXfer:threads:read+;QStartNoAckMode+;vContSupported+
  [remote] packet_ok: Packet qSupported (supported-packets) is supported
  [remote] Sending packet: $vMustReplyEmpty#3a
  [remote] Received Ack
  [remote] Packet received: 
  [remote] Sending packet: $QStartNoAckMode#b0
  [remote] Received Ack
  [remote] Packet received: OK
  [remote] Sending packet: $!#21
  [remote] Packet received: OK
  [remote] Sending packet: $Hg0#df
  [remote] Packet received: OK
  [remote] Sending packet: $qXfer:features:read:target.xml:0,1000#0c
  [remote] Packet received: l<?xml version="1.0"?>\n<!DOCTYPE target SYSTEM "gdb-target.dtd">\n<target version="1.0">\n<architecture>arm</architecture>\n<feature name="org.gnu.gdb.arm.m-profile">\n<reg name="r0" bitsize="32" regnum="0" save-restore="yes" type="int" group="general"/>\n<reg name="r1" bitsize="32" regnum="1" save-restore="yes" type="int" group="general"/>\n<reg name="r2" bitsize="32" regnum="2" save-restore="yes" type="int" group="general"/>\n<reg name="r3" bitsize="32" regnum="3" save-restore="yes" type="int" group="general"/>\n [3430 bytes omitted]
  [remote] Sending packet: $qTStatus#49
  [remote] Packet received: 
  [remote] packet_ok: Packet qTStatus (trace-status) is NOT supported
  [remote] Sending packet: $?#3f
  [remote] Packet received: S02
  [remote] Sending packet: $qXfer:threads:read::0,1000#92
  [remote] Packet received: l<?xml version="1.0"?>\n<threads>\n<thread id="1">Name: Kernel's Stat Task, State: Delay, Priority: 6</thread>\n<thread id="2">Name: Wi-SUN MAC, State: Pend, Priority: 31</thread>\n<thread id="3">Name: Wi-SUN Timer Task, State: Pend, Priority: 10</thread>\n<thread id="4">Name: Wi-SUN Event Loop Task, State: Pend, Priority: 12</thread>\n<thread id="5">Name: Wi-SUN RF Task, State: Pend, Priority: 9</thread>\n</threads>\n
  [remote] Sending packet: $qAttached#8f
  [remote] Packet received: 1
  [remote] packet_ok: Packet qAttached (query-attached) is supported
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
  [remote] Sending packet: $Hc-1#09
  [remote] Packet received: OK
  [remote] Sending packet: $qC#b4
  [remote] Packet received: QC0000000000000000
/home/simark/src/binutils-gdb/gdb/thread.c:1345: internal-error: void switch_to_thread(thread_info*): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) 

And the backtrace:

#9  0x000055c505df2fe6 in internal_error (file=0x55c5065a3840 "/home/simark/src/binutils-gdb/gdb/thread.c", line=1345, fmt=0x55c5065a3220 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:55
#10 0x000055c504a24f70 in switch_to_thread (thr=0x0) at /home/simark/src/binutils-gdb/gdb/thread.c:1345
#11 0x000055c50460c4d2 in remote_target::start_remote (this=0x617000033700, from_tty=1, extended_p=1) at /home/simark/src/binutils-gdb/gdb/remote.c:4860
#12 0x000055c50460ffe8 in remote_target::open_1 (name=0x60200007c338 ":3333", from_tty=1, extended_p=1) at /home/simark/src/binutils-gdb/gdb/remote.c:5771
#13 0x000055c50460ce9d in extended_remote_target::open (name=0x60200007c338 ":3333", from_tty=1) at /home/simark/src/binutils-gdb/gdb/remote.c:5004
#14 0x000055c5049c59fa in open_target (args=0x60200007c338 ":3333", from_tty=1, command=0x611000065300) at /home/simark/src/binutils-gdb/gdb/target.c:847
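
For anyone reading the packet log above, each request has the form $payload#xx, where xx is the sum of the payload bytes modulo 256, written as two hex digits. The sketch below shows that framing in Python. It is deliberately incomplete: it ignores the protocol's escaping and run-length encoding as well as the +/- acknowledgements, and the function names are hypothetical.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of the payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $payload#xx (no escaping, no run-length encoding)."""
    return b"$" + payload + b"#" + format(rsp_checksum(payload), "02x").encode("ascii")


def rsp_unframe(packet: bytes) -> bytes:
    """Return the payload of a $payload#xx packet after verifying xx."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        raise ValueError("not a framed packet")
    payload = packet[1:-3]
    if rsp_checksum(payload) != int(packet[-2:], 16):
        raise ValueError("checksum mismatch")
    return payload


# Requests taken from the log above.
print(rsp_frame(b"qC"))    # b'$qC#b4'
print(rsp_frame(b"Hc-1"))  # b'$Hc-1#09'
```

The same helper can be used to frame hand-written replies (for example the QC0000000000000000 reply seen in the log) when building a fake stub for testing.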
Comment 13 Robert Jenssen 2022-06-29 13:43:47 UTC
I have been experimenting with a "Poor Man's Profiler" with
openocd-0.11.0, a local build of arm-none-eabi-gdb-12.1 and an
STM32F3Discovery evaluation board. See:
https://poormansprofiler.org
https://interrupt.memfault.com/blog/profiling-firmware-on-cortex-m


My system is:
$ uname -a
Linux morgawr 5.18.6-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 22 13:46:18 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux


The gdb version is:
$ arm-none-eabi-gdb --version
GNU gdb (GDB) 12.1


arm-none-eabi-gdb was built as follows:
$ arm-none-eabi-gdb --configuration 
This GDB was configured as follows:
   configure --host=x86_64-pc-linux-gnu --target=arm-none-eabi
	     --with-auto-load-dir=$debugdir:$datadir/auto-load
	     --with-auto-load-safe-path=$debugdir:$datadir/auto-load
	     --with-expat
	     --with-gdb-datadir=/usr/local/arm-toolchain/share/gdb
(relocatable) --with-jit-reader-dir=/usr/local/arm-toolchain/lib/gdb
(relocatable) --without-libunwind-ia64
	     --with-lzma
	     --without-babeltrace
	     --with-intel-pt
	     --with-mpfr
	     --without-xxhash
	     --with-python=/usr
	     --with-python-libdir=/usr/lib
	     --with-debuginfod
	     --without-guile
	     --disable-source-highlight
	     --with-separate-debug-dir=/usr/local/arm-toolchain/lib/debug
(relocatable)

("Relocatable" means the directory can be moved with the GDB
installation tree, and GDB will still find it.)


The OpenOCD version is:
$ openocd --version
Open On-Chip Debugger 0.11.0


OpenOCD was run as follows:
openocd -c "source [find board/stm32f3discovery.cfg]; \
stm32f3x.cpu configure -rtos auto"



The following shell script runs arm-none-eabi-gdb repeatedly:
#!/bin/bash

# See https://poormansprofiler.org/
# Run in another terminal:
# openocd -c "source [find board/stm32f3discovery.cfg];
#             stm32f3x.cpu configure -rtos auto"

nsamples=100
sleeptime=1
elf=bin/imu
for x in $(seq 1 $nsamples); do
    arm-none-eabi-gdb -ex "set pagination off" \
                      -ex "target extended-remote :3333" \
                      -ex "monitor halt" \
                      -ex "thread apply all bt" \
                      -ex "monitor resume" \
                      -batch $elf
    sleep $sleeptime
done | \
awk '
BEGIN { s = ""; }
  /^Thread/ { print s; s = ""; }
  /^#/ {
      if (s != "" ) { if ($3 == "in") {  s = s "," $4 } else {  s = s "," $2 }} else { if ($3 == "in") {  s = $4 } else {  s = $2 } }
  }
END { print s }' | \
sort | uniq -c | sort -r -n -k 1,1
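
For readers less familiar with awk, the aggregation step of the script above can be sketched in Python as follows. This is only an approximation of the awk program: it picks the 4th field of a frame line when the 3rd field is "in" and the 2nd field otherwise, joins the frames of each thread with commas, and counts identical stacks. Unlike the awk version it skips empty stacks. The function names and the sample backtrace are hypothetical.

```python
from collections import Counter


def frame_function(line):
    """Extract the function name from one '#N ...' backtrace line."""
    fields = line.split()
    # "#1  0x08000123 in worker () ...": the name is the 4th field.
    if len(fields) >= 4 and fields[2] == "in":
        return fields[3]
    # "#0  vTaskDelay (...) at ...": the name is the 2nd field.
    return fields[1] if len(fields) > 1 else ""


def collapse_stacks(gdb_output):
    """Count comma-joined call stacks, one per 'Thread ...' section."""
    stacks = []
    current = []
    for line in gdb_output.splitlines():
        if line.startswith("Thread"):
            if current:
                stacks.append(",".join(current))
            current = []
        elif line.startswith("#"):
            current.append(frame_function(line))
    if current:
        stacks.append(",".join(current))
    return Counter(stacks)


# Hypothetical output of "thread apply all bt", for illustration only.
SAMPLE = """Thread 2 (Thread 2):
#0  vTaskDelay (ticks=1) at tasks.c:10
#1  0x08000123 in worker () at main.c:20

Thread 1 (Thread 1):
#0  0x08000200 in idle () at main.c:5
"""

for stack, count in collapse_stacks(SAMPLE * 2).most_common():
    print(count, stack)
```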



Here is the output from openocd when arm-none-eabi-gdb fails:
.
.
.
Info : accepting 'gdb' connection on tcp/3333
target halted due to debug-request, current mode: Thread 
xPSR: 0x61000000 pc: 0x08013e1c psp: 0x20001a60
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
target halted due to debug-request, current mode: Thread 
xPSR: 0x61000000 pc: 0x08013e22 psp: 0x20001a60
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08014786 psp: 0x20004728
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
Info : dropped 'gdb' connection
Info : accepting 'gdb' connection on tcp/3333
Info : dropped 'gdb' connection
.
.
.



Before the failure I get messages from arm-none-eabi-gdb like:
warning: multi-threaded target stopped without sending a thread-id,
using first non-exited thread


Here is an example of the repeated output from
arm-none-eabi-gdb after the failure:

doc/poor_mans_profiler.sh: line 11: 159394 Aborted (core dumped) arm-none-eabi-gdb -ex "set pagination off" -ex "target extended-remote :3333" -ex "monitor halt" -ex "thread apply all bt" -ex "monitor resume" -batch $elf
../../gdb-12.1/gdb/thread.c:1328: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x4cd402 gdb_internal_backtrace_1
	../../gdb-12.1/gdb/bt-utils.c:122
0x4cd402 _Z22gdb_internal_backtracev
	../../gdb-12.1/gdb/bt-utils.c:168
0x7b6374 internal_vproblem
	../../gdb-12.1/gdb/utils.c:394
0x7b65c8 _Z15internal_verrorPKciS0_P13__va_list_tag
	../../gdb-12.1/gdb/utils.c:471
0x8e9171 _Z14internal_errorPKciS0_z
	../../gdb-12.1/gdbsupport/errors.cc:55
0x776bff _Z16switch_to_threadP11thread_info
	../../gdb-12.1/gdb/thread.c:1328
0x776bff _Z16switch_to_threadP11thread_info
	../../gdb-12.1/gdb/thread.c:1326
0x6f48eb _ZN13remote_target14start_remote_1Eii
	../../gdb-12.1/gdb/remote.c:4938
0x6f4e17 _ZN13remote_target12start_remoteEii
	../../gdb-12.1/gdb/remote.c:5050
0x6f4e17 _ZN13remote_target6open_1EPKcii
	../../gdb-12.1/gdb/remote.c:5856
0x772780 open_target
	../../gdb-12.1/gdb/target.c:853
0x4fe1f4 _Z8cmd_funcP16cmd_list_elementPKci
	../../gdb-12.1/gdb/cli/cli-decode.c:2514
0x77e0da _Z15execute_commandPKci
	../../gdb-12.1/gdb/top.c:702
0x638f21 catch_command_errors
	../../gdb-12.1/gdb/main.c:523
0x638fef execute_cmdargs
	../../gdb-12.1/gdb/main.c:618
0x63ad6c captured_main_1
	../../gdb-12.1/gdb/main.c:1320
0x63b7da captured_main
	../../gdb-12.1/gdb/main.c:1341
0x63b7da _Z8gdb_mainP18captured_main_args
	../../gdb-12.1/gdb/main.c:1366
0x42f1b4 main
	../../gdb-12.1/gdb/gdb.c:32
---------------------

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.
Comment 14 Tom Tromey 2023-01-17 16:28:20 UTC
(In reply to Simon Marchi from comment #12)

Looking at the trace:

>   [remote] Sending packet: $qXfer:threads:read::0,1000#92
>   [remote] Packet received: l<?xml version="1.0"?>\n<threads>\n<thread
> id="1">Name: Kernel's Stat Task, State: Delay, Priority: 6</thread>\n<thread
> id="2">Name: Wi-SUN MAC, State: Pend, Priority: 31</thread>\n<thread
> id="3">Name: Wi-SUN Timer Task, State: Pend, Priority: 10</thread>\n<thread
> id="4">Name: Wi-SUN Event Loop Task, State: Pend, Priority:
> 12</thread>\n<thread id="5">Name: Wi-SUN RF Task, State: Pend, Priority:
> 9</thread>\n</threads>\n

This shows the threads; they have ids 1 through 5.

>   [remote] Sending packet: $qC#b4
>   [remote] Packet received: QC0000000000000000

But the remote says that the current thread is 0, which seems
sort of nonsensical.  The docs say:

A THREAD-ID can
also be a literal '-1' to indicate all threads, or '0' to pick any
thread.

However, it seems to me that these special values do not make
sense for a QC response.

IOW, I think this is a bug in the remote stub.
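
The inconsistency described above can be checked mechanically. The Python sketch below parses the thread list and the QC reply quoted from the trace and tests whether the reported current thread is one of the listed threads. It assumes thread ids are plain hexadecimal numbers (no multiprocess "p<pid>.<tid>" syntax), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Replies copied from the trace in comment 12.
THREADS_XML = """<?xml version="1.0"?>
<threads>
<thread id="1">Name: Kernel's Stat Task, State: Delay, Priority: 6</thread>
<thread id="2">Name: Wi-SUN MAC, State: Pend, Priority: 31</thread>
<thread id="3">Name: Wi-SUN Timer Task, State: Pend, Priority: 10</thread>
<thread id="4">Name: Wi-SUN Event Loop Task, State: Pend, Priority: 12</thread>
<thread id="5">Name: Wi-SUN RF Task, State: Pend, Priority: 9</thread>
</threads>
"""
QC_REPLY = "QC0000000000000000"


def listed_thread_ids(threads_xml):
    """Return the thread ids announced in a qXfer:threads:read reply."""
    root = ET.fromstring(threads_xml)
    return [int(thread.get("id"), 16) for thread in root.iter("thread")]


def current_thread_id(qc_reply):
    """Return the thread id carried by a 'QC<thread-id>' reply."""
    if not qc_reply.startswith("QC"):
        raise ValueError("not a QC reply")
    return int(qc_reply[2:], 16)


known = listed_thread_ids(THREADS_XML)
current = current_thread_id(QC_REPLY)
print("listed:", known, "current:", current, "consistent:", current in known)
```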
Comment 15 Jérôme Pouiller 2023-02-09 14:40:00 UTC
Reported to OpenOCD:
    https://sourceforge.net/p/openocd/tickets/381/
Comment 16 Jérôme Pouiller 2023-02-09 14:43:33 UTC
BTW, I believe gdb shouldn't crash even when the remote is buggy.
Comment 17 Tom Tromey 2023-02-09 21:45:16 UTC
(In reply to Jérôme Pouiller from comment #16)
> BTW, I believe gdb shouldn't crash even when the remote is buggy.

Yeah, totally agreed.
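
As a rough illustration of what tolerating a buggy remote could look like, the Python sketch below falls back to the first listed thread when the reported current thread is not in the thread list, instead of failing an assertion. This is not GDB's code and not a proposed patch; it only illustrates the idea, and the function name is hypothetical.

```python
def choose_current_thread(reported_id, known_ids):
    """Return a usable thread id, or None if the target listed no threads."""
    if reported_id in known_ids:
        return reported_id
    if known_ids:
        # The stub named a thread that is not in its own list (for example 0):
        # fall back to the first listed thread rather than giving up.
        return known_ids[0]
    return None


# With the values from the trace: threads 1 to 5 listed, thread 0 reported.
print(choose_current_thread(0, [1, 2, 3, 4, 5]))  # 1
```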