At <https://sourceware.org/ml/gdb-patches/2017-12/msg00285.html>, I wrote: On Tue, 12 Dec 2017, Maciej W. Rozycki wrote: > Another regression. This makes `mips-mti-linux-gnu' GDB stop working > with older stubs. I have discovered it in an attempt to regression-test > said GDB with a remote n64 target and `x86_64-pc-linux-gnu' host, and an > outstanding change which affects non-XML stubs. Any attempt to continue > execution after the initial connection fails with: > > [...] > Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 2670 > Listening on port 2346 > target remote [...]:2346 > Remote debugging using [...]:2346 > Reading symbols from .../lib64/ld.so.1...done. > [Switching to Thread <main>] > (gdb) continue > Cannot execute this command without a live selected thread. > (gdb) > > This is as from commit 5cd63fda035d. Up to 5cd63fda035d^ instead I get: > > [...] > Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 30044 > Listening on port 2346 > target remote [...]:2346 > Remote debugging using [...]:2346 > Reading symbols from .../lib64/ld.so.1...done. > 0x000000fff79534f0 in __start () from .../lib64/ld.so.1 > (gdb) continue > Continuing. > warning: Could not load shared library symbols for linux-vdso.so.1. > Do you need "set solib-search-path" or "set sysroot"? > > Breakpoint 1, main () at .../gdb/testsuite/gdb.base/advance.c:41 > 41 c = 5; > (gdb) > > At the protocol level the difference starts here (bad): > > Sending packet: $qfThreadInfo#bb...Ack > Packet received: m1814 > Sending packet: $qsThreadInfo#c8...Ack > Packet received: l > [Switching to Thread <main>] > Sending packet: $qSymbol::#5b...Ack > Packet received: qSymbol:6e70746c5f76657273696f6e > Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d...Ack > Packet received: OK > Sending packet: $Hg1814#7d...Ack > Packet received: OK > Sending packet: $Hg0#df...Ack > Packet received: E01 > (gdb) continue > Cannot execute this command without a live selected thread. > (gdb) > > vs (good): > > Sending packet: $qfThreadInfo#bb...Ack > Packet received: m154d > Sending packet: $qsThreadInfo#c8...Ack > Packet received: l > Sending packet: $mfff726c4f0,4#cb...Ack > Packet received: 03e0c825 > Sending packet: $mfff726c4ec,4#fd...Ack > Packet received: 00000000 > Sending packet: $mfff726c4f0,4#cb...Ack > Packet received: 03e0c825 > 0x000000fff726c4f0 in __start () from .../lib64/ld.so.1 > Sending packet: $qSymbol::#5b...Ack > Packet received: qSymbol:6e70746c5f76657273696f6e > Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d...Ack > Packet received: OK > (gdb) continue > Continuing. > Sending packet: $mfff726c4f0,4#cb...Ack > Packet received: 03e0c825 > Sending packet: $Z0,120000d6c,4#36...Ack > Packet received: > Packet Z0 (software-breakpoint) is NOT supported > [...] > > The version of `gdbserver' causing this regression is as at commit > f8b73d13b7ca^, the last MIPS backend version with no XML support. It's > likely that later versions hit this regression too, as the mode of failure > is clearly not XML-related.
As investigated here: <https://sourceware.org/ml/gdb-patches/2017-12/msg00299.html> the problem comes from the use of a `Hg0' packet which isn't supported by older stubs that support the general `Hg' form. Given that `Hg1814' has been previously issued there's no point in following up with `Hg0'.
The master branch has been updated by Pedro Alves <palves@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3cada74087687907311b52781354ff551e10a0ed commit 3cada74087687907311b52781354ff551e10a0ed Author: Pedro Alves <palves@redhat.com> Date: Thu Jan 11 00:23:04 2018 +0000 Fix backwards compatibility with old GDBservers (PR remote/22597) At <https://sourceware.org/ml/gdb-patches/2017-12/msg00285.html>, Maciej reported that commit: commit 5cd63fda035d4ba949e6478406162c4673b3c9ef Date: Wed Oct 4 18:21:10 2017 +0100 Subject: Fix "Remote 'g' packet reply is too long" problems with multiple inferiors made GDB stop working with older stubs. Any attempt to continue execution after the initial connection fails with: [...] Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 2670 Listening on port 2346 target remote [...]:2346 Remote debugging using [...]:2346 Reading symbols from .../lib64/ld.so.1...done. [Switching to Thread <main>] (gdb) continue Cannot execute this command without a live selected thread. (gdb) The problem is: (gdb) c Cannot execute this command without a live selected thread. (gdb) info threads Id Target Id Frame 1 Thread 14917 0x00007f341cd98ed0 in _start () from /lib64/ld-linux-x86-64.so.2 The current thread <Thread ID 2> has terminated. See `help thread'. ^^^^^^^^^^^ (gdb) Note, thread _2_. There's really only one thread in the inferior (it's still at the entry point), but still GDB added a bogus second thread. The reason GDB started adding a second thread after 5cd63fda035d is this hunk: + if (event->ptid == null_ptid) + { + const char *thr = strstr (p1 + 1, ";thread:"); + if (thr != NULL) + event->ptid = read_ptid (thr + strlen (";thread:"), + NULL); + else + event->ptid = magic_null_ptid; + } Note the else branch that falls back to magic_null_ptid. We reach that when we process the initial stop reply sent back in response to the the "?" (status) packet early in the connection setup: Sending packet: $?#3f...Ack Packet received: T0506:0000000000000000;07:40a510f4fd7f0000;10:d0fe1201577f0000; And note that that response does not include a ";thread:XXX" part. This stop reply is processed after listing threads with qfThreadInfo / qsThreadInfo : Sending packet: $qfThreadInfo#bb...Ack Packet received: m3915 Sending packet: $qsThreadInfo#c8...Ack Packet received: l meaning, when we process that stop reply, we treat the event as coming from a thread with ptid == magic_null_ptid, which is not yet in the thread list, so we add it then: (top-gdb) p ptid $1 = {m_pid = 42000, m_lwp = -1, m_tid = 1} (top-gdb) bt #0 0x0000000000840a8c in add_thread_silent(ptid_t) (ptid=...) at src/gdb/thread.c:269 #1 0x00000000007ad61d in remote_add_thread(ptid_t, int, int) (ptid=..., running=0, executing=0) at src/gdb/remote.c:1838 #2 0x00000000007ad8de in remote_notice_new_inferior(ptid_t, int) (currthread=..., executing=0) at src/gdb/remote.c:1921 #3 0x00000000007b758b in process_stop_reply(stop_reply*, target_waitstatus*) (stop_reply=0x1158860, status=0x7fffffffcc00) at src/gdb/remote.c:7217 #4 0x00000000007b7a38 in remote_wait_as(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7380 #5 0x00000000007b7cd1 in remote_wait(target_ops*, ptid_t, target_waitstatus*, int) (ops=0x102fac0 <remote_ops>, ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7446 #6 0x000000000081587b in delegate_wait(target_ops*, ptid_t, target_waitstatus*, int) (self=0x102fac0 <remote_ops>, arg1=..., arg2=0x7fffffffcc00, arg3=0) at src/gdb/target-delegates.c:138 #7 0x0000000000827d77 in target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/target.c:2179 #8 0x0000000000715fda in do_target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/infrun.c:3589 #9 0x0000000000716351 in wait_for_inferior() () at src/gdb/infrun.c:3707 #10 0x0000000000715435 in start_remote(int) (from_tty=1) at src/gdb/infrun.c:3212 things go downhill from this. We don't see the problem with current master gdbserver, because that version always sends the ";thread:" part in the initial stop reply: Sending packet: $?#3f...Packet received: T0506:0000000000000000;07:a0d4ffffff7f0000;10:d05eddf7ff7f0000;thread:p3cea.3cea;core:3; Years ago I had added a "--disable-packet=" command line option to gdbserver which comes in handy for testing this, since the existing "--disable-packet=Tthread" precisely makes gdbserver not send that ";thread:" part in stop replies. The testcase added by this commit emulates old gdbserver making use of that. I've compared a testrun at 5cd63fda035d^ (before regression) with 'current master+patch', against old gdbserver at f8b73d13b7ca^. I hacked out --once, and "monitor exit" to be able to test. The results are a bit too unstable to tell accurately, but it looked like there were no regressions. Maciej confirmed this worked for him as well. No regressions on master (against master gdbserver). gdb/ChangeLog: 2018-01-11 Pedro Alves <palves@redhat.com> PR remote/22597 * remote.c (remote_parse_stop_reply): Default to the last-set general thread instead of to 'magic_null_ptid'. gdb/testsuite/ChangeLog: 2018-01-11 Pedro Alves <palves@redhat.com> PR remote/22597 * gdb.server/stop-reply-no-thread.c: New file. * gdb.server/stop-reply-no-thread.exp: New file.
The gdb-8.1-branch branch has been updated by Pedro Alves <palves@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=936a312c04f9f1dd856571bf7573b5a8f51ed895 commit 936a312c04f9f1dd856571bf7573b5a8f51ed895 Author: Pedro Alves <palves@redhat.com> Date: Thu Jan 11 00:24:59 2018 +0000 Fix backwards compatibility with old GDBservers (PR remote/22597) At <https://sourceware.org/ml/gdb-patches/2017-12/msg00285.html>, Maciej reported that commit: commit 5cd63fda035d4ba949e6478406162c4673b3c9ef Date: Wed Oct 4 18:21:10 2017 +0100 Subject: Fix "Remote 'g' packet reply is too long" problems with multiple inferiors made GDB stop working with older stubs. Any attempt to continue execution after the initial connection fails with: [...] Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 2670 Listening on port 2346 target remote [...]:2346 Remote debugging using [...]:2346 Reading symbols from .../lib64/ld.so.1...done. [Switching to Thread <main>] (gdb) continue Cannot execute this command without a live selected thread. (gdb) The problem is: (gdb) c Cannot execute this command without a live selected thread. (gdb) info threads Id Target Id Frame 1 Thread 14917 0x00007f341cd98ed0 in _start () from /lib64/ld-linux-x86-64.so.2 The current thread <Thread ID 2> has terminated. See `help thread'. ^^^^^^^^^^^ (gdb) Note, thread _2_. There's really only one thread in the inferior (it's still at the entry point), but still GDB added a bogus second thread. The reason GDB started adding a second thread after 5cd63fda035d is this hunk: + if (event->ptid == null_ptid) + { + const char *thr = strstr (p1 + 1, ";thread:"); + if (thr != NULL) + event->ptid = read_ptid (thr + strlen (";thread:"), + NULL); + else + event->ptid = magic_null_ptid; + } Note the else branch that falls back to magic_null_ptid. We reach that when we process the initial stop reply sent back in response to the the "?" (status) packet early in the connection setup: Sending packet: $?#3f...Ack Packet received: T0506:0000000000000000;07:40a510f4fd7f0000;10:d0fe1201577f0000; And note that that response does not include a ";thread:XXX" part. This stop reply is processed after listing threads with qfThreadInfo / qsThreadInfo : Sending packet: $qfThreadInfo#bb...Ack Packet received: m3915 Sending packet: $qsThreadInfo#c8...Ack Packet received: l meaning, when we process that stop reply, we treat the event as coming from a thread with ptid == magic_null_ptid, which is not yet in the thread list, so we add it then: (top-gdb) p ptid $1 = {m_pid = 42000, m_lwp = -1, m_tid = 1} (top-gdb) bt #0 0x0000000000840a8c in add_thread_silent(ptid_t) (ptid=...) at src/gdb/thread.c:269 #1 0x00000000007ad61d in remote_add_thread(ptid_t, int, int) (ptid=..., running=0, executing=0) at src/gdb/remote.c:1838 #2 0x00000000007ad8de in remote_notice_new_inferior(ptid_t, int) (currthread=..., executing=0) at src/gdb/remote.c:1921 #3 0x00000000007b758b in process_stop_reply(stop_reply*, target_waitstatus*) (stop_reply=0x1158860, status=0x7fffffffcc00) at src/gdb/remote.c:7217 #4 0x00000000007b7a38 in remote_wait_as(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7380 #5 0x00000000007b7cd1 in remote_wait(target_ops*, ptid_t, target_waitstatus*, int) (ops=0x102fac0 <remote_ops>, ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7446 #6 0x000000000081587b in delegate_wait(target_ops*, ptid_t, target_waitstatus*, int) (self=0x102fac0 <remote_ops>, arg1=..., arg2=0x7fffffffcc00, arg3=0) at src/gdb/target-delegates.c:138 #7 0x0000000000827d77 in target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/target.c:2179 #8 0x0000000000715fda in do_target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/infrun.c:3589 #9 0x0000000000716351 in wait_for_inferior() () at src/gdb/infrun.c:3707 #10 0x0000000000715435 in start_remote(int) (from_tty=1) at src/gdb/infrun.c:3212 things go downhill from this. We don't see the problem with current master gdbserver, because that version always sends the ";thread:" part in the initial stop reply: Sending packet: $?#3f...Packet received: T0506:0000000000000000;07:a0d4ffffff7f0000;10:d05eddf7ff7f0000;thread:p3cea.3cea;core:3; Years ago I had added a "--disable-packet=" command line option to gdbserver which comes in handy for testing this, since the existing "--disable-packet=Tthread" precisely makes gdbserver not send that ";thread:" part in stop replies. The testcase added by this commit emulates old gdbserver making use of that. I've compared a testrun at 5cd63fda035d^ (before regression) with 'current master+patch', against old gdbserver at f8b73d13b7ca^. I hacked out --once, and "monitor exit" to be able to test. The results are a bit too unstable to tell accurately, but it looked like there were no regressions. Maciej confirmed this worked for him as well. No regressions on master (against master gdbserver). gdb/ChangeLog: 2018-01-11 Pedro Alves <palves@redhat.com> PR remote/22597 * remote.c (remote_parse_stop_reply): Default to the last-set general thread instead of to 'magic_null_ptid'. gdb/testsuite/ChangeLog: 2018-01-11 Pedro Alves <palves@redhat.com> PR remote/22597 * gdb.server/stop-reply-no-thread.c: New file. * gdb.server/stop-reply-no-thread.exp: New file.
Fix merged.
The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=24ed6739b699f329c2c45aedee5f8c7d2f54e493 commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493 Author: Andrew Burgess <andrew.burgess@embecosm.com> Date: Thu Jan 30 14:35:40 2020 +0000 gdb/remote: Restore support for 'S' stop reply packet With this commit: commit 5b6d1e4fa4fc6827c7b3f0e99ff120dfa14d65d2 Date: Fri Jan 10 20:06:08 2020 +0000 Multi-target support There was a regression in GDB's support for older aspects of the remote protocol. Specifically, when a target sends the 'S' stop reply packet (which doesn't include a thread-id) then GDB has to figure out which thread actually stopped. Before the above commit GDB figured this out by using inferior_ptid in process_stop_reply, which contained the ptid of the current process/thread. This would be fine for single threaded targets (which is the only place using an S packet makes sense), but in the general case, relying on inferior_ptid for processing a stop is wrong - there's no reason to believe that what was GDB's current thread will be the same thread that just stopped in the target. With the above commit the inferior_ptid now has the value null_ptid inside process_stop_reply, this can be seen in do_target_wait, where we call switch_to_inferior_no_thread before calling do_target_wait_1. The problem this causes can be seen in the new test that runs gdbserver using the flag --disable-packet=T, and causes GDB to throw this assertion: inferior.c:279: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed. A similar problem was fixed in this commit: commit 3cada74087687907311b52781354ff551e10a0ed Date: Thu Jan 11 00:23:04 2018 +0000 Fix backwards compatibility with old GDBservers (PR remote/22597) However, this commit deals with the case where the T packet doesn't include a thread-id, not the S packet case. This commit solves the problem providing a thread-id at the GDB side if the remote target doesn't provide one. The thread-id provided comes from remote_state::general_thread, however, though this does work, I don't think it is the ideal solution. The remote_state tracks two threads, the continue_thread and the general_thread, these are updated when GDB asks the remote target to switch threads. The general_thread is set before performing things like register or memory accesses, and the continue_thread is set before things like continue or step commands. Further, the general_thread is updated after a target stops to reference the thread that stopped. The first thing to note from the above description is that we have a cycle of dependency, when a T packet arrives without a thread-id we fill in the thread-id from the general_thread data. The thread-id from the stop event is then used to set the general_thread. This in itself feels a little weird. The second question is why use the general_thread at all? You'd think given how they are originally set that the continue thread would be a better choice. The problem with this is that the continue_thread, if the user just does "continue", will be set to the minus_one_ptid, in the remote protocol this means all threads. When the stop arrives with no thread-id and we use continue_thread we end up with a very similar assertion to before because we now end up trying to lookup a thread using the minus_one_ptid. By contrast, once GDB has connected to a remote target the general_thread will be set to a valid thread-id, after which, if the target is single threaded, and stop events arrive without a thread-id, everything works fine. There is one slight weirdness with the above behaviour though. When GDB first connects to the remote target inferior_ptid is null_ptid, however, upon connecting we query the remote for its threads. As the thread information arrives GDB adds the threads to its internal database, and this process involves setting inferior_ptid to the id of each new thread in turn. Once we know about all the threads we wait for a stop event from the remote target to indicate that GDB is now in control of the target. The problem is that after adding the new threads we don't reset inferior_ptid, and the code path we use to wait for a stop event from the target also doesn't reset inferior_ptid, so it turns out that during the initial connection inferior_ptid is not null_ptid. This is lucky, because during the initial connection the general_thread variable _is_ set to null_ptid. So, during the initial connection, if the first stop event is missing a thread-id then we "provide" a thead-id from general_thread. This turns out to be null_ptid meaning no thread-id is known, and then during process_stop_reply we fill in the missing thread-id using inferior_ptid. This was all discussed on the mailing list here: https://sourceware.org/ml/gdb-patches/2020-02/msg01011.html My proposal for a fix then is: 1. Move the call to switch_to_inferior_no_thread into do_target_wait_1, this means that in all cases where we are waiting for an inferior the inferior_ptid will be set to null_ptid. This is good as no wait code should rely on inferior_ptid. 2. Remove the use of general_thread from the 'T' packet processing. The general_thread read here was only ever correct by chance, and we shouldn't be using it this way. 3. Remove use of inferior_ptid from process_stop_event as this is wrong, and will always be null_ptid now anyway. 4. When a stop_event has null_ptid due to a lack of thread-id (either from a T packet or an S packet) then pick the first non exited thread in the target and use that. This will be fine for single threaded targets. A multi-thread or multi-inferior aware remote target should be using T packets with a thread-id, so we give a warning if the target is multi-threaded, and we are still missing a thread-id. 5. Extend the existing test that covered the T packet with missing thread-id to also cover the S packet. gdb/ChangeLog: * remote.c (remote_target::remote_parse_stop_reply): Don't use the general_thread if the stop reply is missing a thread-id. (remote_target::process_stop_reply): Use the first non-exited thread if the target didn't pass a thread-id. * infrun.c (do_target_wait): Move call to switch_to_inferior_no_thread to .... (do_target_wait_1): ... here. gdb/testsuite/ChangeLog: * gdb.server/stop-reply-no-thread.exp: Add test where T packet is disabled.