I got this when running the gdb test case "gdb.threads/no-unwaited-for-left" with remote target(gdbserver). Everytime if I "set scheduler-locking on" in gdb and then resume from a thread, the other threads will also be resumed. I think the "on" mode should only resume the current thread.
scheduler-locking does work with gdbserver, otherwise schedlock.exp would fail. ... but only as long as the thread being debugged doesn't exit. This is the "TARGET_WAITKIND_NO_RESUMED / graceful leader-exits" point in http://sourceware.org/gdb/wiki/LocalRemoteFeatureParity The GDB side fix (that added TARGET_WAITKIND_NO_RESUMED and those testcases was <http://sourceware.org/ml/gdb-patches/2011-10/msg00704.html>. When the (only) running thread exits, I think gdbserver ends up resuming the whole inferior...
Sorry, the title really doesn't describe the problem accurately. And thanks Pedro for the wiki page, I missed that doc. In no-unwaited-for-left.exp, if scheduler-locking is set to be "on", then after the running thread exits, GDB server exits instead of the inferior is resumed(my former description is not exact either). I set a break point after the new thread exits and it doesn't execute to that code. I followed the code a bit and found that the last_status of gdbserver is TARGET_WAITKIND_EXITED after the thread exits, then the process is mourned.
Right. As I said, GDBserver ends up resuming all threads of the inferior if the single thread that had been resumed exits. In this case, after the inferior has been resumed, it naturally exits. And when the inferior exits, then GDBserver exits as well; that much is normal. The bit that decides to resume the inferior has this comment, in linux-low.c: /* If we were only supposed to resume one thread, only wait for that thread - if it's still alive. If it died, however - which can happen if we're coming from the thread death case below - then we need to make sure we restart the other threads. We could pick a thread at random or restart all; restarting all is less arbitrary. */ This is of course, totally bogus. The right fix it to add support for the new TARGET_WAITKIND_NO_RESUMED event type to the remote protocol, and teach gdbserver to use it.
The hangs themselves are now fixed on GDBserver mainline. https://sourceware.org/ml/gdb-patches/2014-02/msg00826.html GDBserver's Linux core now knows about TARGET_WAITKIND_NO_RESUMED, but I've left out adding that to the RSP ("[PATCH 5/5] Add TARGET_WAITKIND_NO_RESUMED support to the RSP" at https://sourceware.org/ml/gdb-patches/2014-01/msg00897.html), because I worried about races in non-stop mode. Assume N is the new stop reply for TARGET_WAITKIND_NO_RESUMED, like in the patch above and say, the traffic goes like this: (no thread was resumed before #1) GDB sends two steps in a row. #1 -> vCont;s:555 #2 <- OK (555 exits) #3 <- %Stop:N (no more resumed threads left) #4 -> vCont;s:666 #5 <- OK And now GDB goes and processes #3, and mistakenly believes no thread is left running on the target, while in fact, 666 is left running. Sounds like we need something else in addition to TARGET_WAITKIND_NO_RESUMED, maybe thread exit events (active only for stepped threads, by default, for efficiency). I'll leave this open until this is fully sorted out.
The master branch has been updated by Yao Qi <qiyao@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cafda5977a98aef514ff86daca2fa94205bdd34e commit cafda5977a98aef514ff86daca2fa94205bdd34e Author: Yao Qi <yao.qi@linaro.org> Date: Thu Apr 2 13:51:31 2015 +0100 kfail two tests in no-unwaited-for-left.exp for remote target I see these two fails in no-unwaited-for-left.exp in remote testing for aarch64-linux target. ... continue Continuing. warning: Remote failure reply: E.No unwaited-for children left. [Thread 1084] #2 stopped. (gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits .... continue Continuing. warning: Remote failure reply: E.No unwaited-for children left. [Thread 1081] #1 stopped. (gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits I checked the gdb.log on buildbot, and find that these two fails also appear on Debian-i686-native-extended-gdbserver and Fedora-ppc64be-native-gdbserver-m64. I recall that they are about local/remote parity, and related RSP is missing. There has been already a PR 14618 about it. This patch is to kfail them on remote target. gdb/testsuite: 2015-04-02 Yao Qi <yao.qi@linaro.org> * gdb.threads/no-unwaited-for-left.exp: Set up kfail if target is remote.
The master branch has been updated by Pedro Alves <palves@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f4836ba964a96364f39c7eab8b8b2f8656d14d05 commit f4836ba964a96364f39c7eab8b8b2f8656d14d05 Author: Pedro Alves <palves@redhat.com> Date: Mon Nov 30 16:05:24 2015 +0000 infrun: Fix TARGET_WAITKIND_NO_RESUMED handling in non-stop mode Running the testsuite against gdbserver with "maint set target-non-stop on" stumbled on a set of problems. See code comments for details. This handles my concerns expressed in PR14618. gdb/ChangeLog: 2015-11-30 Pedro Alves <palves@redhat.com> PR 14618 * infrun.c (handle_no_resumed): New function. (handle_inferior_event_1) <TARGET_WAITKIND_NO_RESUMED>: Defer to handle_no_resumed.
The master branch has been updated by Pedro Alves <palves@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f2faf941ae49653ff6e1485adfee299313d47c91 commit f2faf941ae49653ff6e1485adfee299313d47c91 Author: Pedro Alves <palves@redhat.com> Date: Mon Nov 30 16:05:25 2015 +0000 Implement TARGET_WAITKIND_NO_RESUMED in the remote protocol Testing with "maint set target-non-stop on" causes regressions in tests that rely on TARGET_WAITKIND_NO_RESUMED, which isn't modelled on the RSP. In real all-stop, gdbserver detects the situation and reporst error to GDB, and so the tests (e.g., gdb.threads/no-unwaited-for-left.exp) at fail quickly. But with "maint set target-non-stop on", GDB instead hangs forever waiting for a stop reply that never comes, and so the tests take longer to time out. This adds a new "N" stop reply packet that maps 1-1 to TARGET_WAITKIND_NO_RESUMED. gdb/ChangeLog: 2015-11-30 Pedro Alves <palves@redhat.com> PR 14618 * NEWS (New remote packets): Mention the N stop reply. * remote.c (remote_protocol_features): Add "no-resumed" entry. (remote_query_supported): Report no-resumed+ support. (remote_parse_stop_reply): Handle 'N'. (process_stop_reply): Handle TARGET_WAITKIND_NO_RESUMED. (remote_wait_as): Handle 'N' / TARGET_WAITKIND_NO_RESUMED. (_initialize_remote): Register "set/show remote no-resumed-stop-reply" commands. gdb/doc/ChangeLog: 2015-11-30 Pedro Alves <palves@redhat.com> PR 14618 * gdb.texinfo (Stop Reply Packets): Document the N stop reply. (Remote Configuration): Add the "set/show remote no-resumed-stop-reply" to the available settings table. (General Query Packets): Document the "no-resumed" qSupported feature. gdb/gdbserver/ChangeLog: 2015-11-30 Pedro Alves <palves@redhat.com> PR 14618 * linux-low.c (linux_wait_1): If the last resumed thread is gone, report TARGET_WAITKIND_NO_RESUMED. * remote-utils.c (prepare_resume_reply): Handle TARGET_WAITKIND_NO_RESUMED. * server.c (report_no_resumed): New global. (handle_query) <qSupported>: Handle "no-resumed+". Report "no-resumed+" support. (resume): When the target reports TARGET_WAITKIND_NO_RESUMED, only return error if the client doesn't support no-resumed events. (push_stop_notification): New function. (handle_target_event): Use it. Report TARGET_WAITKIND_NO_RESUMED events if the client supports them. gdb/testsuite/ChangeLog: 2015-11-30 Pedro Alves <palves@redhat.com> * gdb.threads/no-unwaited-for-left.exp: Remove setup_kfail calls.
This is now fixed. gdb.threads/no-unwaited-for-left.exp passes with gdbserver too now.