FreeBSD kitten 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0 r306420: Thu Sep 29 01:43:23 UTC 2016 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

GNU gdb (GDB) 7.11.1 [GDB v7.11.1 for FreeBSD]

Attach to a running, multi-threaded process using --pid
Set breakpoint
Continue (expecting process to run until breakpoint hit)
For some reason gdb stops with "ptrace: no such process"
Continue (again)
Repeat

(gdb) inf thr
  Id   Target Id         Frame
* 1    LWP 100638 of process 22406 0x000000080326e0da in _poll () from /lib/libc.so.7
(gdb) b blockchain_monitor.cpp:187
Breakpoint 1 at 0x4069df: file blockchain_monitor.cpp, line 187.
(gdb) c
Continuing.
[New LWP 101613 of process 22406]
[LWP 101613 of process 22406 exited]
[New LWP 101614 of process 22406]
[Switching to LWP 101614 of process 22406]
0x0000000802640a10 in ?? () from /lib/libthr.so.3
ptrace: No such process.
(gdb) inf thr
  Id   Target Id         Frame
  1    LWP 100638 of process 22406 0x000000080326e0da in _poll () from /lib/libc.so.7
* 3    LWP 101614 of process 22406 0x0000000802640a10 in ?? () from /lib/libthr.so.3
(gdb) bt
#0  0x0000000802640a10 in ?? () from /lib/libthr.so.3
#1  0x00007fffdf9fc000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfbfc000
(gdb) c
Continuing.
[LWP 101614 of process 22406 exited]
[New LWP 101028 of process 22406]
[Switching to LWP 101028 of process 22406]
0x0000000802640a10 in ?? () from /lib/libthr.so.3
ptrace: No such process.
(gdb) c
Continuing.
[LWP 101028 of process 22406 exited]
[New LWP 100112 of process 22406]
[Switching to LWP 100112 of process 22406]
0x0000000802640a10 in ?? () from /lib/libthr.so.3
ptrace: No such process.
(gdb)
[...and so on...]

Here's a more detailed continue using the same process as above:

(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: deleting thread for LWP 100482
[LWP 100482 of process 22406 exited]
FLWP: adding thread for LWP 100841
[New LWP 100841 of process 22406]
FLWP: fbsd_resume for ptid (-1, 0, 0)
[Switching to LWP 100841 of process 22406]
0x0000000802640a10 in ?? () from /lib/libthr.so.3
ptrace: No such process.
(gdb) set debug infrun 1
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 100638 of process 22406)
infrun: clear_proceed_status_thread (LWP 100841 of process 22406)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 100841 of process 22406] at 0x802640a10
FLWP: fbsd_resume for ptid (-1, 0, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 100841
[LWP 100841 of process 22406 exited]
FLWP: adding thread for LWP 100499
[New LWP 100499 of process 22406]
infrun: target_wait (-1.0.0, status) =
infrun:   22406.100499.0 [LWP 100499 of process 22406],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 100841 of process 22406 to LWP 100499 of process 22406
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 100499 of process 22406] at 0x802640a10
FLWP: fbsd_resume for ptid (-1, 0, 0)
[Switching to LWP 100499 of process 22406]
0x0000000802640a10 in ?? () from /lib/libthr.so.3
ptrace: No such process.
(gdb)
[also notifying FreeBSD port maintainer of this bug]

The crux of the issue seems to be resume_all_threads_cb() in fbsd-nat.c trying to resume a thread that has exited. This causes ptrace(PT_RESUME) to fail with "no such process".

(As a side-note, it doesn't matter which thread is current before "continue" command as gdb seems to switch to any new thread spawned - why is that?)

Exited threads are still in the thread list when resume_all_threads_cb() is called, e.g. if the current thread (in inferior_ptid) exits.

To demonstrate this, change resume_all_threads_cb() to add debugging as follows so it shows which thread it's about to resume and to confirm which call to ptrace() returns an error:

static int
resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  if (!ptid_match (tp->ptid, *filter))
    return 0;

  if (debug_fbsd_lwp)
    fprintf_unfiltered (gdb_stdlog,
                        "FLWP: PT_RESUME for ptid (%d, %ld, %ld)\n",
                        ptid_get_pid (tp->ptid),
                        ptid_get_lwp (tp->ptid),
                        ptid_get_tid (tp->ptid));

  if (ptrace (PT_RESUME, ptid_get_lwp (tp->ptid), NULL, 0) == -1)
    perror_with_name (("ptrace PT_RESUME"));
  return 0;
}

Now the debugging output looks like this:

(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 101576
[New LWP 101576 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101576.0 [LWP 101576 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 101576 of process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 101576 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 101576
[LWP 101576 of process 35559 exited]
FLWP: adding thread for LWP 101586
[New LWP 101586 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101586.0 [LWP 101586 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101576 of process 35559 to LWP 101586 of process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 101586 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
[Switching to LWP 101586 of process 35559]
0x000000080265aa10 in ?? () from /lib/libthr.so.3
ptrace PT_RESUME: No such process.
(gdb)

Note the last few lines showing a call for LWP 101576 - a thread that has exited.
This may not be the ideal fix but as a work-around change the top of resume_all_threads_cb() to:

resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  /* don't resume an exited thread */
  if (tp->state == THREAD_EXITED)
    return 0;

  [existing code, starting with if() continues from here]

Output showing issue is worked-around:

(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 100444
[New LWP 100444 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.100444.0 [LWP 100444 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 100444 of process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 100444 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 100444, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 100444
[LWP 100444 of process 35559 exited]
FLWP: adding thread for LWP 101642
[New LWP 101642 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101642.0 [LWP 101642 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 100444 of process 35559 to LWP 101642 of process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 101642 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101642, 0)
[...and so on...]
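For anyone who wants to see the failure mode and the work-around in isolation, here is a minimal stand-alone sketch. It is a model, not gdb code: the struct, the enum and both functions are simplified stand-ins I made up for illustration, and mock_pt_resume() only imitates what ptrace(PT_RESUME) appears to do for an LWP the kernel no longer knows about (fail with ESRCH, i.e. "No such process").

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for gdb's thread bookkeeping; not the real types.  */
enum thread_state { THREAD_STOPPED, THREAD_RUNNING, THREAD_EXITED };

struct thread_info
{
  long lwp;
  enum thread_state state;
  struct thread_info *next;
};

/* Hypothetical mock of ptrace (PT_RESUME, lwp, ...): an exited LWP is
   unknown to the kernel, so the call fails with ESRCH.  */
static int
mock_pt_resume (const struct thread_info *tp)
{
  if (tp->state == THREAD_EXITED)
    {
      errno = ESRCH;
      return -1;
    }
  return 0;
}

/* Walk the list and resume each thread.  With SKIP_EXITED nonzero, apply
   the work-around check before touching the thread.  Returns the number
   of threads resumed, or -1 on the first failed resume.  */
static int
resume_all_threads (struct thread_info *list, int skip_exited)
{
  int resumed = 0;

  for (struct thread_info *tp = list; tp != NULL; tp = tp->next)
    {
      /* don't resume an exited thread */
      if (skip_exited && tp->state == THREAD_EXITED)
        continue;

      if (mock_pt_resume (tp) == -1)
        return -1;
      resumed++;
    }
  return resumed;
}
```

With a list holding LWPs 101201 (stopped), 101576 (exited) and 101586 (stopped), resume_all_threads (list, 0) fails on the second entry, while resume_all_threads (list, 1) skips it and resumes the other two.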
I can confirm this. I have a similar patch, but it uses is_exited(), patches both of the callbacks, and asserts that we never try to do a single-resume of an exited thread.

The reason gdb switches to new threads when they are created is that we always report a stop when a new thread arrives. I could change this to only add the thread without reporting a stop, but that gets kind of messy if you are single-stepping across thread creation: in theory I would need to cache that info down in the fbsd nat layer and PT_SUSPEND the new thread before doing my own PT_CONTINUE.
The master branch has been updated by John Baldwin <jhb@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d56060f08aa4ed5786042a066f62aa8e474cc0fd

commit d56060f08aa4ed5786042a066f62aa8e474cc0fd
Author: John Baldwin <jhb@FreeBSD.org>
Date:   Tue Apr 18 09:44:32 2017 -0700

    PR threads/20743: Don't attempt to suspend or resume exited threads.

    When resuming a native FreeBSD process, ignore exited threads when
    suspending/resuming individual threads prior to continuing the process.

    gdb/ChangeLog:

        PR threads/20743
        * fbsd-nat.c (resume_one_thread_cb): Remove.
        (resume_all_threads_cb): Remove.
        (fbsd_resume): Use ALL_NON_EXITED_THREADS instead of
        iterate_over_threads.
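To make the shape of that change easier to picture, here is a rough stand-alone sketch of filtering exited threads in the iteration itself rather than in per-thread callbacks. Everything in it is an assumption for illustration: FOR_EACH_NON_EXITED_THREAD is a hypothetical macro that only loosely imitates the idea behind ALL_NON_EXITED_THREADS, the `suspended` flag stands in for the effect of PT_SUSPEND/PT_RESUME, and prepare_resume() is not the committed fbsd_resume().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for gdb's thread bookkeeping; not the real types.  */
enum thread_state { THREAD_STOPPED, THREAD_RUNNING, THREAD_EXITED };

struct thread_info
{
  long lwp;
  enum thread_state state;
  int suspended;                /* models the PT_SUSPEND / PT_RESUME state */
  struct thread_info *next;
};

/* Hypothetical filtering iterator: visits every list entry whose state
   is not THREAD_EXITED.  */
#define FOR_EACH_NON_EXITED_THREAD(tp, list)             \
  for ((tp) = (list); (tp) != NULL; (tp) = (tp)->next)   \
    if ((tp)->state != THREAD_EXITED)

/* Prepare threads before continuing the process.  TARGET_LWP == -1
   resumes every live thread; otherwise only TARGET_LWP is resumed and all
   other live threads are suspended.  Exited threads are never touched.
   Returns the number of live threads visited.  */
static int
prepare_resume (struct thread_info *list, long target_lwp)
{
  struct thread_info *tp;
  int visited = 0;

  if (target_lwp != -1)
    {
      /* A single-resume of an exited thread should never be requested.  */
      for (tp = list; tp != NULL; tp = tp->next)
        if (tp->lwp == target_lwp)
          assert (tp->state != THREAD_EXITED);
    }

  FOR_EACH_NON_EXITED_THREAD (tp, list)
    {
      tp->suspended = (target_lwp != -1 && tp->lwp != target_lwp);
      visited++;
    }
  return visited;
}
```

Putting the filter in the loop means neither the resume-all path nor the single-thread path can forget the check, which is the appeal over patching each callback separately.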
The gdb-8.0-branch branch has been updated by John Baldwin <jhb@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=24b03ea864424cf8482ba07fb074389aa759e592

commit 24b03ea864424cf8482ba07fb074389aa759e592
Author: John Baldwin <jhb@FreeBSD.org>
Date:   Tue Apr 18 09:44:32 2017 -0700

    PR threads/20743: Don't attempt to suspend or resume exited threads.

    When resuming a native FreeBSD process, ignore exited threads when
    suspending/resuming individual threads prior to continuing the process.

    gdb/ChangeLog:

        PR threads/20743
        * fbsd-nat.c (resume_one_thread_cb): Remove.
        (resume_all_threads_cb): Remove.
        (fbsd_resume): Use ALL_NON_EXITED_THREADS instead of
        iterate_over_threads.
Fix committed to master and the 8.0 branch; it will appear in the 8.0 release.