This is the mail archive of the gdb-prs@sourceware.org mailing list for the GDB project.


[Bug threads/20743] can't usefully "continue" due to "ptrace: No such process" after gdb switches thread (gdb7.11.1 on FreeBSD 11)


https://sourceware.org/bugzilla/show_bug.cgi?id=20743

--- Comment #1 from misc-sourceware at talk2dom dot com ---
[also notifying FreeBSD port maintainer of this bug]

The crux of the issue seems to be resume_all_threads_cb() in fbsd-nat.c trying
to resume a thread that has already exited, which causes ptrace(PT_RESUME) to
fail with "No such process". (As a side note, it doesn't matter which thread is
current before the "continue" command, as gdb seems to switch to any newly
spawned thread - why is that?)

Exited threads can still be in the thread list when resume_all_threads_cb() is
called, e.g. when the thread that exited was the current thread (the one in
inferior_ptid).

To demonstrate this, add debugging output to resume_all_threads_cb() as
follows, so that it shows which thread it is about to resume and confirms which
call to ptrace() returns an error:

static int
resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  if (!ptid_match (tp->ptid, *filter))
    return 0;

  if (debug_fbsd_lwp)
    fprintf_unfiltered (gdb_stdlog,
                        "FLWP: PT_RESUME for ptid (%d, %ld, %ld)\n",
                        ptid_get_pid (tp->ptid), ptid_get_lwp (tp->ptid),
                        ptid_get_tid (tp->ptid));

  if (ptrace (PT_RESUME, ptid_get_lwp (tp->ptid), NULL, 0) == -1)
    perror_with_name (("ptrace PT_RESUME"));
  return 0;
}


Now the debugging output looks like this:

(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 101576
[New LWP 101576 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101576.0 [LWP 101576 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 101576 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101576 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 101576
[LWP 101576 of process 35559 exited]
FLWP: adding thread for LWP 101586
[New LWP 101586 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101586.0 [LWP 101586 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101576 of process 35559 to LWP 101586 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101586 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
[Switching to LWP 101586 of process 35559]
0x000000080265aa10 in ?? () from /lib/libthr.so.3
ptrace PT_RESUME: No such process.
(gdb) 

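Note the PT_RESUME call for LWP 101576 in the last few lines: that thread has
already exited (see the "deleting thread" line earlier in the output), yet gdb
still tries to resume it.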
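
The failure mode seen in the log can be modelled outside gdb. What follows is
only a minimal sketch, not gdb code: the model_* names, the two-state enum and
model_pt_resume() are hypothetical stand-ins for the thread list and for
ptrace(PT_RESUME). The one real-world fact it relies on is that ESRCH is the
errno behind the "No such process" message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for gdb's thread states; not the real enum.  */
enum model_state { MODEL_RUNNING, MODEL_EXITED };

/* Hypothetical, stripped-down thread list entry.  */
struct model_thread
{
  long lwp;
  enum model_state state;
};

/* Stand-in for ptrace (PT_RESUME, lwp, ...).  It fails with ESRCH when
   the LWP is gone, which this model derives from the entry's state.  */
static int
model_pt_resume (const struct model_thread *t)
{
  if (t->state == MODEL_EXITED)
    {
      errno = ESRCH;
      return -1;
    }
  return 0;
}

/* Resume every entry unconditionally, the way the unpatched callback
   does.  Returns the LWP of the first entry whose resume failed, or 0
   if every resume succeeded.  */
static long
model_resume_all (const struct model_thread *list, size_t n)
{
  assert (list != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (model_pt_resume (&list[i]) == -1)
      return list[i].lwp;

  return 0;
}
```

With a list holding 101201 (running), 101576 (exited) and 101586 (running),
model_resume_all() stops at 101576 with errno set to ESRCH, which mirrors the
order of the PT_RESUME lines in the log.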


This may not be the ideal fix, but as a workaround, change the top of
resume_all_threads_cb() to:

static int
resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  /* Don't resume an exited thread.  */
  if (tp->state == THREAD_EXITED)
    return 0;

[existing code, starting with the if (!ptid_match ...) check, continues from here]
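
The effect of that early return can be sketched in isolation. This is a
hedged, self-contained model and not a patch: the sketch_* names, the result
counters and sketch_fake_resume() are all hypothetical, and the iteration
helper only imitates the "return 0 to keep going" callback convention used by
resume_all_threads_cb().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gdb's thread states.  */
enum sketch_state { SKETCH_RUNNING, SKETCH_EXITED };

struct sketch_thread
{
  long lwp;
  enum sketch_state state;
};

/* Counters collected during one walk over the list.  */
struct sketch_result
{
  int resumed;          /* Successful (simulated) resumes.  */
  int stale_attempts;   /* Resumes attempted on exited entries.  */
};

/* Stand-in for ptrace (PT_RESUME, ...): fails for exited entries.  */
static int
sketch_fake_resume (const struct sketch_thread *tp, struct sketch_result *res)
{
  if (tp->state == SKETCH_EXITED)
    {
      res->stale_attempts++;
      return -1;
    }
  res->resumed++;
  return 0;
}

/* Callback with the workaround applied.  Returns 0 to continue the
   walk and nonzero to stop it.  */
static int
sketch_resume_cb (struct sketch_thread *tp, void *data)
{
  struct sketch_result *res = data;

  /* The workaround: skip exited entries before attempting a resume.  */
  if (tp->state == SKETCH_EXITED)
    return 0;

  /* In this model a failed resume simply stops the walk.  */
  return sketch_fake_resume (tp, res) == -1;
}

/* Minimal iterator that calls CB on each entry until it returns
   nonzero.  */
static void
sketch_iterate (struct sketch_thread *list, size_t n,
                int (*cb) (struct sketch_thread *, void *), void *data)
{
  assert (cb != NULL);

  for (size_t i = 0; i < n; i++)
    if (cb (&list[i], data) != 0)
      break;
}
```

Walking a list of 101201 (running), 100444 (exited) and 101642 (running) with
sketch_resume_cb() resumes the two live entries and never attempts the stale
one. Checking the state first keeps the decision inside the callback, at the
cost of trusting that the state has been updated by the time the walk runs.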

Output showing that the issue is worked around:

(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 100444
[New LWP 100444 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.100444.0 [LWP 100444 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 100444 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 100444 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 100444, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 100444
[LWP 100444 of process 35559 exited]
FLWP: adding thread for LWP 101642
[New LWP 101642 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun:   35559.101642.0 [LWP 101642 of process 35559],
infrun:   status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 100444 of process 35559 to LWP 101642 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101642 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101642, 0)
[...and so on...]
