This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
- From: Yao Qi <qiyaoltc at gmail dot com>
- To: Pedro Alves <palves at redhat dot com>
- Cc: Yao Qi <qiyaoltc at gmail dot com>, gdb-patches at sourceware dot org
- Date: Wed, 05 Aug 2015 12:41:33 +0100
- Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
- Authentication-results: sourceware.org; auth=none
- References: <1438362229-27653-1-git-send-email-palves at redhat dot com> <1438362229-27653-3-git-send-email-palves at redhat dot com> <86a8u8e754 dot fsf at gmail dot com> <55BF94BC dot 6090904 at redhat dot com> <55C0EB12 dot 9050209 at redhat dot com>
Pedro Alves <palves@redhat.com> writes:
> I've now extended the test to run in a few different modes, along
> a couple different axis. One axis is with/without conditional
> breakpoints on the target enabled. That exposes the same fails you
> saw on arm, on x86 gdbserver as well. And then the other axis is
> with/without joining _all_ threads before exiting. If we gracefully
> terminate the breakpoint thread (new mode), then the test should be
> passing everywhere, because what fails is gdb's handling of the inferior
> disappearing while a thread is stopped (and being inspected).
> Therefore that new mode is not kfailed.
>
> For testing convenience, I've pushed this along with the previous patch
> to the users/palves/gdbserver-fork-issues branch on sourceware.org.
> Let me know if this works for you.
Thanks, Pedro. There are no fails on arm-linux with GDBserver. Some
comments on your patch below,
> @@ -3089,8 +3142,25 @@ linux_wait_1 (ptid_t ptid,
> info_p = &info;
> else
> info_p = NULL;
> - linux_resume_one_lwp (event_child, event_child->stepping,
> - WSTOPSIG (w), info_p);
> +
> + if (step_over_finished)
> + {
> + /* We cancelled this thread's step-over above. We still
> + need to unsuspend all other LWPs, and set then back
s/set then/set them/?
> + running again while the signal handler runs. */
> + unsuspend_all_lwps (event_child);
> +
> + /* Enqueue the pending signal info so that proceed_all_lwps
> + doesn't lose it. */
> + enqueue_pending_signal (event_child, WSTOPSIG (w), info_p);
> +
> + proceed_all_lwps ();
> + }
> + else
> + {
> + linux_resume_one_lwp (event_child, event_child->stepping,
> + WSTOPSIG (w), info_p);
> + }
> return ignore_event (ourstatus);
> }
>
> @@ -3111,8 +3181,15 @@ linux_wait_1 (ptid_t ptid,
> || (current_thread->last_resume_kind == resume_step
> && !in_step_range)
> || event_child->stop_reason == TARGET_STOPPED_BY_WATCHPOINT
> - || (!step_over_finished && !in_step_range
> - && !bp_explains_trap && !trace_event)
> + || (!in_step_range
> + && !bp_explains_trap
> + && !trace_event
> + /* A step-over was finished just now? */
> + && !step_over_finished
> + /* A step-over had been finished previously,
> + and the single-step was left pending? */
> + && !(current_thread->last_resume_kind == resume_continue
> + && event_child->stop_reason == TARGET_STOPPED_BY_SINGLE_STEP))
> || (gdb_breakpoint_here (event_child->stop_pc)
I don't fully understand this, what is a case that "step-over had been
finished previously, but the single-step was left pending"?
> (linux_wait_1): If passing a signal to the inferior after
> finishing a step-over, unsuspend and re-resume all lwps. If we
> see a single-step event but the thread should be continuing, don't
> pass the trap to gdb.
however, the explanations in ChangeLog look more reasonable.
> && gdb_condition_true_at_breakpoint (event_child->stop_pc)
> && gdb_no_commands_at_breakpoint (event_child->stop_pc))
> @@ -4189,16 +4298,36 @@ static int
> start_step_over (struct lwp_info *lwp)
> {
> struct thread_info *thread = get_lwp_thread (lwp);
> + ptid_t thread_ptid;
> struct thread_info *saved_thread;
> CORE_ADDR pc;
> int step;
>
> + thread_ptid = ptid_of (thread);
> +
> if (debug_threads)
> debug_printf ("Starting step-over on LWP %ld. Stopping all threads\n",
> lwpid_of (thread));
>
> stop_all_lwps (1, lwp);
> - gdb_assert (lwp->suspended == 0);
> +
> + /* Re-find the LWP as it may have exited. */
> + lwp = find_lwp_pid (thread_ptid);
> + if (lwp == NULL || lwp_is_marked_dead (lwp))
> + {
> + if (debug_threads)
> + debug_printf ("Step-over thread died "
> + "(another thread exited the process?).\n");
> + unstop_all_lwps (1, lwp);
> + return 0;
> + }
> +
> + if (lwp->suspended != 0)
> + {
> + internal_error (__FILE__, __LINE__,
> + "LWP %ld suspended=%d\n", lwpid_of (thread),
> + lwp->suspended);
> + }
>
> if (debug_threads)
> debug_printf ("Done stopping all threads for step-over.\n");
> @@ -4229,7 +4358,19 @@ start_step_over (struct lwp_info *lwp)
>
> current_thread = saved_thread;
>
> - linux_resume_one_lwp (lwp, step, 0, NULL);
> + TRY
> + {
> + linux_resume_one_lwp_throw (lwp, step, 0, NULL);
> + }
IIUC, we do TRY/CATCH here because thread may have exited, but we've
done that in this function (start_step_over), do we still need to worry
about these exited threads?
> + CATCH (ex, RETURN_MASK_ERROR)
> + {
> + unstop_all_lwps (1, lwp);
> +
> + if (!check_ptrace_stopped_lwp_gone (lwp))
I am thinking about the order of these two function calls. Does the
order matter here? Probably we need to check_ptrace_stopped_lwp_gone
first before unstop_all_lwps.
> + throw_exception (ex);
> + return 0;
> + }
> + END_CATCH
>
> /* Require next event from this LWP. */
> step_over_bkpt = thread->entry.id;
> @@ -4270,6 +4411,39 @@ finish_step_over (struct lwp_info *lwp)
> return 0;
> }
>
> +/* If there's a step over in progress, wait until all threads stop
> + (that is, until the stepping thread finishes its step), and
> + unsuspend all lwps. The stepping thread ends with its status
> + pending, which is processed later when we get back to processing
> + events. */
> +
> +static void
> +complete_ongoing_step_over (void)
> +{
> + if (!ptid_equal (step_over_bkpt, null_ptid))
> + {
> + struct lwp_info *lwp;
> + int wstat;
> + int ret;
> +
> + if (debug_threads)
> + debug_printf ("detach: step over in progress, finish it first\n");
"detach:" in the debug output looks obsolete.
--
Yao (éå)