[PATCH] [RFC][AArch64] Fix some erroneous behavior in gdb.base/step-over-syscall.exp

Thu Jan 9 15:21:00 GMT 2020

> On 30 Dec 2019, at 16:24, Luis Machado <luis.machado@linaro.org> wrote:
> 
> I've been chasing/investigating this particular bug for a while now, and this
> is a tentative patch to fix the problem.
> 
> Although this may not be aarch64-specific, i can reproduce it consitently in
> this particular architecture.
> 
> In summary, this bug is a mix of random signal handling (SIGCHLD) and
> fork/vfork handling when single-stepping.
> 
> Forks and vforks are handled in 3 to 4 steps.
> 
> Fork
> 
> 1 - We get a PTRACE_EVENT_FORK event.
> 2 - We single-step the inferior to get a SIGTRAP.
> 3 - Inferior is ready to be resumed as we wish.
> 
> Vfork
> 
> 1 - We get a PTRACE_EVENT_VFORK event.
> 2 - We continue the inferior to get a PTRACE_EVENT_VFORK_DONE event.
> 3 - We single-step the inferior to get a SIGTRAP.
> 4 - Inferior is ready to be resumed as we wish.
> 
> The problem manifests itself when we are sitting at a syscall instruction and
> we try to instruction-step past it. There may or may not be a breakpoint
> inserted at the syscall instruction location.
> 
> The expected outcome is that we step a single instruction and the resulting PC
> points at the next instruction ($pc + 4 in aarch64).
> 
> What i see is that we end up in $pc + 8, that is, one instruction further than
> we should've been.
> 
> This happens because, when forking/vforking, the child exits too quickly (in the
> particular case of gdb.base/step-over-syscall.exp) and a SIGCHLD signal is
> issued to the parent. Sometimes this signal arrives before GDB is done handling
> the fork/vfork situation. Usually in step 2 for fork and step 3 for vfork.
> 
> In turn, this causes GDB to handle that SIGCHLD as a random signal that must be
> passed to the inferior, which is correct. But this particular code in infrun.c
> doesn't cope with this particular situation and GDB inserts a HP step-resume
> breakpoint and registers its wish for the inferior to be stepped yet again.
> 
> Thus we end up in $pc + 8, even though we shouldn't have stepped again in this
> particular situation.

I’ve tried debugging this issue in the past, and didn’t figure out the issue.
This _sounds_ like a reasonable explanation of why.

> 
> The delivery of SIGCHLD in between fork/vfork handling steps is sensitive to
> timing. If we tweak things a little, enable debugging output or do any other
> thing that affects the timing, we may not see this behavior.
> 
> This bug doesn't show up as failures in gdb.base/step-over-syscall.exp due to
> the way the test is written.  As long as the PC is the same from the previous
> test to the next, GDB thinks it is good. The proposed patch doesn't cause a
> change in the number of PASSes/FAIL's.
> 

Is it possible to add an explicit test to test for the issue? (possibly as a new
.exp file).

> The proposed patch adds a variable waiting_for_fork_sigtrap that gets set just
> before the last step of fork/vfork handling and gets reset when we end up
> seeing that SIGTRAP we wanted.
> 
> This variable is used in the random_signal handling of handle_signal_stop,
> where there are a couple conditional blocks that handle signals reaching the
> inferior at an unexpected time. If waiting_for_fork_sigtrap is set, we don't
> set GDB to do the additional single-step it wants to do.
> 
> I gave this a thought and tried to find a better place to put the check in,
> but this seemed the most appropriate place to me.
> 
> I also tried to solve this without having to introduce a new variable.
> But it seems we can't distinguish between a random signal reaching GDB in
> the middle of single-stepping and a random signal reaching GDB in the middle
> of single-stepping AND handling fork/vfork.
> 
> Thoughts?
> 
> gdb/ChangeLog:
> 
> 2019-12-30  Luis Machado  <luis.machado@linaro.org>
> 
> 	* inferior.h (class inferior) <waiting_for_fork_sigtrap>: New member.
> 	Set to false by default.
> 	* infrun.c (handle_inferior_event): Set waiting_for_fork_sigtrap when
> 	a fork or vfork_done is detected.
> 	(handle_signal_stop): Don't step again if waiting_for_fork_sigtrap is
> 	true.
> 	Reset waiting_for_fork_sigtrap to false when nexti/stepi is detected.
> 
> Change-Id: I2849e0dc49ad9c0a026daa8ced4610aa0ddbe637
> ---
> gdb/inferior.h | 12 ++++++++++++
> gdb/infrun.c   | 31 +++++++++++++++++++++++++++++--
> 2 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/gdb/inferior.h b/gdb/inferior.h
> index 3bd9e8c3d7..2c23d7302b 100644
> --- a/gdb/inferior.h
> +++ b/gdb/inferior.h
> @@ -447,6 +447,18 @@ public:
>      exits or execs.  */
>   bool pending_detach = false;
> 
> +  /* True if we are still waiting for the final SIGTRAP when handling
> +     fork or vfork events.  This is used in case a signal gets caught amidst
> +     handling fork and vfork events.
> +
> +     For fork events we first get a PTRACE_EVENT_FORK, then SINGLESTEP the
> +     process and then get a SIGTRAP.
> +
> +     For vfork events we first get a PTRACE_EVENT_VFORK, then we CONTINUE
> +     the process, get a PTRACE_EVENT_VFORK_DONE event, SINGLESTEP the process
> +     again and then get a SIGTRAP.  */
> +  bool waiting_for_fork_sigtrap = false;
> +

What happens with a multithreaded process that forks/vforks? Should this bool be
held per thread instead? (A test would be nice to have here).

>   /* True if this inferior is a vfork parent waiting for a vfork child
>      not under our control to be done with the shared memory region,
>      either by exiting or execing.  */
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 7ddd21dd09..c3316a891a 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4937,6 +4937,10 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
> 	struct regcache *regcache = get_thread_regcache (ecs->event_thread);
> 	struct gdbarch *gdbarch = regcache->arch ();
> 
> +	if (ecs->ws.kind == TARGET_WAITKIND_FORKED)
> +	  /* We should see a SIGTRAP next time we stop.  */
> +	  current_inferior ()->waiting_for_fork_sigtrap = true;
> +
> 	/* If checking displaced stepping is supported, and thread
> 	   ecs->ptid is displaced stepping.  */
> 	if (displaced_step_in_progress_thread (ecs->event_thread))
> @@ -5094,6 +5098,8 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
> 
>       context_switch (ecs);
> 
> +      /* We should see a SIGTRAP next time we stop.  */
> +      current_inferior ()->waiting_for_fork_sigtrap = true;
>       current_inferior ()->waiting_for_vfork_done = 0;
>       current_inferior ()->pspace->breakpoints_not_allowed = 0;
> 
> @@ -5912,7 +5918,15 @@ handle_signal_stop (struct execution_control_state *ecs)
>                                 "breakpoint\n");
> 
> 	  insert_hp_step_resume_breakpoint_at_frame (frame);
> -	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
> +
> +	  /* If we're still waiting for the final fork/vfork SIGTRAP and we
> +	     got interrupted by a random signal, don't step further after
> +	     returning from the signal handler.  */
> +	  if (current_inferior ()->waiting_for_fork_sigtrap)
> +	    ecs->event_thread->step_after_step_resume_breakpoint = 0;
> +	  else
> +	    ecs->event_thread->step_after_step_resume_breakpoint = 1;
> +

Simpler to write as this?
ecs->event_thread->step_after_step_resume_breakpoint = !current_inferior ()->waiting_for_fork_sigtrap;

> 	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
> 	  ecs->event_thread->control.trap_expected = 0;
> 
> @@ -5947,7 +5961,15 @@ handle_signal_stop (struct execution_control_state *ecs)
> 
> 	  clear_step_over_info ();
> 	  insert_hp_step_resume_breakpoint_at_frame (frame);
> -	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
> +
> +	  /* If we're still waiting for the final fork/vfork SIGTRAP and we
> +	     got interrupted by a random signal, don't step further after
> +	     returning from the signal handler.  */
> +	  if (current_inferior ()->waiting_for_fork_sigtrap)
> +	    ecs->event_thread->step_after_step_resume_breakpoint = 0;
> +	  else
> +	    ecs->event_thread->step_after_step_resume_breakpoint = 1;
> +

Ditto.

> 	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
> 	  ecs->event_thread->control.trap_expected = 0;
> 	  keep_going (ecs);
> @@ -6691,6 +6713,11 @@ process_event_stop_test (struct execution_control_state *ecs)
>          one instruction.  */
>       if (debug_infrun)
> 	 fprintf_unfiltered (gdb_stdlog, "infrun: stepi/nexti\n");
> +
> +      /* If we were waiting for a final fork/vfork SIGTRAP after a
> +	 single-step, this should be it.  Clear the flag.  */
> +      if (current_inferior ()->waiting_for_fork_sigtrap)
> +	current_inferior ()->waiting_for_fork_sigtrap = false;
>       end_stepping_range (ecs);
>       return;
>     }
> -- 
> 2.17.1
>