[PATCH] [RFC][AArch64] Fix some erroneous behavior in gdb.base/step-over-syscall.exp
Luis Machado
luis.machado@linaro.org
Fri Jan 10 19:54:00 GMT 2020
On 1/9/20 12:21 PM, Alan Hayward wrote:
>
>
>> On 30 Dec 2019, at 16:24, Luis Machado <luis.machado@linaro.org> wrote:
>>
>> I've been chasing/investigating this particular bug for a while now, and this
>> is a tentative patch to fix the problem.
>>
>> Although this may not be aarch64-specific, i can reproduce it consitently in
>> this particular architecture.
>>
>> In summary, this bug is a mix of random signal handling (SIGCHLD) and
>> fork/vfork handling when single-stepping.
>>
>> Forks and vforks are handled in 3 to 4 steps.
>>
>> Fork
>>
>> 1 - We get a PTRACE_EVENT_FORK event.
>> 2 - We single-step the inferior to get a SIGTRAP.
>> 3 - Inferior is ready to be resumed as we wish.
>>
>> Vfork
>>
>> 1 - We get a PTRACE_EVENT_VFORK event.
>> 2 - We continue the inferior to get a PTRACE_EVENT_VFORK_DONE event.
>> 3 - We single-step the inferior to get a SIGTRAP.
>> 4 - Inferior is ready to be resumed as we wish.
>>
>> The problem manifests itself when we are sitting at a syscall instruction and
>> we try to instruction-step past it. There may or may not be a breakpoint
>> inserted at the syscall instruction location.
>>
>> The expected outcome is that we step a single instruction and the resulting PC
>> points at the next instruction ($pc + 4 in aarch64).
>>
>> What i see is that we end up in $pc + 8, that is, one instruction further than
>> we should've been.
>>
>> This happens because, when forking/vforking, the child exits too quickly (in the
>> particular case of gdb.base/step-over-syscall.exp) and a SIGCHLD signal is
>> issued to the parent. Sometimes this signal arrives before GDB is done handling
>> the fork/vfork situation. Usually in step 2 for fork and step 3 for vfork.
>>
>> In turn, this causes GDB to handle that SIGCHLD as a random signal that must be
>> passed to the inferior, which is correct. But this particular code in infrun.c
>> doesn't cope with this particular situation and GDB inserts a HP step-resume
>> breakpoint and registers its wish for the inferior to be stepped yet again.
>>
>> Thus we end up in $pc + 8, even though we shouldn't have stepped again in this
>> particular situation.
>
> Iâve tried debugging this issue in the past, and didnât figure out the issue.
> This _sounds_ like a reasonable explanation of why.
>
>>
>> The delivery of SIGCHLD in between fork/vfork handling steps is sensitive to
>> timing. If we tweak things a little, enable debugging output or do any other
>> thing that affects the timing, we may not see this behavior.
>>
>> This bug doesn't show up as failures in gdb.base/step-over-syscall.exp due to
>> the way the test is written. As long as the PC is the same from the previous
>> test to the next, GDB thinks it is good. The proposed patch doesn't cause a
>> change in the number of PASSes/FAIL's.
>>
>
> Is it possible to add an explicit test to test for the issue? (possibly as a new
> .exp file).
I'll have to think about that. The timing-sensitive nature of this
problem is within the kernel. The kernel lets the child process run and
then notifies the parent. I can delay the child, but i can't delay the
kernel notifying the parent.
If i could defer the kernel notification to the parent enough so the
child exits meanwhile, that would work.
>
>
>> The proposed patch adds a variable waiting_for_fork_sigtrap that gets set just
>> before the last step of fork/vfork handling and gets reset when we end up
>> seeing that SIGTRAP we wanted.
>>
>> This variable is used in the random_signal handling of handle_signal_stop,
>> where there are a couple conditional blocks that handle signals reaching the
>> inferior at an unexpected time. If waiting_for_fork_sigtrap is set, we don't
>> set GDB to do the additional single-step it wants to do.
>>
>> I gave this a thought and tried to find a better place to put the check in,
>> but this seemed the most appropriate place to me.
>>
>> I also tried to solve this without having to introduce a new variable.
>> But it seems we can't distinguish between a random signal reaching GDB in
>> the middle of single-stepping and a random signal reaching GDB in the middle
>> of single-stepping AND handling fork/vfork.
>>
>> Thoughts?
>>
>> gdb/ChangeLog:
>>
>> 2019-12-30 Luis Machado <luis.machado@linaro.org>
>>
>> * inferior.h (class inferior) <waiting_for_fork_sigtrap>: New member.
>> Set to false by default.
>> * infrun.c (handle_inferior_event): Set waiting_for_fork_sigtrap when
>> a fork or vfork_done is detected.
>> (handle_signal_stop): Don't step again if waiting_for_fork_sigtrap is
>> true.
>> Reset waiting_for_fork_sigtrap to false when nexti/stepi is detected.
>>
>> Change-Id: I2849e0dc49ad9c0a026daa8ced4610aa0ddbe637
>> ---
>> gdb/inferior.h | 12 ++++++++++++
>> gdb/infrun.c | 31 +++++++++++++++++++++++++++++--
>> 2 files changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/gdb/inferior.h b/gdb/inferior.h
>> index 3bd9e8c3d7..2c23d7302b 100644
>> --- a/gdb/inferior.h
>> +++ b/gdb/inferior.h
>> @@ -447,6 +447,18 @@ public:
>> exits or execs. */
>> bool pending_detach = false;
>>
>> + /* True if we are still waiting for the final SIGTRAP when handling
>> + fork or vfork events. This is used in case a signal gets caught amidst
>> + handling fork and vfork events.
>> +
>> + For fork events we first get a PTRACE_EVENT_FORK, then SINGLESTEP the
>> + process and then get a SIGTRAP.
>> +
>> + For vfork events we first get a PTRACE_EVENT_VFORK, then we CONTINUE
>> + the process, get a PTRACE_EVENT_VFORK_DONE event, SINGLESTEP the process
>> + again and then get a SIGTRAP. */
>> + bool waiting_for_fork_sigtrap = false;
>> +
>
>
> What happens with a multithreaded process that forks/vforks? Should this bool be
> held per thread instead? (A test would be nice to have here).
>
>
I'll have to try that. Unlike waiting for vfork-done, where the parent
can't move until the child is done with the shared memory region,
waiting for this final SIGTRAP doesn't have this particular restriction.
Maybe Pedro has a better insight into this particular case.
>> /* True if this inferior is a vfork parent waiting for a vfork child
>> not under our control to be done with the shared memory region,
>> either by exiting or execing. */
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 7ddd21dd09..c3316a891a 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -4937,6 +4937,10 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
>> struct regcache *regcache = get_thread_regcache (ecs->event_thread);
>> struct gdbarch *gdbarch = regcache->arch ();
>>
>> + if (ecs->ws.kind == TARGET_WAITKIND_FORKED)
>> + /* We should see a SIGTRAP next time we stop. */
>> + current_inferior ()->waiting_for_fork_sigtrap = true;
>> +
>> /* If checking displaced stepping is supported, and thread
>> ecs->ptid is displaced stepping. */
>> if (displaced_step_in_progress_thread (ecs->event_thread))
>> @@ -5094,6 +5098,8 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
>>
>> context_switch (ecs);
>>
>> + /* We should see a SIGTRAP next time we stop. */
>> + current_inferior ()->waiting_for_fork_sigtrap = true;
>> current_inferior ()->waiting_for_vfork_done = 0;
>> current_inferior ()->pspace->breakpoints_not_allowed = 0;
>>
>> @@ -5912,7 +5918,15 @@ handle_signal_stop (struct execution_control_state *ecs)
>> "breakpoint\n");
>>
>> insert_hp_step_resume_breakpoint_at_frame (frame);
>> - ecs->event_thread->step_after_step_resume_breakpoint = 1;
>> +
>> + /* If we're still waiting for the final fork/vfork SIGTRAP and we
>> + got interrupted by a random signal, don't step further after
>> + returning from the signal handler. */
>> + if (current_inferior ()->waiting_for_fork_sigtrap)
>> + ecs->event_thread->step_after_step_resume_breakpoint = 0;
>> + else
>> + ecs->event_thread->step_after_step_resume_breakpoint = 1;
>> +
>
> Simpler to write as this?
> ecs->event_thread->step_after_step_resume_breakpoint = !current_inferior ()->waiting_for_fork_sigtrap;
>
>
I'll change it.
>> /* Reset trap_expected to ensure breakpoints are re-inserted. */
>> ecs->event_thread->control.trap_expected = 0;
>>
>> @@ -5947,7 +5961,15 @@ handle_signal_stop (struct execution_control_state *ecs)
>>
>> clear_step_over_info ();
>> insert_hp_step_resume_breakpoint_at_frame (frame);
>> - ecs->event_thread->step_after_step_resume_breakpoint = 1;
>> +
>> + /* If we're still waiting for the final fork/vfork SIGTRAP and we
>> + got interrupted by a random signal, don't step further after
>> + returning from the signal handler. */
>> + if (current_inferior ()->waiting_for_fork_sigtrap)
>> + ecs->event_thread->step_after_step_resume_breakpoint = 0;
>> + else
>> + ecs->event_thread->step_after_step_resume_breakpoint = 1;
>> +
>
> Ditto.
>
I'll change it as well.
More information about the Gdb-patches
mailing list