This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Strange behavior of sigstep-threads.exp?


Pedro Alves <palves@redhat.com> writes:

> Whenever each of those single-steps is done, the other thread (let's
> call it thread #2) is allowed to run free (the "set scheduler-locking
> off" setting).

Got it.  Seems I misunderstood the "step" behavior in this case.  Thanks
for the explanation.

>> On s390x the test case actually fails sometimes.  In those cases,
>> when stepping from step-1 to step-2, a ton of SIGUSR1 are indicated,
>> and then the inferior seems to stop at the closing brace of the
>> handler() function instead of the tgkill().
>
> That does sound like something's wrong.  Hacking the test to force
> "set debug infrun 1" and "set debug lin-lwp 1" would be my first move.

As soon as "lin-lwp" debugging is turned on the test always seems to
succeed.  But with "debug infrun" alone the failure still occurs, and I
observe the following:

1. The stepped thread reaches the last instruction inside the stepping
range.

2. After resuming the stepped thread again, it traps at getpid@plt,
which is curious because getpid() shouldn't be called until the
instruction _after_ the stepping range.  It seems like the trap for that
instruction was missed somehow.  (In the good case the thread always
traps at the subroutine call, before having carried out the call.)

3. The thread is single-stepped until the jump to getpid().  The
getpid() invocation itself is skipped with a step-resume breakpoint on
the instruction after the original subroutine call.

4. The step-resume breakpoint is reached.  Despite now being well
outside the original stepping range, the thread is resumed.  Upon the
next trap, an updated stepping range is shown, adjusted to fit the line
of the tgkill().  Then stepping continues until the next line, which is
the closing brace.
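
To make the four observations easier to follow, here is a minimal toy
model of the decision taken after each trap while range-stepping.  It
is only a sketch of my understanding, not the code in infrun.c: the
names (step_action, line_range, decide) and the addresses in the usage
note are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; a toy model, not GDB's infrun.c.  */

enum step_action
{
  KEEP_STEPPING,     /* Still inside the stepping range: step again.  */
  STEP_OVER_CALL,    /* In a callee: run to a step-resume breakpoint.  */
  REFRESH_RANGE,     /* Mid-line in another line: adopt its range.  */
  STOP_AT_NEW_LINE   /* At the first instruction of another line.  */
};

struct line_range
{
  uintptr_t start;   /* First address of the line (inclusive).  */
  uintptr_t end;     /* One past the last address (exclusive).  */
};

bool
pc_in_range (uintptr_t pc, struct line_range r)
{
  return pc >= r.start && pc < r.end;
}

/* Decide what to do after a trap at PC.  STEP is the range being
   stepped, HERE is the range of the line containing PC, and IN_CALLEE
   says whether PC lies in a function called from the stepped one.  */
enum step_action
decide (uintptr_t pc, struct line_range step, struct line_range here,
        bool in_callee)
{
  if (pc_in_range (pc, step))
    return KEEP_STEPPING;
  if (in_callee)
    return STEP_OVER_CALL;
  if (pc == here.start)
    return STOP_AT_NEW_LINE;
  return REFRESH_RANGE;
}
```

With an assumed stepping range of [0x100, 0x110) and the tgkill() line
at [0x110, 0x130), a trap at 0x110 yields STOP_AT_NEW_LINE (the good
case).  If that trap is missed and the thread is first seen inside the
callee, the model yields STEP_OVER_CALL, and the later stop at the
return address (say 0x116, in the middle of the line) yields
REFRESH_RANGE, so stepping carries on to the closing brace as in
observation 4.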

> I wonder if this makes a difference?
>
>  gdb/infrun.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index de2cf19..9621b84 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4698,8 +4698,10 @@ process_event_stop_test:
>  	  ecs->event_thread = tp;
>  	  ecs->ptid = tp->ptid;
>  	  context_switch (ecs->ptid);
> -	  keep_going (ecs);
> -	  return;
> +
> +	  /* Keep checking.  The stepped thread might have already
> +	     reached its destination, but not have reported it yet.
> +	     If we just kept going, we could end up overstepping.  */
>  	}
>      }

Yes, it does make a difference.  The test case still fails at a similar
rate as before, but this time after "continue", because the inferior
reaches "assert (0)".  Again, I can not reproduce this failure with "set
debug infrun 1" and "set debug lin-lwp 1".

