This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Debugging return.exp on ARM


On 16-05-26 03:11 PM, Pedro Alves wrote:

Thanks for the suggestions.

> - I'd suspect something odd with caches / barriers too.
>   Did you try sprinkling in memory barrier instructions, and
>   see whether it makes a difference?

I tried adding dmb instructions in various places; it didn't help.

> - I'd also try "si" + "info regs" instead of "next" after the return,
>   and see if a register with a bad value pops up always at some
>   specific instruction.

Good point.

If I replace next with si, only the vmov.f64 d7, d0 gets executed.  So if everything
goes well, I should have the "right" value in both d0 and d7.  I made a more
focused reproducer, see below.

> - I'd try to see if pinning the thread to a core makes a difference.

Indeed, pinning GDB to a single CPU makes it work (as in, the result is right) every time.
As far as I can tell, pinning the inferior has no effect (I am not sure it worked, but I
used "set exec-wrapper taskset 0xffffffff" to reset the affinity).
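
For reference, pinning GDB itself could be done along these lines.  This is only a
sketch: it assumes taskset from util-linux is installed and that CPU 0 is online, and
the helper name run_pinned is made up for illustration.

```shell
# Hypothetical helper: run a command with its CPU affinity restricted to one core.
# Usage: run_pinned CPU COMMAND [ARGS...]
run_pinned() {
    cpu=$1
    shift
    # "taskset -c" takes a CPU list.  The affinity is inherited by child
    # processes, so an inferior started by a pinned GDB is pinned as well
    # unless its affinity is reset (e.g. through an exec-wrapper).
    taskset -c "$cpu" "$@"
}

# Example (not executed here): run_pinned 0 ./gdb -nx -x test.gdb -batch
```

Because the affinity is inherited, separating "GDB pinned" from "inferior pinned"
needs something like the exec-wrapper mentioned above.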

> - Might help to show the kernel version.

ODroid: Linux odroid 3.10.96+ #5 SMP PREEMPT Thu May 26 15:03:58 EDT 2016 armv7l armv7l armv7l GNU/Linux
Firefly: Linux firefly 3.10.0 #40 SMP PREEMPT Tue Jan 27 16:12:04 CST 2015 armv7l armv7l armv7l GNU/Linux

I also reproduced it on my Raspberry Pi 2, which has:
Linux alarmpi 4.4.8-2-ARCH #1 SMP Tue Apr 26 19:14:58 MDT 2016 armv7l GNU/Linux

So here's another case that reproduces the problem, but without a memory read, so
it isolates the problem a bit more.  It verifies whether the thread sees our register
write or not.

test.S:

  .global _start
  _start:
      vldr.64 d0, constante
      vldr.64 d1, constante

  break_here:
      vcmp.f64 d0, d1
      vmrs APSR_nzcv, fpscr

      # Exit code
      moveq r0, #1
      movne r0, #0

      # Exit syscall
      mov r7, #1
      svc 0

  .align 8
  constante:
  .word 0xc8b43958
  .word 0x40594676

Built with:

  $ gcc -g3 -O0 -o test test.S -nostdlib

And the gdb script test.gdb:

  file test
  b break_here
  run
  p $d0 = 4.0
  c

The test is run with:

  $ ./gdb -nx -x test.gdb -batch


The test loads the same constant in d0 and d1.  It then does a comparison between
them and exits with 1 (failure) if they are the same, 0 (success) if they are different.
The GDB script breaks at "break_here", tries to change the value of d0 to some other
constant (4.0) and lets the program continue and exit.  If our register write succeeded,
the program should exit with 0 (values are different).  If our register write failed, the
program will exit with 1 (values are still the same).

The result is that I randomly see both cases, hinting that the race is really between the
register write through ptrace and the kernel restoring the thread's vfp registers.  Again,
pinning GDB to a single core seems to hide/bypass the bug.
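
To get a feel for how often each outcome happens, the run can be repeated and the
results counted.  The following is a rough sketch, not something taken from the GDB
testsuite; the helper names classify_run and tally_runs are made up.  It relies on GDB
printing "exited normally" when the inferior exits with 0 (register write seen) and
"exited with code 01" when it exits with 1 (register write lost).

```shell
# Hypothetical helper: reduce GDB's output to the inferior's exit status text.
classify_run() {
    grep -oE 'exited (normally|with code [0-9]+)'
}

# Hypothetical helper: run a command N times and count each outcome.
# Usage: tally_runs N COMMAND [ARGS...]
tally_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # Merge stderr into stdout so the status line is caught either way.
        "$@" 2>&1 | classify_run
        i=$((i + 1))
    done | sort | uniq -c
}

# Example (not executed here): tally_runs 100 ./gdb -nx -x test.gdb -batch
```

Running the same loop with GDB pinned to one core (for instance by prefixing the
command with "taskset -c 0") would show whether the failing case really disappears.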

Simon

