On aarch64-linux (debian 12) I ran into:
...
(gdb) PASS: gdb.arch/disp-step-insn-reloc.exp: can_relocate_adr_forward: go to breakpoint 6
continue^M
Continuing.^M
^M
Breakpoint 8, can_relocate_adr_forward () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:356^M
356 asm ("set_point6:\n"^M
(gdb) FAIL: gdb.arch/disp-step-insn-reloc.exp: can_relocate_adr_forward: relocated instruction
...
Doesn't reproduce all the time, but often enough:
...
$ for n in $(seq 1 10); do ./test.sh 2>&1 | grep "# of " | sort -u; done
# of expected passes 49
# of expected passes 37
# of unexpected failures 12
# of expected passes 49
# of expected passes 37
# of unexpected failures 12
# of expected passes 49
# of expected passes 36
# of unexpected failures 13
# of expected passes 49
# of expected passes 49
# of expected passes 46
# of unexpected failures 3
# of expected passes 49
...
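The per-run summaries above can be tallied mechanically. A minimal sketch, assuming each run prints one "# of expected passes" line, optionally followed by one "# of unexpected failures" line (which is what the `sort -u` output above looks like); the name `tally_runs` is hypothetical and not part of the GDB testsuite:

```python
import re

# Matches the two DejaGnu summary lines that the grep above lets through.
SUMMARY_RE = re.compile(r"#\s+of\s+(expected passes|unexpected failures)\s+(\d+)")

def tally_runs(lines):
    """Return (total_runs, failing_runs) for a sequence of summary lines.

    Assumption: every run contributes exactly one "expected passes" line,
    so counting those lines counts the runs.
    """
    total = 0
    failing = 0
    for line in lines:
        m = SUMMARY_RE.match(line.strip())
        if not m:
            continue
        kind, count = m.group(1), int(m.group(2))
        if kind == "expected passes":
            total += 1
        elif count > 0:
            failing += 1
    return total, failing

# The ten runs quoted above.
SAMPLE = """\
# of expected passes 49
# of expected passes 37
# of unexpected failures 12
# of expected passes 49
# of expected passes 37
# of unexpected failures 12
# of expected passes 49
# of expected passes 36
# of unexpected failures 13
# of expected passes 49
# of expected passes 49
# of expected passes 46
# of unexpected failures 3
# of expected passes 49
"""

total, failing = tally_runs(SAMPLE.splitlines())
print(f"{failing} of {total} runs had unexpected failures")
```

For the ten runs quoted above this reports 4 failing runs out of 10.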
Created attachment 15413 [details] gdb.log without fail
Created attachment 15414 [details] gdb.log with fail
First relevant difference in gdb.log:
...
@@ -221,132 +221,133 @@
 (gdb) continue
 Continuing.
-Breakpoint 8, can_relocate_adr_forward () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:356
-356 asm ("set_point6:\n"
+Breakpoint 17, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
+31 }
 (gdb) PASS: gdb.arch/disp-step-insn-reloc.exp: can_relocate_adr_forward: go to breakpoint 6
 continue
 Continuing.
-Breakpoint 17, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
-31 }
-(gdb) PASS: gdb.arch/disp-step-insn-reloc.exp: can_relocate_adr_forward: relocated instruction
+Breakpoint 8, can_relocate_adr_forward () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:356
+356 asm ("set_point6:\n"
+(gdb) FAIL: gdb.arch/disp-step-insn-reloc.exp: can_relocate_adr_forward: relocated instruction
...
More minimal version:
...
$ cat gdb.in
file /home/linux/gdb/build/gdb/testsuite/outputs/gdb.arch/disp-step-insn-reloc/disp-step-insn-reloc
start
break *set_point0
break *set_point1
break *set_point2
break pass
break fail
set displaced-stepping on
continue
bt
continue
bt
continue
bt
continue
bt
continue
bt
continue
bt
$ gdb -q -batch -ex "set trace-commands on" -x gdb.in 2>&1 | tee LOG; grep Breakpoint LOG
...
Usually, we have:
...
Breakpoint 2, 0x0000aaaaaaaa08bc in can_relocate_b () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:128
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
Breakpoint 3, 0x0000aaaaaaaa090c in can_relocate_bcond_true () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:164
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
Breakpoint 4, 0x0000aaaaaaaa0958 in can_relocate_cbz () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:203
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
...
But once in a while:
...
Breakpoint 2, 0x0000aaaaaaaa08bc in can_relocate_b () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:128
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
Breakpoint 3, 0x0000aaaaaaaa090c in can_relocate_bcond_true () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:164
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
Breakpoint 4, 0x0000aaaaaaaa0958 in can_relocate_cbz () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:203
...
Looking at the backtrace, we hit pass twice without making progress:
...
+continue
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
31 }
+bt
#0 pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
#1 0x0000aaaaaaaa0928 in can_relocate_bcond_true () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:179
#2 0x0000aaaaaaaa0c9c in main () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:629
+continue
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
31 }
+bt
#0 pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
#1 0x0000aaaaaaaa0928 in can_relocate_bcond_true () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:179
#2 0x0000aaaaaaaa0c9c in main () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:629
...
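The failure signature in LOG is the same breakpoint being reported twice in a row. A small sketch for spotting that automatically in the `grep Breakpoint LOG` output; `find_repeated_hits` is a hypothetical helper, and it assumes no breakpoint is legitimately hit twice in succession (which holds for this gdb.in, where every stop in pass is preceded by a set_point stop):

```python
import re

# "Breakpoint N, ..." is a stop report; "Breakpoint N at ..." (printed when the
# breakpoint is set) deliberately does not match.
HIT_RE = re.compile(r"^Breakpoint (\d+), ")

def find_repeated_hits(lines):
    """Return (hit_index, breakpoint_number) for each stop that repeats the previous one."""
    repeats = []
    prev = None
    hit_index = 0
    for line in lines:
        m = HIT_RE.match(line)
        if not m:
            continue
        num = int(m.group(1))
        if num == prev:
            repeats.append((hit_index, num))
        prev = num
        hit_index += 1
    return repeats

SRC = "/home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c"

# The "once in a while" sequence quoted above.
BAD_LOG = [
    f"Breakpoint 2, 0x0000aaaaaaaa08bc in can_relocate_b () at {SRC}:128",
    f"Breakpoint 5, pass () at {SRC}:31",
    f"Breakpoint 3, 0x0000aaaaaaaa090c in can_relocate_bcond_true () at {SRC}:164",
    f"Breakpoint 5, pass () at {SRC}:31",
    f"Breakpoint 5, pass () at {SRC}:31",
    f"Breakpoint 4, 0x0000aaaaaaaa0958 in can_relocate_cbz () at {SRC}:203",
]

for index, num in find_repeated_hits(BAD_LOG):
    print(f"stop #{index}: breakpoint {num} reported again without progress")
```

Running this in a loop over fresh LOG files would give a quicker reproducer check than running the full test-case.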
OK, so this hits the "PC did not move. Discarding PC adjustment" case:
...
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
31 }
+x /i $pc
=> 0xaaaaaaaa0894 <pass>: nop
+continue
[displaced] displaced_step_prepare_throw: displaced-stepping 1307111.1307111.0 now
[displaced] displaced_step_prepare_throw: original insn 0xaaaaaaaa0894: 1f 20 03 d5 nop
[displaced] prepare: selected buffer at 0xaaaaaaaa0788
[displaced] prepare: saved 0xaaaaaaaa0788: 1e 00 80 d2
[displaced] aarch64_displaced_step_copy_insn: writing insn d503201f at 0xaaaaaaaa0788
[displaced] displaced_step_prepare_throw: prepared successfully thread=1307111.1307111.0, original_pc=0xaaaaaaaa0894, displaced_pc=0xaaaaaaaa0788
[displaced] displaced_step_prepare_throw: replacement insn 0xaaaaaaaa0788: 1f 20 03 d5 nop
[displaced] finish: restored 1307111.1307111.0 0xaaaaaaaa0788
[displaced] aarch64_displaced_step_fixup: PC after stepping: 0xaaaaaaaa0788 (was 0xaaaaaaaa0788).
[displaced] aarch64_displaced_step_fixup: adjusting PC by 4
[displaced] aarch64_displaced_step_fixup: PC did not move. Discarding PC adjustment.
[displaced] aarch64_displaced_step_fixup: fixup: set PC to 0xaaaaaaaa0894:0
Breakpoint 5, pass () at /home/linux/gdb/src/gdb/testsuite/gdb.arch/insn-reloc.c:31
31 }
+x /i $pc
=> 0xaaaaaaaa0894 <pass>: nop
...
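The arithmetic visible in that trace can be modelled in a few lines. This is only a sketch of the behaviour as it appears in the debug output, not the actual GDB source; the function name `fixup_pc` and its parameters are hypothetical, and it only covers the simple case of a non-branching instruction such as the nop here:

```python
INSN_SIZE = 4  # AArch64 instructions are 4 bytes wide

def fixup_pc(from_addr, to_addr, pc_after_step):
    """Model the PC chosen after a displaced step.

    from_addr:     original address of the instruction
    to_addr:       address of the scratch buffer it was copied to
    pc_after_step: PC reported when the thread stopped
    """
    pc_adjust = INSN_SIZE  # "adjusting PC by 4"
    if pc_after_step - to_addr == 0:
        # "PC did not move. Discarding PC adjustment."
        pc_adjust = 0
    return from_addr + pc_adjust

FROM = 0xaaaaaaaa0894  # pass, holds the nop
TO = 0xaaaaaaaa0788    # displaced-stepping buffer

# Failing case from the trace: the thread stops with the PC still at the buffer.
print(hex(fixup_pc(FROM, TO, TO)))
# Expected case: the nop executed, so the PC advanced past the buffer slot.
print(hex(fixup_pc(FROM, TO, TO + INSN_SIZE)))
```

In the failing trace the stop is reported with the PC still at the scratch buffer, so the model returns the original address, 0xaaaaaaaa0894, and the breakpoint at pass is reported a second time.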
This may be a regression due to:
...
commit 0c27188999bfc5bf03536bf44593c4ed8df296c3
Author: Luis Machado <luis.machado@linaro.org>
Date: Thu Jan 9 16:04:36 2020 -0300

Fix step-over-syscall.exp failure

In particular, this one:

FAIL: gdb.base/step-over-syscall.exp: fork: displaced=on: check_pc_after_cross_syscall: single step over fork final pc

When ptrace fork event reporting is enabled, GDB gets a PTRACE_EVENT_FORK event whenever the inferior executes the fork syscall. Then the logic is that GDB needs to step the inferior yet again in order to receive a predetermined SIGTRAP, but no execution takes place because the signal was already queued for delivery. That means the PC should stay the same.

I noticed the aarch64 code is currently adjusting the PC in this situation, making the inferior skip an instruction without executing it.

The following change checks if we did not execute the instruction (pc - to == 0), making proper adjustments for such case.

Regression tested on aarch64-linux-gnu on the tryserver.

gdb/ChangeLog:

2020-01-21 Luis Machado <luis.machado@linaro.org>
...
Luis, could you take a look?
Sure. I vaguely recall the situation with 0c27188999bfc5bf03536bf44593c4ed8df296c3. I wonder whether we're misidentifying the instruction we're trying to displaced-step and ending up with an incorrect outcome.
I tried running this test with RACY_ITER=100 but didn't see any FAILs. It comes out as non-racy.
Ok. This doesn't reproduce for me even if I run it 1000 times. I'll check the log file more thoroughly to see if anything rings a bell.
(In reply to Luis Machado from comment #10)
> Ok. This doesn't reproduce for me even if I run it 1000 times.

That's unfortunate.

Let me try to describe the setup, in case that helps in any way:
- lenovo ideapad 3 chromebook, SOC mt8183 (4 Cortex-A73, 4 Cortex-A53), 4GB RAM
- debian 12, installed from https://github.com/hexdump0815/imagebuilder/blob/main/systems/chromebook_kukui/readme.md
- system up-to-date
- uname -a: Linux changeme 6.1.51-stb-mt8+ #1 SMP PREEMPT Tue Sep 5 16:08:26 CEST 2023 aarch64 GNU/Linux
- ldd (Debian GLIBC 2.36-9+deb12u4) 2.36
- gcc version 12.2.0 (Debian 12.2.0-14)
- GNU assembler (GNU Binutils for Debian) 2.40
- gdb built with -O0 -g -fuse-ld=mold
- build at commit 6549a232d25 ("Fix compiling bfd/vms-lib.c for a 32-bit host.")

I tried my usual tricks of "taskset -c 0" and "stress -c 8", but found that this makes the fail less likely.

I haven't been able to reproduce on either pinebook pro (running manjaro) or m1 macbook (running fedora asahi remix).
Ok, that's useful information. Let me try to get the kernel + tools versions right first, and then I'll make another attempt at reproducing things. It is a bit suspicious that we're having an issue with a nop instruction. But let me play with it some and see if I find anything. As usual, the case the patch was trying to address was a bit odd.
(In reply to Tom de Vries from comment #7)
> This may be a regression due to:
> ...
> commit 0c27188999bfc5bf03536bf44593c4ed8df296c3
> Author: Luis Machado <luis.machado@linaro.org>
> Date: Thu Jan 9 16:04:36 2020 -0300
>
> Fix step-over-syscall.exp failure

I've confirmed this: I ran a loop with 500 iterations, and the test didn't fail before the commit but does fail after it.
Thanks for confirming, Tom.
Just a quick update. I've tried this on different hardware with a newer Ubuntu (22.04). I couldn't get it to reproduce yet. I'm trying a few more things.
I've also run into this with gdb.dwarf2/dw2-lines.exp.