Consider test-case gdb.base/unwind-on-each-insn.exp. On x86_64-linux, it passes. Now, let's revert the fix in commit 49d7cd733a7 ("Change calculation of frame_id by amd64 epilogue unwinder"): ... $ git show 49d7cd733a7f1b87aa1d40318b3d7c2b65aca5ac gdb/amd64-tdep.c > tmp.patch $ patch -p1 -R < tmp.patch patching file gdb/amd64-tdep.c Hunk #1 succeeded at 2945 (offset 8 lines). Hunk #2 succeeded at 2994 (offset 8 lines). ... Now we have: ... FAIL: gdb.base/unwind-on-each-insn.exp: instruction 5: $fba_value == $foo_fba FAIL: gdb.base/unwind-on-each-insn.exp: instruction 5: check frame-id matches ... OK, so the test-case acts as regression test for this commit, that seems as intended. For context, let's look at the foo insns: ... 00000000004004bc <foo>: 4004bc: 55 push %rbp 4004bd: 48 89 e5 mov %rsp,%rbp 4004c0: 90 nop 4004c1: 5d pop %rbp 4004c2: c3 ret ... This is a minimal version of the failure: ... $ gdb -q -batch outputs/gdb.base/unwind-on-each-insn/unwind-on-each-insn \ -ex "b *foo" \ -ex run \ -ex "si 3" \ -ex "info frame" \ -ex si \ -ex "info frame" Breakpoint 1 at 0x4004bc Breakpoint 1, 0x00000000004004bc in foo () 0x00000000004004c1 in foo () Stack level 0, frame at 0x7fffffffdae0: rip = 0x4004c1 in foo; saved rip = 0x4004b5 called by frame at 0x7fffffffdaf0 Arglist at 0x7fffffffdad0, args: Locals at 0x7fffffffdad0, Previous frame's sp is 0x7fffffffdae0 Saved registers: rbp at 0x7fffffffdad0, rip at 0x7fffffffdad8 0x00000000004004c2 in foo () Stack level 0, frame at 0x7fffffffdad8: rip = 0x4004c2 in foo; saved rip = 0x4004b5 called by frame at 0x7fffffffdaf0 Arglist at 0x7fffffffdae0, args: Locals at 0x7fffffffdae0, Previous frame's sp is 0x7fffffffdae0 Saved registers: rip at 0x7fffffffdad8 ... The problem is that the "frame at" lists a different address at the last insn. However, we also have this information available: ... 00000068 000000000000001c 0000003c FDE cie=00000030 pc=00000000004004bc..00000000004004c3 LOC CFA rbp ra 00000000004004bc rsp+8 u c-8 00000000004004bd rsp+16 c-16 c-8 00000000004004c0 rbp+16 c-16 c-8 00000000004004c2 rsp+8 c-16 c-8 ... Let's see if it is correct. At 0x00000000004004c1 we have: ... (gdb) p $rbp+16 $1 = (void *) 0x7fffffffdae0 ... At 0x00000000004004c2 we have: ... (gdb) p $rsp+8 $2 = (void *) 0x7fffffffdae0 ... Well, that looks correct. So, why is this info not used? The amd64 epilogue unwinder (the one containing the bug), has the highest priority: ... /* Hook the function epilogue frame unwinder. This unwinder is appended to the list first, so that it supercedes the other unwinders in function epilogues. */ frame_unwind_prepend_unwinder (gdbarch, &amd64_epilogue_frame_unwind); ... Let's try to see what happens if we disable it: ... 0x00000000004004c2 in foo () Stack level 0, frame at 0x7fffffffdbcc: ... Hmm, that's also no good, it seems that the i386_epilogue_frame_unwind kicked in. I don't under stand why this would be active for x86_64 code, maybe that is a bug in itself. Anyway, why does this one have this high priority? Well, the code says: ... /* Hook the function epilogue frame unwinder. This unwinder is appended to the list first, so that it supercedes the DWARF unwinder in function epilogues (where the DWARF unwinder currently fails). */ frame_unwind_append_unwinder (gdbarch, &i386_epilogue_frame_unwind); ... Hmm, so the "currently" refers to July 2009. In the submission discussion I find this ( https://sourceware.org/pipermail/gdb-patches/2009-July/066792.html ): ... > I also think you should add a comment about the specific ordering of > this unwinder. It has to come before the dwarf2 unwinder because GCC > doesn't provide proper CFI for the epilogue, right? Right - I would like to have a way to suppress this unwinder, maybe based on the producer string like other recognized dwarf2-frame quirks, but we can worry about that later. I hope it will be unnecessary with GCC 4.5. ... So, if I understand correctly, the priority decision was done based on gcc 4.4. So maybe it's time to revisit this. I suppose the producer test is doable, but we'll need to decide about trusted vs untrusted versions, which atm I have no clue about. A minor benefit of preferring dwarf is that behaviour will be more similar to targets that do not have epilogue unwinders. Anyway, disabling i386_epilogue_frame_unwind gives me the desired: ... 0x00000000004004c2 in foo () Stack level 0, frame at 0x7fffffffdad0: ... and indeed, the test-case passes again. At the very least, I would like a maintenance command that tells gdb to prefer dwarf over the epilogue unwinder, in order to be able to test the actual dwarf. That could also be used to proactively spot bugs in the epilogue unwinder.
FTR, there are just 5 targets using epilogue unwinders: ... gdb/amd64-tdep.c: frame_unwind_prepend_unwinder (gdbarch, &amd64_epilogue_frame_unwind); gdb/arm-tdep.c: frame_unwind_append_unwinder (gdbarch, &arm_epilogue_frame_unwind); gdb/i386-tdep.c: frame_unwind_append_unwinder (gdbarch, &i386_epilogue_frame_unwind); gdb/nds32-tdep.c: frame_unwind_append_unwinder (gdbarch, &nds32_epilogue_frame_unwind); gdb/rs6000-tdep.c: frame_unwind_append_unwinder (gdbarch, &rs6000_epilogue_frame_unwind); gdb/rs6000-tdep.c: frame_unwind_append_unwinder (gdbarch, &rs6000_epilogue_frame_unwind); ... The arm one doesn't have priority over dwarf. Same for nds32 and rs6000. So, this particularity seems limited to amd64/i386.
Here ( https://gcc.gnu.org/gcc-4.5/changes.html ) I read: ... GCC now generates unwind info also for epilogues. ... So, maybe this was fixed in gcc 4.5 ?
I did a build and test run with i386_epilogue_frame_unwind and amd64_epilogue_frame_unwind disabled, with gcc 7.5.0. No regressions.
(In reply to Tom de Vries from comment #2) > Here ( https://gcc.gnu.org/gcc-4.5/changes.html ) I read: > ... > GCC now generates unwind info also for epilogues. > ... > So, maybe this was fixed in gcc 4.5 ? OK, I think this is referring to: https://gcc.gnu.org/pipermail/gcc-patches/2009-May/261377.html which was committed as cd9c1ca866b6aa5041a352e0ed07ae5f91e235e5 . First tag to contain the commit: 4.5.0.
(In reply to Tom de Vries from comment #4) > (In reply to Tom de Vries from comment #2) > > Here ( https://gcc.gnu.org/gcc-4.5/changes.html ) I read: > > ... > > GCC now generates unwind info also for epilogues. > > ... > > So, maybe this was fixed in gcc 4.5 ? > > OK, I think this is referring to: > https://gcc.gnu.org/pipermail/gcc-patches/2009-May/261377.html which was > committed as cd9c1ca866b6aa5041a352e0ed07ae5f91e235e5 . > > First tag to contain the commit: 4.5.0. And, we already know this: ... if (gcc_4_minor >= 5) cust->set_epilogue_unwind_valid (true); ... So we already scan the producer for https://sourceware.org/pipermail/gdb-patches/2011-June/083429.html , and crucially: ... I find questionable this detection uses DWARF DW_AT_producer. Therefore on binaries with .eh_frame but no (DWARF) debug info the detection fails and epilogue unwinders will get into affect. ... So, that's what's happening. We have dwarf apart from .eh_frame, and we don't use the .eh_frame info because we can't check that the producer is 4.5 or older. So, maybe we should just flip the switch on the default assumption.
(In reply to Tom de Vries from comment #5) > So, maybe we should just flip the switch on the default assumption. Like so: ... diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c index 76e843ecc35..86e12b4304a 100644 --- a/gdb/amd64-tdep.c +++ b/gdb/amd64-tdep.c @@ -2905,7 +2905,7 @@ amd64_stack_frame_destroyed_p struct compunit_symtab *cust; cust = find_pc_compunit_symtab (pc); - if (cust != NULL && cust->epilogue_unwind_valid ()) + if (cust == NULL || cust->epilogue_unwind_valid ()) return 0; if (target_read_memory (pc, &insn, 1)) diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c index 580664d2ce5..46ad2913788 100644 --- a/gdb/i386-tdep.c +++ b/gdb/i386-tdep.c @@ -2222,7 +2222,7 @@ i386_stack_frame_destroyed_p struct compunit_symtab *cust; cust = find_pc_compunit_symtab (pc); - if (cust != NULL && cust->epilogue_unwind_valid ()) + if (cust == NULL || cust->epilogue_unwind_valid ()) return 0; if (target_read_memory (pc, &insn, 1)) ...
(In reply to Tom de Vries from comment #6) > (In reply to Tom de Vries from comment #5) > > So, maybe we should just flip the switch on the default assumption. > > Like so: > ... > diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c > index 76e843ecc35..86e12b4304a 100644 > --- a/gdb/amd64-tdep.c > +++ b/gdb/amd64-tdep.c > @@ -2905,7 +2905,7 @@ amd64_stack_frame_destroyed_p > struct compunit_symtab *cust; > > cust = find_pc_compunit_symtab (pc); > - if (cust != NULL && cust->epilogue_unwind_valid ()) > + if (cust == NULL || cust->epilogue_unwind_valid ()) > return 0; > > if (target_read_memory (pc, &insn, 1)) > diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c > index 580664d2ce5..46ad2913788 100644 > --- a/gdb/i386-tdep.c > +++ b/gdb/i386-tdep.c > @@ -2222,7 +2222,7 @@ i386_stack_frame_destroyed_p > struct compunit_symtab *cust; > > cust = find_pc_compunit_symtab (pc); > - if (cust != NULL && cust->epilogue_unwind_valid ()) > + if (cust == NULL || cust->epilogue_unwind_valid ()) > return 0; > > if (target_read_memory (pc, &insn, 1)) > ... And likewise: ... diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c index 44b54f77de9..44e85ebcfe9 100644 --- a/gdb/dwarf2/read.c +++ b/gdb/dwarf2/read.c @@ -8484,7 +8484,15 @@ process_full_comp_unit (dwarf2_cu *cu, enum language pretend_la nguage) if (cu->has_loclist && gcc_4_minor >= 5) cust->set_locations_valid (true); - if (gcc_4_minor >= 5) + if (cu->producer == nullptr) + /* In absence of producer information, optimistically assume that we're + not dealing with gcc < 4.5.0. */ + cust->set_epilogue_unwind_valid (true); + if (!producer_is_gcc (cu->producer, nullptr, nullptr)) + /* Not gcc. */ + cust->set_epilogue_unwind_valid (true); + else if (gcc_4_minor >= 5) + /* gcc >= 4.5.0. */ cust->set_epilogue_unwind_valid (true); cust->set_call_site_htab (cu->call_site_htab); ...
Submitted rfc: https://sourceware.org/pipermail/gdb-patches/2023-January/196035.html
Fixed by commits: - 8908d9c45cd ("[gdb/symtab] Trust epilogue unwind info for unknown producer (-g0 case)") - 868014341a7 ("[gdb/symtab] Trust epilogue unwind info for unknown or non-gcc producer")