[RFA] use frame IDs to detect function calls while stepping

Joel Brobecker brobecker@gnat.com
Tue Mar 2 06:16:00 GMT 2004


> Here is a description of the regression which appears if we don't
> include the following test:
> 
> +      if (IN_SOLIB_CALL_TRAMPOLINE (stop_pc, ecs->stop_func_name))
> +        {
> +          /* We landed in a shared library call trampoline, so it
> +             is a subroutine call.  */
> +          handle_step_into_function (ecs);
> +          return;
> +        }
> 
> The problem occurs with ending-run:
> 
>         % gdb ending-run
>         (gdb) b ending-run.c:33
>         Breakpoint 1 at 0x10828: file gdb.base/ending-run.c, line 33.
>         (gdb) run
>         Starting program: /[...]/gdb.base/ending-run 
>         -1 2 7 14 23 34 47 62 79  Goodbye!
>         
>         Breakpoint 1, main () at gdb.base/ending-run.c:33
>         33      }
>         (gdb) n
>         0x000105ec in _start ()
>         (gdb) n
>         Single stepping until exit from function _start, 
>         which has no line number information.
>         0x00020950 in _PROCEDURE_LINKAGE_TABLE_ ()
> 
> The expected behavior for the last "next" command is for GDB
> to run until the inferior exits:
> 
>         (gdb) n
>         Single stepping until exit from function _start, 
>         which has no line number information.
>         
>         Program exited normally.
> 
> Unfortunately, here is what happens. At 0x000105ec, before we do
> our second "next" command, we are about to execute the following
> code:
> 
>         0x000105ec <_start+100>:        call  0x20950 <exit>
>         0x000105f0 <_start+104>:        nop 
> 
> After two iterations (one for the call insn, and one for the delay
> slot), GDB lands at the beginning of function "exit" at 0x00020950,
> which is:
> 
>         0x00020950 <exit+0>:    sethi  %hi(0xf000), %g1
>         0x00020954 <exit+4>:    b,a   0x20914 <_PROCEDURE_LINKAGE_TABLE_>
>         0x00020958 <exit+8>:    nop 
> 
> So at this point, the register window has not been rotated.
> I don't know if this is the cause for this problem, but at this
> point GDB is unable to unwind the call stack:
> 
>         (gdb) bt
>         #0  0x00020950 in _PROCEDURE_LINKAGE_TABLE_ ()
> 
> (And gets the wrong procedure name as well, but that's a separate
> issue - although "x /i" does report what I believe is the correct
> name, strange!).
> 
> I am looking into the sparc unwinder code right now, to try to
> understand a bit better the source of the problem.

I think I found the source of the glitch. I may have a fix for it,
but my little finger is telling me that it might be a bit too
extreme... Maybe MarkK has some comments about this?

What happens is that, at the point when we reach function "exit",
the FP register is null:

        (gdb) p /x $fp
        $2 = 0x0

The sparc unwinder in sparc_frame_cache() detects this, thinks there is
something wrong, and aborts early. So, we never unwind the "_start"
frame, and hence the following frame ID check doesn't notice the
function call, as it should have in this case:

+      if (frame_id_eq (get_frame_id (get_prev_frame (get_current_frame ())),
+                       step_frame_id))
+        {
+          /* It's a subroutine call.  */
+          handle_step_into_function (ecs);
+          return;
+        }
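
To make the failure mode concrete, here is a small standalone model of
that check. It is only a sketch, not GDB code: toy_frame, toy_frame_id,
toy_unwind_prev_id and toy_stepped_into_call are hypothetical names, and
the model assumes the unwinder produces no caller frame at all when it
sees a null %fp, which simplifies what sparc_frame_cache() really does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a frame ID: a stack address plus a code
   address.  An all-zero ID means "no frame could be unwound".  */
struct toy_frame_id
{
  unsigned long stack_addr;
  unsigned long code_addr;
};

/* Hypothetical frame record for the toy model.  */
struct toy_frame
{
  unsigned long fp;               /* %fp as seen while unwinding this frame */
  struct toy_frame_id id;         /* ID of this frame */
  const struct toy_frame *caller; /* frame that called us, or NULL */
};

/* Compare two IDs; a null ID never matches anything.  */
static bool
toy_frame_id_eq (struct toy_frame_id a, struct toy_frame_id b)
{
  if (a.stack_addr == 0 && a.code_addr == 0)
    return false;
  return a.stack_addr == b.stack_addr && a.code_addr == b.code_addr;
}

/* Return the caller's ID, or a null ID when the toy unwinder gives up.
   BAIL_ON_NULL_FP models the early return taken when %fp is zero.  */
static struct toy_frame_id
toy_unwind_prev_id (const struct toy_frame *frame, bool bail_on_null_fp)
{
  struct toy_frame_id none = { 0, 0 };

  if (frame->caller == NULL)
    return none;
  if (bail_on_null_fp && frame->fp == 0)
    return none;
  return frame->caller->id;
}

/* The stepping check: we stepped into a subroutine if the caller of
   the current frame is the frame we were stepping in.  */
static bool
toy_stepped_into_call (const struct toy_frame *current,
                       struct toy_frame_id step_frame_id,
                       bool bail_on_null_fp)
{
  return toy_frame_id_eq (toy_unwind_prev_id (current, bail_on_null_fp),
                          step_frame_id);
}
```

With a current frame whose %fp is zero and whose caller is the frame
being stepped, toy_stepped_into_call() returns false when
bail_on_null_fp is set and true when it is cleared. That matches the
symptom described here: the call goes unnoticed only because the caller
frame is never produced.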

With this example in mind, it seemed to me that the assumption that the
%fp register is never null is unfortunately incorrect. Given that the
rest of the code in sparc_frame_cache() wasn't using the value of that
register, I commented out the check, and retried.

<<
@@ -620,8 +620,8 @@ sparc_frame_cache (struct frame_info *ne
      frame.  */
 
   cache->base = frame_unwind_register_unsigned (next_frame, SPARC_FP_REGNUM);
-  if (cache->base == 0)
-    return cache;
+  // if (cache->base == 0)
+  //   return cache;
 
   cache->pc = frame_func_unwind (next_frame);
   if (cache->pc != 0)
>>

That causes the problem above to disappear. I even went as far as
running the testsuite: no change, apart from fixing the regression
I observed.

sparc_frame_cache() seems well equipped to handle a null %fp register.
It doesn't use its value when scanning the prologue, and then knows
to use %sp in its place as the frame base. So the frame should be
correctly unwound.
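
As a rough illustration of that fallback, the choice of frame base can
be modelled as below. This is a sketch under assumptions, not the actual
sparc_frame_cache() code: toy_regs and toy_frame_base are hypothetical
names, and the "frameless" flag stands for whatever the prologue scan
concludes when the register window has not been rotated yet.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two registers of interest.  */
struct toy_regs
{
  unsigned long sp;   /* %sp (%o6) */
  unsigned long fp;   /* %fp (%i6) */
};

/* Pick the frame base.  If the function has not rotated the register
   window yet (FRAMELESS), %fp still belongs to the caller and may even
   be zero, so %sp is used instead.  Otherwise %fp is the base.  */
static unsigned long
toy_frame_base (struct toy_regs regs, bool frameless)
{
  return frameless ? regs.sp : regs.fp;
}
```

In this model a null %fp is harmless as long as the frame is treated as
frameless, since the base then comes from %sp alone.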

Comments?
-- 
Joel


