This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [rfa/mips] Stop backtraces when we've lost the PC


On Sun, Mar 07, 2004 at 10:23:24PM -0500, Daniel Jacobowitz wrote:
> On Sun, Mar 07, 2004 at 07:56:18PM -0500, Andrew Cagney wrote:
> > >Here's an updated version of a little hack I've been using since GDB 6.0.
> > >If we are in a nested normal frame, i.e. something whose next frame is a
> > >function that it called in the normal way, and we didn't find a saved PC,
> > >we're going to be stuck in a loop.  We might have been able to figure out
> > >the frame size, but not where the return address was stored; as the comment
> > >says, this happens in glibc's clone function.  Of course the problem there
> > >is that it _doesn't_ save $ra in the normal fashion; it won't return.
> > >
> > >Without this patch schedlock.exp falls apart, because backtraces continue
> > >forever printing "clone()" on every line.
> > >
> > >OK?
> > 
> > No!
> > 
> > >+  if (frame_relative_level (next_frame) >= 0
> > >+      && get_frame_type (next_frame) == NORMAL_FRAME
> > >+      && !trad_frame_addr_p (info->saved_regs, NUM_REGS + PC_REGNUM))
> > 
> > The whole point of this recursive code is that you don't need to add 
> > hacks that look down the stack at the type of more inner frames :-/
> > 
> > Can you post the relevant assembler so that we can determine why the 
> > prologue analyzer is failing.
> 
> This is clone.  It can't be unwound past - there is no stack behind it
> (we're in the child, here).
> 
> 00113730 <__clone>:
>   113730:       3c1c000c        lui     gp,0xc
>   113734:       279ca5c0        addiu   gp,gp,-23104
>   113738:       0399e021        addu    gp,gp,t9
>   11373c:       27bdffe8        addiu   sp,sp,-24
>   113740:       afbc0014        sw      gp,20(sp)
>   113744:       10800010        beqz    a0,113788 <__clone+0x58>
>   113748:       24020016        li      v0,22
>   11374c:       10a0000e        beqz    a1,113788 <__clone+0x58>
>   113750:       00000000        nop
>   113754:       24a5ffe0        addiu   a1,a1,-32
>   113758:       aca40000        sw      a0,0(a1)
>   11375c:       aca70004        sw      a3,4(a1)
>   113760:       00c02021        move    a0,a2
>   113764:       24021018        li      v0,4120
>   113768:       0000000c        syscall
>   11376c:       14e00006        bnez    a3,113788 <__clone+0x58>
>   113770:       00000000        nop
>   113774:       10400008        beqz    v0,113798 <__thread_start>
>   113778:       00000000        nop
>   11377c:       03e00008        jr      ra
>   113780:       27bd0018        addiu   sp,sp,24
>   113784:       00000000        nop
>   113788:       8f998290        lw      t9,-32112(gp)
>   11378c:       27bd0018        addiu   sp,sp,24
>   113790:       03200008        jr      t9
>   113794:       00000000        nop
> 00113798 <__thread_start>:
>   113798:       afbc0014        sw      gp,20(sp)
>   11379c:       8fb90000        lw      t9,0(sp)
>   1137a0:       8fa40004        lw      a0,4(sp)
>   1137a4:       0320f809        jalr    t9
>   1137a8:       00000000        nop
>   1137ac:       8fbc0014        lw      gp,20(sp)
>   1137b0:       00402021        move    a0,v0
>   1137b4:       8f9991f4        lw      t9,-28172(gp)
>   1137b8:       00000000        nop
>   1137bc:       0320f809        jalr    t9
>   1137c0:       00000000        nop
> 
> The last jalr is exit.
> 
> And here's its PDR (little endian):
>  0e0c0 30371100 00000000 00000000 00000000  07..............
>  0e0d0 00000000 10000000 1d000000 1d000000  ................
>  0e0e0 98371100 00000000 00000000 00000000  .7..............
>  0e0f0 00000000 00000000 00000000 00000000  ................
> 
> Frame register is sp, stack size is 16, no registers saved for clone. 
> Then stack size 0, no registers saved, frame register is zero for
> __thread_start, which is where we are.  I don't know why that didn't
> terminate the stack trace right away, I'll bring the board back up
> tomorrow to find out.

OK, two things.


The first problem is that __thread_start is a local symbol, and the
copy of the library I'm using on my target is stripped of local symbols
but has a .pdr section.  We pass the PC to find_proc_desc, which is
fine, but non_heuristic_proc_desc does:
  find_pc_partial_function (pc, NULL, &startaddr, NULL);
and looks for a PDR associated with startaddr.  Oopsie.

The problem is, GAS's .pdr information does not include a size or an
end address.  It's slightly risky to search for the latest .pdr entry
preceding PC, because that might be for a completely different
function, but not very risky - .frame will generally be used
consistently if it was used at all.

I believe the attached patch is safe in a superset of circumstances to
where the previous code was safe.  It causes us to select the correct
.pdr entry in this circumstance.  What do you think of it?

[BTW: I verified that without a .pdr section we notice that we can't
find a prologue for clone, and give up unwinding semi-gracefully.  I
also verified that with local symbols, we find the correct .pdr.]


The other thing:

I hypothesize that if two consecutive frames, regardless of their type,
claim to save the PC register at the same location, then unwinding is
hosed.  If we changed the interface for gdbarch_unwind_pc to return
information on where, if anywhere, the PC was unwound from, we could
check this.  This handles sigtramp frames (which save the PC at a
particular memory buffer), dummy frames (which would say not_lval, i.e.
the PC is not saved in a simple fashion, i.e. skip this test), typical
normal frames (the sentinel frame says it's saved in a register, the
outermost frame says it's saved in the return address register or in a
local stack slot, normal frames which called other normal frames will
have it in a local stack slot.

Obviously for some architecture with a reconstructed not_lval PC this
wouldn't work, but I think it could still be useful as a safety check
on the unwinder.  Not a safe termination - this sort of hack doesn't
have any business being relied on - but at least a call to error() to
prevent spiralling off into infinity.

Does my hypothesis make sense?  Shall I try to implement it?

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

2004-03-08  Daniel Jacobowitz  <drow@mvista.com>

	* mips-tdep.c (non_heuristic_proc_desc): Search using the specified
	PC rather than the partial function start address.  Use the start
	address to sanity check the found PDR.

Index: mips-tdep.c
===================================================================
RCS file: /big/fsf/rsync/src-cvs/src/gdb/mips-tdep.c,v
retrieving revision 1.283
diff -u -p -r1.283 mips-tdep.c
--- mips-tdep.c	17 Feb 2004 15:21:21 -0000	1.283
+++ mips-tdep.c	8 Mar 2004 15:33:25 -0000
@@ -2289,10 +2289,6 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
   if (DEPRECATED_PC_IN_CALL_DUMMY (pc, 0, 0))
     return NULL;
 
-  find_pc_partial_function (pc, NULL, &startaddr, NULL);
-  if (addrptr)
-    *addrptr = startaddr;
-
   priv = NULL;
 
   sec = find_pc_section (pc);
@@ -2352,38 +2348,60 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
 	{
 	  int low, mid, high;
 	  char *ptr;
+	  CORE_ADDR pdr_pc;
 
 	  low = 0;
 	  high = priv->size / 32;
 
+	  /* First, find the last .pdr entry starting at or before PC.  */
 	  do
 	    {
-	      CORE_ADDR pdr_pc;
-
 	      mid = (low + high) / 2;
 
 	      ptr = priv->contents + mid * 32;
 	      pdr_pc = bfd_get_signed_32 (sec->objfile->obfd, ptr);
 	      pdr_pc += ANOFFSET (sec->objfile->section_offsets,
 				  SECT_OFF_TEXT (sec->objfile));
-	      if (pdr_pc == startaddr)
+	      if (pdr_pc == pc)
 		break;
-	      if (pdr_pc > startaddr)
+	      if (pdr_pc > pc)
 		high = mid;
 	      else
 		low = mid + 1;
 	    }
 	  while (low != high);
 
-	  if (low != high)
+	  /* If we got an exact match on PC, low != high.  Otherwise,
+	     both low and high point one past the PDR of interest.  */
+	  if (low == high && low > 0)
+	    {
+	      ptr = priv->contents + (low - 1) * 32;
+	      pdr_pc = bfd_get_signed_32 (sec->objfile->obfd, ptr);
+	      pdr_pc += ANOFFSET (sec->objfile->section_offsets,
+				  SECT_OFF_TEXT (sec->objfile));
+	    }
+
+	  /* We don't have a range, so we have no way to know for sure
+	     whether we're in the correct PDR or a PDR for a preceding
+	     function and the current function was a stripped local
+	     symbol.  But if the PDR's PC is at least as great as the
+	     best guess from the symbol table, assume that it does cover
+	     the right area; if a .pdr section is present at all then
+	     nearly every function will have an entry.  */
+
+	  find_pc_partial_function (pc, NULL, &startaddr, NULL);
+	  if (startaddr <= pdr_pc)
 	    {
 	      struct symbol *sym = find_pc_function (pc);
 
+	      if (addrptr)
+		*addrptr = pdr_pc;
+
 	      /* Fill in what we need of the proc_desc.  */
 	      proc_desc = (mips_extra_func_info_t)
 		obstack_alloc (&sec->objfile->objfile_obstack,
 			       sizeof (struct mips_extra_func_info));
-	      PROC_LOW_ADDR (proc_desc) = startaddr;
+	      PROC_LOW_ADDR (proc_desc) = pdr_pc;
 
 	      /* Only used for dummy frames.  */
 	      PROC_HIGH_ADDR (proc_desc) = 0;
@@ -2412,6 +2430,10 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
 
   if (b == NULL)
     return NULL;
+
+  find_pc_partial_function (pc, NULL, &startaddr, NULL);
+  if (addrptr)
+    *addrptr = startaddr;
 
   if (startaddr > BLOCK_START (b))
     {


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]