This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: [RFA] rs6000-tdep.c: Improve prologue handling with code motion.


Based on experience, I am convinced that correct prologue analysis via the
generated machine code is impossible with highly optimizing compilers.

You will always find examples where you either scan too far or where you stop
before the real end of the prologue.
If you want to get it right, you definitely need help from the compiler,
using live ranges or similar methods.

The proposed patch doesn't try to solve the problem of correct prologue
analysis, it is only trying to get proper backtraces from highly optimized
functions, without any help from the compiler debug info (as is the case with
native system libraries).
For backtraces, we only need the correct recognition of the current state of
the frame setup and the saved caller pc.
I was also trying to minimize the risk of overscan and performance impacts.

The proposed patch will not be able to correctly determine the end of
the prologue in your example (with or without the find_pc_line test),
as the prologue sequence has already set up the required information
(new frame and saved pc) in skip_two+8, and you will not reach the
find_pc_line test anyway.


I don't know if we should include the find_pc_line test or not.

Would you be willing to run the GDB testsuite with your version of the
compiler, using -O2 ?
It will almost certainly cause many failures, but it would be interesting,
if you get better results with my original patch, or still even better
results when you leave out the find_pc_line test.

With stock gcc-2.95.2 and -O2 the results were disappointing.
gcc-2.95.2 doesn't seem to move instructions before the `setup new frame' or
`save pc' instructions, so the testsuite gets all the same results, with or
without my patch, with or without the find_pc_line test.
On the other hand we have prove that the patch doesn't introduce any
testsuite regressions, even with -O2.

> On Nov 6,  8:15am, Peter.Schauer wrote:
> 
> > It does work without the section.
> > I just added it to avoid unnecessary additional slow remote target reads,
> > which had always been a concern in the past.
> > 
> > I wanted to make sure that my patches don't add extra overhead in the
> > usual case (have line number info), and that we don't try to second
> > guess the information from the compiler. In addition it minimizes the
> > impact of the change (the old code did unconditionally break out of the
> > loop as well if a non-prologue insn was encountered).
> > 
> > I could leave the section out, but I'd appreciate if you could provide
> > an example to prove that it should be left out.
> 
> Consider the following program:
> 
> --- rev.c ---
> #include <stdio.h>
> #include <stdlib.h>
> 
> struct s
>   {
>     char *str;
>     struct s *next;
>   };
> 
> struct s n3 = { "c", 0 };
> struct s n2 = { "b", &n3 };
> struct s n1 = { "a", &n2 };
> struct s *nodes = &n1;
> 
> void print_nodes (struct s *s);
> 
> int
> main (int argc, char **argv)
> {
>   print_nodes (nodes);
>   exit (0);
> }
> 
> void
> print_nodes (struct s *s)
> {
>   if (s)
>     {
>       print_nodes (s->next);
>       printf ("%s\n", s->str);
>     }
> }
> 
> void
> p (double *d1, double *d2)
> {
>   *d1 = 2.0;
>   *d2 = 3.0;
> }
> 
> void q (double d)
> {
> }
> 
> void r (void)
> {
> }
> 
> void
> skip_two (struct s *s)
> {
>   print_nodes (s->next->next);
>   {
>     double d1, d2, d3, d4;
>     p (&d1,&d2);
>     d3 = d1 + d2;
>     q (d3);
>     d4 = d3 + d1 + d2;
>     q (d3);
>   }
> }
> --- end rev.c ---
> 
> I compiled this with a version of gcc built from an internal Red Hat
> repository.  I think you'll be able to get similar results if you
> use a recent development snapshot from sourceware.  I used
> 
>     gcc -O2 -g rev.c -o rev
> 
> to do the build.
> 
> Here's the beginning of print_nodes():
> 
>     (gdb) x/10i print_nodes
>     0x100004e0 <print_nodes>:       stwu    r1,-16(r1)
>     0x100004e4 <print_nodes+4>:     mflr    r0
>     0x100004e8 <print_nodes+8>:     stw     r31,12(r1)
>     0x100004ec <print_nodes+12>:    mr.     r31,r3
>     0x100004f0 <print_nodes+16>:    stw     r0,20(r1)
>     0x100004f4 <print_nodes+20>:    beq     0x10000514 <print_nodes+52>
>     0x100004f8 <print_nodes+24>:    lwz     r3,4(r31)
>     0x100004fc <print_nodes+28>:    bl      0x100004e0 <print_nodes>
>     0x10000500 <print_nodes+32>:    lis     r3,4096
>     0x10000504 <print_nodes+36>:    lwz     r4,0(r31)
> 
> Note that the compiler has moved the test (the "mr." instruction) into
> the prologue.  GDB is aware of this too.
> 
>     (gdb) info line *print_nodes
>     Line 26 of "rev.c" starts at address 0x100004e0 <print_nodes>
>        and ends at 0x100004ec <print_nodes+12>.
>     (gdb) info line *print_nodes+12
>     Line 27 of "rev.c" starts at address 0x100004ec <print_nodes+12>
>        and ends at 0x100004f0 <print_nodes+16>.
>     (gdb) info line *print_nodes+16
>     Line 26 of "rev.c" starts at address 0x100004f0 <print_nodes+16>
>        and ends at 0x100004f4 <print_nodes+20>.
>     (gdb) info line *print_nodes+20
>     Line 27 of "rev.c" starts at address 0x100004f4 <print_nodes+20>
>        and ends at 0x100004f8 <print_nodes+24>.
> 
> This was my original example, but I remembered that we have some
> special purpose code in the prologue scanner in rs6000-tdep.c which
> accounts for this case.  I needed a more convincing example, so I
> wrote skip_two().  The idea is that if we have to do a double
> dereference early on in the function body, the compiler may choose to
> move one or more of the dereference instructions into the prologue. 
> The additional junk is to make sure that we have a reasonably sized
> prologue to move instructions into.
> 
> Here is the beginning of skip_two():
>     (gdb) x/10i skip_two
>     0x10000554 <skip_two>:  stwu    r1,-32(r1)
>     0x10000558 <skip_two+4>:        mflr    r0
>     0x1000055c <skip_two+8>:        stw     r0,36(r1)
>     0x10000560 <skip_two+12>:       lwz     r9,4(r3)
>     0x10000564 <skip_two+16>:       stfd    f31,24(r1)
>     0x10000568 <skip_two+20>:       lwz     r3,4(r9)
>     0x1000056c <skip_two+24>:       bl      0x100004e0 <print_nodes>
>     0x10000570 <skip_two+28>:       addi    r3,r1,8
>     0x10000574 <skip_two+32>:       addi    r4,r1,16
>     0x10000578 <skip_two+36>:       bl      0x10000528 <p>
> 
> Note that the lwz instruction is a part of the first dereference and
> that it occurs before the prologue is complete.  (The stfd instruction
> is the last instruction of the prologue.)
> 
> Here, again, is what GDB knows about the lines associated with these
> instructions:
> 
>     (gdb) info line *skip_two
>     Line 51 of "rev.c" starts at address 0x10000554 <skip_two>
>        and ends at 0x10000560 <skip_two+12>.
>     (gdb) info line *skip_two+12
>     Line 52 of "rev.c" starts at address 0x10000560 <skip_two+12>
>        and ends at 0x10000564 <skip_two+16>.
>     (gdb) info line *skip_two+16
>     Line 51 of "rev.c" starts at address 0x10000564 <skip_two+16>
>        and ends at 0x10000568 <skip_two+20>.
>     (gdb) info line *skip_two+20
>     Line 52 of "rev.c" starts at address 0x10000568 <skip_two+20>
>        and ends at 0x10000570 <skip_two+28>.
> 
> It is my contention that the following section from your proposed
> patch...
> 
> ! 	  if (num_skip_non_prologue_insns == 0 && lim_pc == 0)
> ! 	    {
> ! 	      /* Stop scan if we are looking for the end of the prologue
> ! 		 and we have line numbers for the function
> ! 		 The current result is good enough, and the compiler will
> ! 		 hopefully help us to get better results via the line number
> ! 		 info.  */
> ! 	      struct symtab_and_line sal;
> ! 	      sal = find_pc_line (pc, 0);
> ! 	      if (sal.line != 0)
> ! 		break;
> ! 	    }
> 
> ...would cause the prologue scanner to stop too soon on skip_two().  I.e,
> it would incorrectly indicate that the stw instruction at skip_two+8 is
> the last prologue instruction when in fact it is actually the stfd at
> skip_two+16.  (If you wish, I can apply your patch to verify this.)
> 
> Kevin
> 
> 


-- 
Peter Schauer			pes@regent.e-technik.tu-muenchen.de

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]