[PATCH] gdb: Don't skip prologue for explicit line breakpoints in assembler

Andrew Burgess andrew.burgess@embecosm.com
Thu Jun 20 23:23:00 GMT 2019


* Andrew Burgess <andrew.burgess@embecosm.com> [2019-06-20 21:57:59 +0100]:

> * Kevin Buettner <kevinb@redhat.com> [2019-06-19 18:11:47 -0700]:
> 
> > On Wed, 12 Jun 2019 13:34:03 +0100
> > Andrew Burgess <andrew.burgess@embecosm.com> wrote:
> > 
> > > It was observed that in some cases, placing a breakpoint in an
> > > assembler file using filename:line-number syntax would result in the
> > > breakpoint being placed at a different line within the file.
> > > 
> > > For example, consider this x86-64 assembler:
> > > 
> > >     test:
> > >             push   %rbp		/* Break here.  */
> > >             mov    %rsp, %rbp
> > >             nop			/* Stops here.  */
> > > 
> > > The user places the breakpoint using file:line notation targeting the
> > > line marked 'Break here', GDB actually stops at the line marked 'Stops
> > > here'.
> > > 
> > > The reason is that the label 'test' is identified as the likely start
> > > of a function, and the call to symtab.c:skip_prologue_sal causes GDB
> > > to skip forward over the instructions that GDB believes to be part of
> > > the prologue.
> > > 
> > > I believe however, that when debugging assembler code, where the user
> > > has instruction-by-instruction visibility, if they ask for a specific
> > > line, GDB should (as far as possible) stop on that line, and not
> > > perform any prologue skipping.  I don't believe that the behaviour of
> > > higher level languages should change, in these cases skipping the
> > > prologue seems like the correct thing to do.
> > 
> > I agree with all of this.
> > 
> > > In order to implement this change I needed to extend our current
> > > tracking of when the user has requested an explicit line number.  We
> > > already tracked this in some cases, but not in others (see the changes
> > > in linespec.c).  However, once I did this I started to see some
> > > additional failures (in tests gdb.base/break-include.exp
> > > gdb.base/ending-run.exp gdb.mi/mi-break.exp gdb.mi/mi-reverse.exp
> > > gdb.mi/mi-simplerun.exp) where we currently expected a breakpoint
> > > placed at one file and line number to be updated to reference a
> > > different line number, this was fixed by removing some code in
> > > symtab.c:skip_prologue_sal.  My concern here is that removing this
> > > check didn't cause anything else to fail.
> > 
> > Did you investigate the reason for the failures with this hunk
> > left in place?...
> > 
> > > @@ -3812,12 +3821,6 @@ skip_prologue_sal (struct symtab_and_line *sal)
> > >  
> > >    sal->pc = pc;
> > >    sal->section = section;
> > > -
> > > -  /* Unless the explicit_line flag was set, update the SAL line
> > > -     and symtab to correspond to the modified PC location.  */
> > > -  if (sal->explicit_line)
> > > -    return;
> > > -
> > >    sal->symt> FAIL: gdb.base/break-include.exp: break break-include.c:53
> > ab = start_sal.symtab;
> > >    sal->line = start_sal.line;
> > >    sal->end = start_sal.end;
> > 
> > The rest of the patch looks fine to me.  Deleting those lines might
> > be okay also, but I'd like to understand why adding additional
> > explicit line number tracking caused these failures:
> > 
> > FAIL: gdb.base/break-include.exp: break break-include.c:53
> > FAIL: gdb.base/ending-run.exp: Cleared 2 by line
> > FAIL: gdb.base/ending-run.exp: clear 2 by default
> 
> Thanks for taking a look at this.
> 
> Just to be clear, this is the hunk that is in question (in symtab.c):
> 
>   @@ -3812,12 +3821,6 @@ skip_prologue_sal (struct symtab_and_line *sal)
> 
>      sal->pc = pc;
>      sal->section = section;
>   -
>   -  /* Unless the explicit_line flag was set, update the SAL line
>   -     and symtab to correspond to the modified PC location.  */
>   -  if (sal->explicit_line)
>   -    return;
>   -
>      sal->symtab = start_sal.symtab;
>      sal->line = start_sal.line;
>      sal->end = start_sal.end;
> 
> 
> If we take the first of these 'gdb.base/break-include.exp: break
> break-include.c:53', then in a pre-patched test run the log looks like
> this:
> 
>   (gdb) break break-include.c:53
>   Breakpoint 1 at 0x4004af: file /path/to/gdb/src/gdb/testsuite/gdb.base/break-include.inc, line 18.
>   (gdb) PASS: gdb.base/break-include.exp: break break-include.c:53
> 
> And in a post-patch world, but with the hunk above removed here's the
> same part of the log file:
> 
>   (gdb) break break-include.c:53
>   Breakpoint 1 at 0x4004af: file /path/to/gdb/src/gdb/testsuite/gdb.base/break-include.c, line 54.
>   (gdb) FAIL: gdb.base/break-include.exp: break break-include.c:53
> 
> What we see is that in the failing case the line number has failed to
> update from break-include.c:53 to break-include.inc:18, instead it has
> updated to break-include.c:54.
> 
> To understand what's going on we need to consider two steps, first in
> convert_linespec_to_sals the file and line is used to index into the
> line table, this is where the break-include.c:54 comes from, there is
> no entry for line 53, and 54 is the next available line.
> 
> Then skip_prologue_sal is called, this is where we move forward to
> break-include.inc line 18, and this is where the difference is.
> 
> In the pre-patch world, the sal that is passe into skip_prologue_sal
> is not marked as explicit_line, and so we successfully update the file
> and line.
> 
> In the post patch world, the sal now is marked as explicit_line, so
> the above check triggers and GDB doesn't update the file/line in the
> sal, this leaves us stuck on break-include.c:54.
> 
> The other two failures all stem from the exact same problem, in
> gdb.base/ending-run.exp many breakpoints are placed, and then cleared
> using the 'clear' command.  One of the breakpoints (breakpoint 4) is
> placed at the wrong file/line as a result of not updating, this then
> causes the 'clear' tests to not clear the expected breakpoints.
> 
> What worries more about the above hunk is that it never triggers
> during testing (in a pre-patched world).  I applied this patch to
> master:
> 
>   diff --git a/gdb/symtab.c b/gdb/symtab.c
>   index 4920d94a247..5cd5bb69147 100644
>   --- a/gdb/symtab.c
>   +++ b/gdb/symtab.c
>   @@ -3816,7 +3816,11 @@ skip_prologue_sal (struct symtab_and_line *sal)
>      /* Unless the explicit_line flag was set, update the SAL line
>         and symtab to correspond to the modified PC location.  */
>      if (sal->explicit_line)
>   -    return;
>   +    {
>   +      fprintf ("Got here!\n");

Ooops, that should be 'fprintf (stderr, "Got here!\n");' of course.

>   +      abort ();
>   +      return;
>   +    }
>    
>      sal->symtab = start_sal.symtab;
>      sal->line = start_sal.line;
> 
> And all the tests still pass.  This code has been in place for ~9
> years and unfortunately didn't have any tests associated when it was
> added.
> 
> I've spent some time trying to figure out what conditions might need
> to be true in order to trigger this code, but so far I've not managed
> to figure it out - any suggestions would be appreciated.

I spent some more time trying to find a path that would call both
'decode_digits_list_mode' and then 'skip_prologue_sal', but I still
can't find one.

Looking back at how the explicit_line flag was originally used when
it was added in commit ed0616c6b78a0966, things have changed quite a
bit in the 10+ years since.  There were some tests added along with
the explicit_line flag (gdb.cp/mb-*.exp) and these all pass both in
master and in my patched branch.

My current thinking is that the explicit_line flag was no longer doing
anything useful in master, but if someone disagrees I'd love to
understand more about this.

Thanks,
Andrew



More information about the Gdb-patches mailing list