[PATCH 1/2 version 3] fix for gdb.reverse/finish-precsave.exp and gdb.reverse/finish-reverse.exp

Tue Jan 24 14:08:05 GMT 2023

On 2023-01-23 9:13 p.m., Carl Love wrote:
> Pedro:
> 
> On Mon, 2023-01-23 at 19:17 +0000, Pedro Alves wrote:
>>> Currently on X86, when executing the finish command in reverse, gdb
>>> does a
>>> single step from the first instruction in the callee to get back to
>>> the
>>> caller.  GDB stops on the last instruction in the source code line
>>> where
>>> the call was made.  When stopped at the last instruction of the
>>> source code
>>> line, a reverse next or step command will stop at the first
>>> instruction
>>> of the same source code line thus requiring two step/next commands
>>> to
>>> reach the previous source code line.  It should only require one
>>> step/next
>>> command to reach the previous source code line.
>>>
>>> By contrast, a reverse next or step command from the first line in
>>> a
>>> function stops at the first instruction in the source code line
>>> where the
>>> call was made.
>>
>> I'd think this was on purpose.  Note that next/step/reverse-
>> {next/step} are line-oriented
>> stepping commands, they step/next until the previous/next line. 
>> While "finish" is described
>> as undoing the _function call_.
>>
>> The manual says:
>>
>>  reverse-finish
>>  Just as the finish command takes you to the point where the current
>> function returns,
>>  reverse-finish takes you to the point where it was called. Instead
>> of ending up at the end of
>>  the current function invocation, you end up at the beginning.
>>
>> Say you have a line with multiple statements involving multiple
>> function calls.
>> The simplest would be:
>>
>>   func1 ();  func2 ();
>>
>> Say you'd stopped inside 'func2'.  If you do finish there, in current
>> master gdb
>> stops at the call to 'func2', and you can then decide to reverse step
>> into 'func1'.
> 
> I don't think you followed the issue.  

Totally possible!

> 
> So, if you are in func2 and do a reverse-finish, without the patch gdb
> stops on the last instruction for the line that calls func2.  

Right.

> Now if
> you issue a reverse-step, you stop at the first instruction for the
> call to func2, i.e. you are still on the same source code line.  

Wait.  That right there sounds bogus.  The source line looks like:

   func1 ();  func2 ();

so stepping backwards over that line should always stop at the first
instruction of the line, not in the middle.  Let's simplify this.

Here's the full source code of my example:

(gdb) list 1
1       void func1 ()
2       {
3       }
4
5       void func2 ()
6       {
7       }
8
9       int main ()
10      {
11        func1 (); func2 ();
12      }

Compiled with:

 $ gcc reverse.c -o reverse -g3 -O0
 $ gcc -v
 ...
 gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)

Now let's debug it with target record, using current gdb git master (f3d8ae90b236),
without your patch:

 $ gdb ~/reverse
 GNU gdb (GDB) 14.0.50.20230124-git
 ...
 Reading symbols from /home/pedro/reverse...
 (gdb) start
 Temporary breakpoint 1 at 0x1147: file reverse.c, line 11.
 Starting program: /home/pedro/reverse 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

 Temporary breakpoint 1, main () at reverse.c:11
 11        func1 (); func2 ();
 (gdb) record 

 (gdb) disassemble /s
 Dump of assembler code for function main:
 reverse.c:
 10      {
    0x000055555555513f <+0>:     endbr64
    0x0000555555555143 <+4>:     push   %rbp
    0x0000555555555144 <+5>:     mov    %rsp,%rbp

 11        func1 (); func2 ();
 => 0x0000555555555147 <+8>:     mov    $0x0,%eax
    0x000055555555514c <+13>:    call   0x555555555129 <func1>
    0x0000555555555151 <+18>:    mov    $0x0,%eax
    0x0000555555555156 <+23>:    call   0x555555555134 <func2>
    0x000055555555515b <+28>:    mov    $0x0,%eax

 12      }
    0x0000555555555160 <+33>:    pop    %rbp
    0x0000555555555161 <+34>:    ret
 End of assembler dump.

 (gdb) n
 12      }

So far so good, a "next" stepped over the whole of line 11 and stopped at line 12.

Let's confirm where we are now:

 (gdb) disassemble /s
 Dump of assembler code for function main:
 reverse.c:
 10      {
    0x000055555555513f <+0>:     endbr64
    0x0000555555555143 <+4>:     push   %rbp
    0x0000555555555144 <+5>:     mov    %rsp,%rbp

 11        func1 (); func2 ();
    0x0000555555555147 <+8>:     mov    $0x0,%eax
    0x000055555555514c <+13>:    call   0x555555555129 <func1>
    0x0000555555555151 <+18>:    mov    $0x0,%eax
    0x0000555555555156 <+23>:    call   0x555555555134 <func2>
    0x000055555555515b <+28>:    mov    $0x0,%eax

 12      }
 => 0x0000555555555160 <+33>:    pop    %rbp
    0x0000555555555161 <+34>:    ret
 End of assembler dump.

Good, we're at the first instruction of line 12.

Now let's undo the "next", with "reverse-next":

 (gdb) reverse-next
 11        func1 (); func2 ();

Seemingly stopped at line 11.  Let's see exactly where:

 (gdb) disassemble /s
 Dump of assembler code for function main:
 reverse.c:
 10      {
    0x000055555555513f <+0>:     endbr64
    0x0000555555555143 <+4>:     push   %rbp
    0x0000555555555144 <+5>:     mov    %rsp,%rbp

 11        func1 (); func2 ();
    0x0000555555555147 <+8>:     mov    $0x0,%eax
    0x000055555555514c <+13>:    call   0x555555555129 <func1>
 => 0x0000555555555151 <+18>:    mov    $0x0,%eax
    0x0000555555555156 <+23>:    call   0x555555555134 <func2>
    0x000055555555515b <+28>:    mov    $0x0,%eax

 12      }
    0x0000555555555160 <+33>:    pop    %rbp
    0x0000555555555161 <+34>:    ret
 End of assembler dump.
 (gdb) 

And lo, we stopped in the middle of line 11!  That is a bug, we should have stepped
back all the way to the beginning of the line.  The "reverse-next" should have fully
undone the prior "next" command.  Here's the same thing without the distracting
disassemble commands:

 (gdb) b 11
 Breakpoint 1 at 0x1147: file reverse.c, line 11.
 (gdb) r
 Starting program: /home/pedro/reverse 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

 Breakpoint 1, main () at reverse.c:11
 11        func1 (); func2 ();
 (gdb) p $pc
 $1 = (void (*)()) 0x555555555147 <main+8>
 (gdb) record 
 (gdb) next
 12      }
 (gdb) reverse-next
 11        func1 (); func2 ();
 (gdb) p $pc
 $2 = (void (*)()) 0x555555555151 <main+18>
 (gdb) 

This:

 next -> reverse-next -> next -> reverse-next

... should leave you at the same instruction.  But it doesn't in this example!

How does this happen?  Let's look at the line table as seen by GDB:

 (gdb) maint info line-table reverse.c
 objfile: /home/pedro/reverse ((struct objfile *) 0x55dd5df77c50)
 compunit_symtab: reverse.c ((struct compunit_symtab *) 0x55dd5de6b2e0)
 symtab: /home/pedro/reverse.c ((struct symtab *) 0x55dd5de6b360)
 linetable: ((struct linetable *) 0x55dd5dfd3290):
 INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END 
 0      2      0x0000555555555129 Y                    
 1      3      0x0000555555555131 Y                    
 2      6      0x0000555555555134 Y                    
 3      7      0x000055555555513c Y                    
 4      10     0x000055555555513f Y                    
 5      11     0x0000555555555147 Y                      <<< here 
 6      11     0x0000555555555151 Y                      <<< here
 7      12     0x0000555555555160 Y                    
 8      END    0x0000555555555162 Y                    

Ah, there are two entries for line 11, both marked with IS-STMT.  So when
stepping backward GDB only considered the region with index 6, that's why it
stopped at 0x0000555555555151.

Let's look at what we get with clang instead (Ubuntu clang version 14.0.0-1ubuntu1) :

 (gdb) maintenance info line-table reverse.c
 objfile: /home/pedro/reverse.clang ((struct objfile *) 0x5576be591ca0)
 compunit_symtab: reverse.c ((struct compunit_symtab *) 0x5576be485300)
 symtab: /home/pedro/reverse.c ((struct symtab *) 0x5576be485380)
 linetable: ((struct linetable *) 0x5576be5ec8e0):
 INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END 
 0      2      0x0000555555555130 Y                    
 1      3      0x0000555555555134 Y       Y            
 2      6      0x0000555555555140 Y                    
 3      7      0x0000555555555144 Y       Y            
 4      10     0x0000555555555150 Y                    
 5      11     0x0000555555555154 Y       Y            
 6      11     0x0000555555555159                      
 7      12     0x000055555555515e Y                    
 8      END    0x0000555555555162 Y                    

Note no IS-STMT for the second range.  And let's look at how GDB behaves with it:

 (gdb) b 11
 Breakpoint 1 at 0x1154: file reverse.c, line 11.
 (gdb) r
 Starting program: /home/pedro/reverse.clang 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

 Breakpoint 1, main () at reverse.c:11
 11        func1 (); func2 ();
 (gdb) record
 (gdb) p $pc
 $1 = (void (*)()) 0x555555555154 <main+4>
 (gdb) n
 12      }
 (gdb) reverse-next

 No more reverse-execution history.
 main () at reverse.c:11
 11        func1 (); func2 ();
 (gdb) p $pc
 $2 = (void (*)()) 0x555555555154 <main+4>
 (gdb) 

Bingo.  reverse-next fully stepped the whole line backwards.

> You
> have not stepped back into func1 like you wanted to.  Now you have to
> issue a second reverse-step to get into func1.   With the patch, you
> issue the reverse-finish from func2 and stop at the first instruction
> in the line that calls func2.  Now when you issue the reverse-step you
> step back into func1 as expected.
> 

So this fix is assuming that the reverse step stops in the middle of a
line, which I think is the real bug to fix.  Once that is fixed, then
you will no longer need two reverse-steps after reverse-finish.

Here's what you get with current master without your patch, but using
the test program compiled with clang.  Actually, let's use a slightly
modified program to force clang to emit some instructions between
the two calls.  Like this:

 $ cat reverse.c
 void func1 (int i)
 {
 }

 void func2 (int i)
 {
 }

 int main (int argc, char **argv)
 {
   func1 (argc); func2 (argc);
 }

Alright, here's the gdb session, with clang, no gdb patch:

 Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdc48) at reverse.c:11
 11        func1 (argc); func2 (argc);
 (gdb) disassemble /s
 Dump of assembler code for function main:
 reverse.c:
 10      {
    0x0000555555555150 <+0>:     push   %rbp
    0x0000555555555151 <+1>:     mov    %rsp,%rbp
    0x0000555555555154 <+4>:     sub    $0x10,%rsp
    0x0000555555555158 <+8>:     mov    %edi,-0x4(%rbp)
    0x000055555555515b <+11>:    mov    %rsi,-0x10(%rbp)

 11        func1 (argc); func2 (argc);
 => 0x000055555555515f <+15>:    mov    -0x4(%rbp),%edi
    0x0000555555555162 <+18>:    call   0x555555555130 <func1>
    0x0000555555555167 <+23>:    mov    -0x4(%rbp),%edi
    0x000055555555516a <+26>:    call   0x555555555140 <func2>

 12      }
    0x000055555555516f <+31>:    xor    %eax,%eax
    0x0000555555555171 <+33>:    add    $0x10,%rsp
    0x0000555555555175 <+37>:    pop    %rbp
    0x0000555555555176 <+38>:    ret
 End of assembler dump.
 (gdb) record 
 (gdb) b func2
 Breakpoint 2 at 0x555555555147: file reverse.c, line 7.
 (gdb) c
 Continuing.

 Breakpoint 2, func2 (i=1) at reverse.c:7
 7       }
 (gdb) reverse-finish 
 Run back to call of #0  func2 (i=1) at reverse.c:7
 0x000055555555516a in main (argc=1, argv=0x7fffffffdc48) at reverse.c:11
 11        func1 (argc); func2 (argc);
 (gdb) disassemble /s
 Dump of assembler code for function main:
 reverse.c:
 10      {
    0x0000555555555150 <+0>:     push   %rbp
    0x0000555555555151 <+1>:     mov    %rsp,%rbp
    0x0000555555555154 <+4>:     sub    $0x10,%rsp
    0x0000555555555158 <+8>:     mov    %edi,-0x4(%rbp)
    0x000055555555515b <+11>:    mov    %rsi,-0x10(%rbp)

 11        func1 (argc); func2 (argc);
    0x000055555555515f <+15>:    mov    -0x4(%rbp),%edi
    0x0000555555555162 <+18>:    call   0x555555555130 <func1>
    0x0000555555555167 <+23>:    mov    -0x4(%rbp),%edi
 => 0x000055555555516a <+26>:    call   0x555555555140 <func2>

 12      }
    0x000055555555516f <+31>:    xor    %eax,%eax
    0x0000555555555171 <+33>:    add    $0x10,%rsp
    0x0000555555555175 <+37>:    pop    %rbp
    0x0000555555555176 <+38>:    ret
 End of assembler dump.
 (gdb) reverse-step
 func1 (i=1) at reverse.c:3
 3       }
 (gdb) 

Note how a single reverse-step after the reverse-finish immediately
stepped backwards into func1.  Exactly how I describing it originally.

With your patch, you'd break reverse-finish with clang:

 (gdb) record 
 (gdb) b func2
 Breakpoint 2 at 0x555555555147: file reverse.c, line 7.
 (gdb) c
 Continuing.

 Breakpoint 2, func2 (i=1) at reverse.c:7
 7       }
 (gdb) reverse-finish 
 Run back to call of #0  func2 (i=1) at reverse.c:7
 func1 (i=1) at reverse.c:3
 3       }
 (gdb) 

GDB stopped at line 3, info func1 which means it stepped too far without
stopping at func2's call site.

GDB is misbehaving with GCC's debug info.  I suspect the reason we get
the two line entries for line 11 and both with IS-STMT is because GCC emits
column debug info nowadays by default.   Here's what e.g.,
"llvm-dwarfdump --debug-line reverse" shows:

 ~~~
 Address            Line   Column File   ISA Discriminator Flags
 ------------------ ------ ------ ------ --- ------------- -------------
 0x0000000000001129      2      1      1   0             0  is_stmt
 0x0000000000001131      3      1      1   0             0  is_stmt
 0x0000000000001134      6      1      1   0             0  is_stmt
 0x000000000000113c      7      1      1   0             0  is_stmt
 0x000000000000113f     10      1      1   0             0  is_stmt
 0x0000000000001147     11      3      1   0             0  is_stmt
 0x0000000000001151     11     13      1   0             0  is_stmt
 0x0000000000001160     12      1      1   0             0  is_stmt
 0x0000000000001162     12      1      1   0             0  is_stmt end_sequence
 ~~~

We can try disabling that with -gno-column-info, let's see what we get:

 (gdb) maint info line-table reverse.c
 objfile: /home/pedro/reverse.nocol ((struct objfile *) 0x5611464f6c10)
 compunit_symtab: reverse.c ((struct compunit_symtab *) 0x5611463ea2e0)
 symtab: /home/pedro/reverse.c ((struct symtab *) 0x5611463ea360)
 linetable: ((struct linetable *) 0x561146474c80):
 INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END 
 0      2      0x0000555555555129 Y                    
 1      3      0x0000555555555134 Y                    
 2      6      0x0000555555555137 Y                    
 3      7      0x0000555555555142 Y                    
 4      10     0x0000555555555145 Y                    
 5      11     0x0000555555555158 Y                    
 6      12     0x0000555555555171 Y                    
 7      END    0x0000555555555173 Y                    

 (gdb) 

... and in llvm-dwarfdump:

 Address            Line   Column File   ISA Discriminator Flags
 ------------------ ------ ------ ------ --- ------------- -------------
 0x0000000000001129      2      0      1   0             0  is_stmt
 0x0000000000001134      3      0      1   0             0  is_stmt
 0x0000000000001137      6      0      1   0             0  is_stmt
 0x0000000000001142      7      0      1   0             0  is_stmt
 0x0000000000001145     10      0      1   0             0  is_stmt
 0x0000000000001158     11      0      1   0             0  is_stmt
 0x0000000000001171     12      0      1   0             0  is_stmt
 0x0000000000001173     12      0      1   0             0  is_stmt end_sequence

Bingo.  With no column info, only one entry for line 11.

Pedro Alves

>> While with your change, reverse-finish in func2 will make gdb stop at
>> the beginning
>> of the line, before the call to func1.
>>
>> Thus I'm really not convinced (yet) this change is a good one.  It
>> removes
>> a user choice.  You can always do reverse-next/step do get what you
>> want
>> reverse-finish do to, already.
> 
>                                     Carl 
>