[+rfc] Re: [patch v6 00/21] record-btrace: reverse
Metzger, Markus T
markus.t.metzger@intel.com
Thu Nov 28 10:54:00 GMT 2013
> -----Original Message-----
> From: Jan Kratochvil [mailto:jan.kratochvil@redhat.com]
> Sent: Wednesday, November 27, 2013 7:57 PM
> To: Metzger, Markus T
> Cc: gdb-patches@sourceware.org
> Subject: Re: [+rfc] Re: [patch v6 00/21] record-btrace: reverse
>
> On Thu, 07 Nov 2013 16:41:40 +0100, Metzger, Markus T wrote:
> > I hacked a first prototype of this (see below). It passes most tests but
> > results in three fails in the record_goto suite.
> >
> > One thing that it shows, though, is that it only removes the 'mostly harmless'
> > hack in the various goto functions shown above.
> >
> > The more serious hacks in record_btrace_start_replaying
> >
> > /* Make sure we're not using any stale registers. */
> > registers_changed_ptid (tp->ptid);
> >
> > /* We just started replaying. The frame id cached for stepping is based
> > on unwinding, not on branch tracing. Recompute it. */
> > frame = get_current_frame_nocheck ();
> > insn = btrace_insn_get (replay);
> > sal = find_pc_line (insn->pc, 0);
> > set_step_info (frame, sal);
> >
> > and record_btrace_stop_replaying
> >
> > /* Make sure we're not leaving any stale registers. */
> > registers_changed_ptid (tp->ptid);
> >
> > however, are not removed by this.
>
> In such case it is not finished. These hacks should not be needed.
See below.
> > They are required when reverse-stepping the first time or when
> > stepping past the end of the execution trace.
>
> I have patched what you describe as the problem. But as I do not have a box
> with reliably working BTS so it is difficult for me to say whether it works or
> not. I can look at other problems if you describe them from a reliable box.
Those hacks are not related to "record goto" and are thus also not affected
by the patch to implement "record goto" via wait/resume.
Let me try to describe the problem. It is also exposed by the next.exp test.
Assume we enable btrace and next over a function call. We will end up
right after the call instruction.
(gdb) record btrace
(gdb) n
50 return 0; /* main.3 */
(gdb) record instruction-history -
31 0x0000000000400590 <fun1+0>: push %rbp
32 0x0000000000400591 <fun1+1>: mov %rsp,%rbp
33 0x0000000000400594 <fun1+4>: leaveq
34 0x0000000000400595 <fun1+5>: retq
35 0x000000000040059f <fun2+9>: leaveq
36 0x00000000004005a0 <fun2+10>: retq
37 0x00000000004005af <fun3+14>: leaveq
38 0x00000000004005b0 <fun3+15>: retq
39 0x00000000004005c4 <fun4+19>: leaveq
40 0x00000000004005c5 <fun4+20>: retq
(gdb) disas
Dump of assembler code for function main:
0x00000000004005c6 <+0>: push %rbp
0x00000000004005c7 <+1>: mov %rsp,%rbp
0x00000000004005ca <+4>: callq 0x4005b1 <fun4>
=> 0x00000000004005cf <+9>: mov $0x0,%eax
0x00000000004005d4 <+14>: leaveq
0x00000000004005d5 <+15>: retq
End of assembler dump.
(gdb)
If we now do a reverse-next, we end up inside the function
we were supposed to step over.
(gdb) reverse-next
fun4 () at record_goto.c:44
44 } /* fun4.5 */
(gdb) record instruction-history -
30 0x000000000040059a <fun2+4>: callq 0x400590 <fun1>
31 0x0000000000400590 <fun1+0>: push %rbp
32 0x0000000000400591 <fun1+1>: mov %rsp,%rbp
33 0x0000000000400594 <fun1+4>: leaveq
34 0x0000000000400595 <fun1+5>: retq
35 0x000000000040059f <fun2+9>: leaveq
36 0x00000000004005a0 <fun2+10>: retq
37 0x00000000004005af <fun3+14>: leaveq
38 0x00000000004005b0 <fun3+15>: retq
39 => 0x00000000004005c4 <fun4+19>: leaveq
(gdb) disas
Dump of assembler code for function fun4:
0x00000000004005b1 <+0>: push %rbp
0x00000000004005b2 <+1>: mov %rsp,%rbp
0x00000000004005b5 <+4>: callq 0x400590 <fun1>
0x00000000004005ba <+9>: callq 0x400596 <fun2>
0x00000000004005bf <+14>: callq 0x4005a1 <fun3>
=> 0x00000000004005c4 <+19>: leaveq
0x00000000004005c5 <+20>: retq
End of assembler dump.
(gdb)
The reason is the way GDB implements next/reverse-next.
We store the frame_id of the current frame and do a single-step.
Then we try to detect stepping into a subroutine by unwinding
the stack frames and comparing the frame_id's with our stored
frame_id.
The stored frame_id has been computed using dwarf2 frame
unwind.
After single-stepping, we're replaying the recorded execution.
The frame_id's are now computed using btrace frame unwind.
Our parent's frame_id does not compare equal to the stored
frame_id. We fail to detect that we just reverse-stepped into
a subroutine.
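
To illustrate the mismatch, here is a minimal sketch in C. It is a toy model, not GDB's actual code: the type toy_frame_id, the from_btrace tag, the stack address value, and all function names are made up for illustration. Only the address of main (0x4005c6) is taken from the disassembly above. The sketch assumes that an id built from the branch trace carries no stack address, while a dwarf2-style id does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a frame id; GDB's real struct frame_id has more fields. */
struct toy_frame_id {
    uint64_t stack_addr;   /* known to a dwarf2-style unwinder only */
    uint64_t code_addr;    /* start address of the function */
    bool     from_btrace;  /* hypothetical tag: which unwinder built the id */
};

static bool toy_frame_id_eq(struct toy_frame_id a, struct toy_frame_id b)
{
    return a.from_btrace == b.from_btrace
        && a.stack_addr == b.stack_addr
        && a.code_addr == b.code_addr;
}

/* Id of main's frame as computed before replaying (stack-based). */
static struct toy_frame_id dwarf2_id_for_main(void)
{
    struct toy_frame_id id = { 0x7ffc0000e000ULL, 0x4005c6, false };
    return id;
}

/* Id of the same frame as computed while replaying (trace-based,
   assumed here to have no stack address available). */
static struct toy_frame_id btrace_id_for_main(void)
{
    struct toy_frame_id id = { 0, 0x4005c6, true };
    return id;
}

/* The stepping check: after a single step, does the caller of the
   current frame match the frame we started stepping in? */
static bool stepped_into_subroutine(struct toy_frame_id stored_original,
                                    struct toy_frame_id caller_of_current)
{
    return toy_frame_id_eq(stored_original, caller_of_current);
}
```

With the stored id from dwarf2_id_for_main() and the caller id from btrace_id_for_main(), stepped_into_subroutine() returns false although both describe main's frame. This corresponds to the reverse-next above stopping inside fun4.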
The s/w record implementation does not suffer from this problem
because it traces data and is hence able to use the dwarf2 frame
unwinder also when replaying.
The way I tried to overcome this is to recompute all frame_id's
when we start replaying. This will cause us to store a btrace
frame_id in the stepping algorithm. Now we are able to detect
that we reverse-stepped into a subroutine.
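
The effect of recomputing on replay start can be sketched as follows. Again, this is a toy model under stated assumptions, not the code from the patch: toy_thread, toy_set_step_info, toy_start_replaying and the recompute flag are hypothetical names that only mirror the idea of calling set_step_info again in record_btrace_start_replaying.

```c
#include <stdbool.h>

/* Which unwinder produced a frame id (toy model). */
enum toy_unwinder { TOY_DWARF2, TOY_BTRACE };

struct toy_thread {
    bool replaying;                     /* are we replaying the trace? */
    enum toy_unwinder step_id_source;   /* unwinder behind the stored stepping id */
};

/* While replaying, frame ids come from the branch trace. */
static enum toy_unwinder toy_active_unwinder(const struct toy_thread *tp)
{
    return tp->replaying ? TOY_BTRACE : TOY_DWARF2;
}

/* Cache the stepping frame id using whatever unwinder is active now. */
static void toy_set_step_info(struct toy_thread *tp)
{
    tp->step_id_source = toy_active_unwinder(tp);
}

/* Start replaying; optionally recompute the cached stepping id. */
static void toy_start_replaying(struct toy_thread *tp, bool recompute)
{
    tp->replaying = true;
    if (recompute)
        toy_set_step_info(tp);
}

/* The stepping comparison is only meaningful if the stored id and the
   freshly unwound ids come from the same unwinder. */
static bool toy_step_check_is_consistent(const struct toy_thread *tp)
{
    return tp->step_id_source == toy_active_unwinder(tp);
}
```

Without the recompute, the stored id stays a dwarf2 id while all new ids are btrace ids, so the comparison cannot succeed. With it, both sides come from the same unwinder.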
Do you have a better idea?
Regards,
Markus.