Bug 4633 - backtracing broken
Summary: backtracing broken
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Masami Hiramatsu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-06-12 19:57 UTC by Frank Ch. Eigler
Modified: 2007-09-12 16:50 UTC
CC List: 0 users

See Also:
Host: x86-64
Target:
Build:
Last reconfirmed:


Attachments
x86-64 backtracing fix patch (267 bytes, patch)
2007-08-20 15:34 UTC, Masami Hiramatsu
enhance backtrace test (1.24 KB, patch)
2007-09-04 21:14 UTC, Masami Hiramatsu

Description Frank Ch. Eigler 2007-06-12 19:57:08 UTC
Bug #3050 may have been closed but the bug did not stay dead.  The same code on
current fc7 kernels gives the usual single line of backtrace info.
The kernel backtracer always seems to do a better job than the code in the runtime.

There are several problems with the code.  It uses unprotected dereference code
like "*stack++", even though the stack values are not completely reliable.  It
does not know how to distinguish between alternative stacks such as the trap stack,
the normal kernel stack, or whatever happens to come in pt_regs.  This is key
because backtrace() should work from both kprobes and from ordinary hook calls
such as timers, begin/end, and markers.  The backtrace() function should not
include the "Inexact backtrace:" string, as this breaks subsequent tokenizing
with print_stack().
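
To illustrate the "unprotected dereference" point, here is a minimal user-space
sketch of a bounded stack scan.  It is not the runtime's code: the function name
scan_stack_bounded and its lo/hi bound parameters are hypothetical, and a real
kernel-side fix would also need a fault-tolerant read, which this sketch does not
attempt.  It only shows the idea of checking every address before "*stack++".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of systemtap): copy up to `max` words
 * starting at the candidate stack pointer `sp` into `out`, reading a word
 * only while it lies entirely inside the known stack bounds [lo, hi).
 * Returns the number of words copied; 0 if `sp` is misaligned or outside
 * the bounds.
 */
static size_t scan_stack_bounded(uintptr_t sp, uintptr_t lo, uintptr_t hi,
                                 uintptr_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);

    /* An unaligned "stack pointer" is not trustworthy; refuse to read. */
    if (sp % sizeof(uintptr_t) != 0)
        return 0;

    /* Check the bounds before every dereference, not just the first. */
    while (n < max && sp >= lo && sp < hi &&
           hi - sp >= sizeof(uintptr_t)) {
        out[n++] = *(const uintptr_t *)sp;
        sp += sizeof(uintptr_t);
    }
    return n;
}
```

The caller has to supply the bounds of whichever stack is actually in use (trap
stack, normal kernel stack, and so on), which is the second problem described
above.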
Comment 1 Frank Ch. Eigler 2007-06-12 20:56:07 UTC
Further information... an analogous problem exists even on i386.
Here, a stack traceback from a kprobe includes a lot of the kprobes
invocation path, but none actually from (above) the context of the
int3 itself.
Comment 2 Masami Hiramatsu 2007-08-20 15:34:54 UTC
Created attachment 1972
x86-64 backtracing fix patch

This patch fixes this bug.
AFAIK, the value (not the address) of rsp is specifying the original stack
address on x86-64.
Comment 3 Frank Ch. Eigler 2007-08-20 19:50:39 UTC
> This patch fixes this bug.
> AFAIK, the value (not the address) of rsp is specifying the original stack
> address on x86-64.

Unfortunately, it's not so easy.  Sometimes (kprobes versus other
event sources?) the &REG_SP value is more correct.  We lack a convincing
set of test cases either way.  Would you mind collecting a set?
Comment 4 Masami Hiramatsu 2007-08-20 20:30:49 UTC
(In reply to comment #3)
> > This patch fixes this bug.
> > AFAIK, the value (not the address) of rsp is specifying the original stack
> > address on x86-64.
> 
> Unfortunately, it's not so easy.  Sometimes (kprobes versus other
> event sources?) the &REG_SP value is more correct.  We lack a convincing
> set of test cases either way.  Would you mind collecting a set?

Could you tell me the actual example which the &REG_SP value is more correct
than REG_SP?
And which event sources can systemtap use?
I only know of kprobe/kretprobe/timer/marker/profile.
Comment 5 Masami Hiramatsu 2007-09-04 21:14:27 UTC
Created attachment 1982
enhance backtrace test

This patch adds return probe and profile probe test cases to the backtrace
test.
Comment 6 Jim Keniston 2007-09-05 18:52:49 UTC
(In reply to comment #4)
> (In reply to comment #3)
...
> > 
> > Unfortunately, it's not so easy.  Sometimes (kprobes versus other
> > event sources?) the &REG_SP value is more correct.  We lack a convincing
> > set of test cases either way.  Would you mind collecting a set?
> 
> Could you tell me the actual example which the &REG_SP value is more correct
> than REG_SP?

I don't know whether this is what Frank is thinking of, but...

On i386, when you take an int3 trap and you're already in kernel mode, the CPU
doesn't save the esp and ss registers.  So the last two words (esp and xss) of
the pt_regs struct that kprobes passes around actually contain the top two words
of the pre-trap stack.  So the pre-trap top-of-stack is &regs->esp.

On x86_64, the CPU saves rsp and ss even if the trap happens in kernel mode, so
the pre-trap top-of-stack is regs->rsp rather than &regs->rsp.
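
As a rough sketch of the distinction above (not the committed patch): struct
fake_regs is a hypothetical, simplified stand-in for the tail of pt_regs, and
the enum and function names are made up for illustration.  It covers only the
case of a trap taken while already in kernel mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the last few words of the kernel's pt_regs. */
struct fake_regs {
    uintptr_t ip;
    uintptr_t sp;   /* esp slot on i386, rsp slot on x86_64 */
    uintptr_t ss;
};

enum fake_arch { FAKE_ARCH_I386, FAKE_ARCH_X86_64 };

/*
 * Return the pre-trap top of stack for a trap taken in kernel mode.
 *
 * i386:   the CPU does not save esp/ss, so the sp and ss slots overlay the
 *         top two words of the pre-trap stack; the top of stack is the
 *         ADDRESS of the sp slot.
 * x86_64: the CPU saves rsp and ss, so the top of stack is the VALUE
 *         stored in the sp slot.
 */
static uintptr_t pretrap_stack_top(const struct fake_regs *regs,
                                   enum fake_arch arch)
{
    assert(regs != NULL);

    if (arch == FAKE_ARCH_I386)
        return (uintptr_t)&regs->sp;
    return regs->sp;
}
```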

Comment 7 Frank Ch. Eigler 2007-09-11 22:37:46 UTC
Many thanks - please commit the changes and the tests.
Comment 8 Masami Hiramatsu 2007-09-12 16:50:52 UTC
OK, the patch was committed.