Bug 4285 - improve kretprobes BUG message
Summary: improve kretprobes BUG message
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: kprobes
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Ananth Mavinakayanahalli
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-27 03:29 UTC by Frank Ch. Eigler
Modified: 2007-05-10 04:50 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Printk details of buggy retprobe (325 bytes, patch)
2007-03-30 13:08 UTC, Ananth Mavinakayanahalli

Description Frank Ch. Eigler 2007-03-27 03:29:51 UTC
On a pre-fc7-test vmware image, I sometimes see this kernel BUG:

kernel BUG at arch/i386/kernel/kprobes.c:452!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /module/uhci_hcd/sections/.text
Modules linked in: stap_ec0f35607d3d65b494254c869904aa15_693632(U) xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables x_tables hidp
[...]
EFLAGS: 00010046   (2.6.20-1.2986.fc7 #1)
EIP is at trampoline_handler+0xeb/0x137
eax: 00000000   ebx: 00000000   ecx: 00000000   edx: 00000000
esi: de14cac0   edi: c1a31ef8   ebp: c1a31ef0   esp: c1a31ed4
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Process migration/0 (pid: 2, ti=c1a31000 task=ed34e030 task.ti=c1a31000)
Stack: 00000096 00000000 00000000 00000000 00000000 de14cac0 00000000 cb9dde28 
       c06161fa 00000000 00000033 de14cc70 de14cac0 00000000 cb9dde28 de14cac0 
       0000007b 0000007b 000000d8 ffffffff c06161e4 00000060 00000082 c1a31f94 
Call Trace:
 [<c04061ed>] show_trace_log_lvl+0x1a/0x2f
 [<c040629d>] show_stack_log_lvl+0x9b/0xa3
 [<c040645d>] show_registers+0x1b8/0x289
 [<c040665b>] die+0x12d/0x242
 [<c0615817>] do_trap+0x79/0x91
 [<c0406bfe>] do_invalid_op+0x97/0xa1
 [<c06155ec>] error_code+0x7c/0x84
 [<c06161fa>] kretprobe_trampoline+0x16/0x30

From this information, one cannot tell which kprobe/kretprobe might have been
associated with the problem.  It would be helpful if any such assertion-type
messages also printed out information such as the kprobe* or kretprobe/instance
addresses, as those can be mapped back to systemtap modules (with some effort).
Comment 1 Ananth Mavinakayanahalli 2007-03-27 06:50:13 UTC
This is related to the generic Linux Kernel BUG handling mechanism and as such
is not directly related to Kprobes. While it is a good idea to have information
about which particular kprobe caused the problem, IMO, it's not something we can
fix from the kprobes layer.
Comment 2 Maneesh Soni 2007-03-27 12:36:01 UTC
(In reply to comment #1)
> This is related to the generic Linux Kernel BUG handling mechanism and as such
> is not directly related to Kprobes. While it is a good idea to have information
> about which particular kprobe caused the problem, IMO, it's not something we can
> fix from the kprobes layer.

Probably Frank just needs a custom printk message before the BUG_ON is hit?

BTW, "esi" has the pointer to kretprobe_instance. I am not sure if just this
info is useful or not.
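
For illustration, the idea of printing identifying details just before the assertion fires can be sketched in user-space C. This is only a rough sketch and not the attached patch: the struct layouts are simplified stand-ins, the message wording and the names format_retprobe_diag and check_return_address are hypothetical, and fputs()/abort() stand in for the kernel's printk()/BUG().

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct kretprobe {
    void *kp_addr;            /* address of the probed function */
};

struct kretprobe_instance {
    struct kretprobe *rp;     /* kretprobe that owns this instance */
    void *ret_addr;           /* saved real return address */
};

/*
 * Build the diagnostic line that would be emitted before the BUG.
 * Returns the snprintf() result (characters that would be written).
 */
int format_retprobe_diag(char *buf, size_t len,
                         const struct kretprobe_instance *ri)
{
    if (ri == NULL)
        return snprintf(buf, len, "kretprobe BUG!: no instance found\n");

    return snprintf(buf, len,
                    "kretprobe BUG!: instance %p, kretprobe %p, probed addr %p\n",
                    (void *)ri, (void *)ri->rp,
                    ri->rp ? ri->rp->kp_addr : NULL);
}

/*
 * Mimics the failing check: if no real return address was recovered
 * (it is NULL or still points at the trampoline), report which
 * instance was being processed and then stop.
 */
void check_return_address(const struct kretprobe_instance *ri,
                          const void *orig_ret_addr,
                          const void *trampoline)
{
    if (orig_ret_addr == NULL || orig_ret_addr == trampoline) {
        char msg[160];

        format_retprobe_diag(msg, sizeof msg, ri);
        fputs(msg, stderr);   /* printk() in the kernel */
        abort();              /* BUG() in the kernel */
    }
}
```

With the addresses in the log, the kretprobe pointer could be matched against the probes a given systemtap module registered, which the register dump alone does not allow without extra work.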
Comment 3 Ananth Mavinakayanahalli 2007-03-30 13:08:28 UTC
Created attachment 1659
Printk details of buggy retprobe

Is this what you had in mind, Frank? Compile tested only.
Comment 4 Frank Ch. Eigler 2007-03-30 13:55:48 UTC
Yes, that would be helpful.
Comment 5 Jim Keniston 2007-03-30 17:34:10 UTC
As long as you're tweaking that piece of code, you could add a little message --
e.g., "Weird return from kretprobed function; can't find real return address" --
and maybe even refer to that message in Documentation/kprobes.txt, in the
paragraph that starts "If the number of times a function is called does not
match..."

Just a thought.  May be overkill.
Comment 6 Ananth Mavinakayanahalli 2007-04-02 12:08:37 UTC
Patch posted to lkml (Ref: http://marc.info/?l=linux-kernel&m=117550571011439&w=2)

Jim, I just kept the simpler printk message.