On a pre-fc7-test vmware image, I sometimes see this kernel BUG:

kernel BUG at arch/i386/kernel/kprobes.c:452!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /module/uhci_hcd/sections/.text
Modules linked in: stap_ec0f35607d3d65b494254c869904aa15_693632(U) xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables x_tables hidp
[...]
EFLAGS: 00010046   (2.6.20-1.2986.fc7 #1)
EIP is at trampoline_handler+0xeb/0x137
eax: 00000000   ebx: 00000000   ecx: 00000000   edx: 00000000
esi: de14cac0   edi: c1a31ef8   ebp: c1a31ef0   esp: c1a31ed4
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Process migration/0 (pid: 2, ti=c1a31000 task=ed34e030 task.ti=c1a31000)
Stack: 00000096 00000000 00000000 00000000 00000000 de14cac0 00000000 cb9dde28
       c06161fa 00000000 00000033 de14cc70 de14cac0 00000000 cb9dde28 de14cac0
       0000007b 0000007b 000000d8 ffffffff c06161e4 00000060 00000082 c1a31f94
Call Trace:
 [<c04061ed>] show_trace_log_lvl+0x1a/0x2f
 [<c040629d>] show_stack_log_lvl+0x9b/0xa3
 [<c040645d>] show_registers+0x1b8/0x289
 [<c040665b>] die+0x12d/0x242
 [<c0615817>] do_trap+0x79/0x91
 [<c0406bfe>] do_invalid_op+0x97/0xa1
 [<c06155ec>] error_code+0x7c/0x84
 [<c06161fa>] kretprobe_trampoline+0x16/0x30

From this information, one cannot tell which kprobe/kretprobe might have been associated with the problem. It would be helpful if any such assertion-type messages also printed out information such as the kprobe* or kretprobe/instance addresses, as those can be mapped back to systemtap modules (with some effort).
This is related to the generic Linux kernel BUG handling mechanism and as such is not directly related to kprobes. While it is a good idea to have information about which particular kprobe caused the problem, IMO it's not something we can fix from the kprobes layer.
(In reply to comment #1)
> This is related to the generic Linux Kernel BUG handling mechanism and as such
> is not directly related to Kprobes. While it is a good idea to have information
> about which particular kprobe caused the problem, IMO, its not something we can
> fix from the kprobes layer.

Probably Frank just needs a custom printk message before the BUG_ON is hit?

BTW, "esi" holds the pointer to the kretprobe_instance. I am not sure whether this info alone is useful.
Created attachment 1659
Printk details of buggy retprobe

Is this what you had in mind, Frank? Compile tested only.
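For readers following the thread, here is a minimal userspace sketch of the idea being discussed: report which kretprobe was being processed before the assertion fires. It is not the attached patch. The mock_* structures and describe_bad_return() are hypothetical stand-ins for the kernel types, the message wording is illustrative, and the failure condition (the recovered return address is null or is still the trampoline itself) is an assumption about what the assertion guards against.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mock_kprobe {
    void *addr;                     /* probed address */
};

struct mock_kretprobe {
    struct mock_kprobe kp;          /* underlying kprobe */
};

struct mock_kretprobe_instance {
    struct mock_kretprobe *rp;      /* owning kretprobe */
};

/*
 * Assumed check: the return address recovered by the trampoline handler
 * is unusable if it is null or still points at the trampoline.
 * Returns 1 and fills buf with a diagnostic in that case, 0 otherwise.
 * In the kernel, the diagnostic would be a printk() followed by BUG().
 */
static int describe_bad_return(const struct mock_kretprobe_instance *ri,
                               unsigned long orig_ret_address,
                               unsigned long trampoline_address,
                               char *buf, size_t len)
{
    if (orig_ret_address != 0 && orig_ret_address != trampoline_address)
        return 0;

    /* Print the kretprobe pointer and the probed address so the oops
     * can be mapped back to the module that registered the probe. */
    snprintf(buf, len, "kretprobe BUG: processing kretprobe %p @ %p\n",
             (void *)ri->rp, ri->rp->kp.addr);
    return 1;
}
```

With both addresses in the log, the failing probe can be matched against the probes registered by a given systemtap module, which is the mapping requested in the original report.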
Yes, that would be helpful.
As long as you're tweaking that piece of code, you could add a little message -- e.g., "Weird return from kretprobed function; can't find real return address" -- and maybe even refer to that message in Documentation/kprobes.txt, in the paragraph that starts "If the number of times a function is called does not match..." Just a thought. May be overkill.
Patch posted to lkml (Ref: http://marc.info/?l=linux-kernel&m=117550571011439&w=2).

Jim, I just kept the simpler printk message.
Patch now in Linus' tree. Ref: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0f95b7fc839bc3272b1bf2325d8748a649bd3534