Bug 2452

Summary: kretprobe spinlock recursive remove
Product: systemtap Reporter: bibo,mao <bibo.mao>
Component: kprobesAssignee: bibo,mao <bibo.mao>
Status: RESOLVED FIXED    
Severity: normal CC: jkenisto
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description bibo,mao 2006-03-13 06:23:27 UTC
In new linux kernel version, kretprobe in IA32 is implemented in
kretprobe_trampoline. And probepoint is removed from kretprobe_trampoline,
instead   trampoline_handler is called directly.
Currently if kretprobe hander hit one trap which causes another kretprobe, there
will be SPINLOCK recursive bug. This patch fixes this, and will skip trap during
kretprobe handler execution. 
And the test case is the same in
http://sourceware.org/bugzilla/show_bug.cgi?id=2071.

--- 2.6.16-rc6-mm1.org/arch/i386/kernel/kprobes.c	2006-03-13 12:25:15.000000000
+0800
+++ 2.6.16-rc6-mm1/arch/i386/kernel/kprobes.c	2006-03-13 11:38:26.000000000 +0800
@@ -389,9 +389,11 @@ fastcall void *__kprobes trampoline_hand
 			/* another task is sharing our hash bucket */
                         continue;
 
-		if (ri->rp && ri->rp->handler)
+		if (ri->rp && ri->rp->handler){
+			__get_cpu_var(current_kprobe) = &ri->rp->kp;
 			ri->rp->handler(ri, regs);
-
+			__get_cpu_var(current_kprobe) = NULL;
+		}
 		orig_ret_address = (unsigned long)ri->ret_addr;
 		recycle_rp_inst(ri);
Comment 1 Zhang Yanmin 2006-03-13 08:28:07 UTC
The patch looks like a hack and disables kprobe within the kret handler.

A thorough approach is to change kretprobe_inst table as a list per 
task_struct and delete the spin lock in trampoline_handler, then estimate if 
it's a reentrancy for current task. This approach has better scalability and  
also solve this issue.
Comment 2 Zhang Yanmin 2006-03-13 08:49:35 UTC
Another approach is to use a new per cpu data to record current in 
trampoline_handler and compare if it's a reentrancy of current task. It's 
simpler than the first approach.
Comment 3 Zhang Yanmin 2006-03-13 09:10:55 UTC
Sorry, the second approach has a problem. When trampline_handler is entered, 
it must execute the corresponding kret handler.
Comment 4 Jim Keniston 2006-03-16 21:52:49 UTC
My understanding is that Bibo has fixed this.  Please append an update, Bibo. 
Thanks.
Comment 5 bibo,mao 2006-03-21 03:24:47 UTC
Currently if kretprobe handler hit one trap which causes another kretprobe,
there will be SPINLOCK recursive bug. This patch fixes this, and will skip trap
during kretprobe handler execution. And this patch is based on 2.6.16-rc6-mm2.

--- arch/i386/kernel/kprobes.c.bak	2006-03-21 10:35:34.000000000 +0800
+++ arch/i386/kernel/kprobes.c	2006-03-21 10:37:44.000000000 +0800
@@ -390,8 +390,11 @@ fastcall void *__kprobes trampoline_hand
 			/* another task is sharing our hash bucket */
                         continue;
 
-		if (ri->rp && ri->rp->handler)
+		if (ri->rp && ri->rp->handler){
+			__get_cpu_var(current_kprobe) = &ri->rp->kp;
 			ri->rp->handler(ri, regs);
+			__get_cpu_var(current_kprobe) = NULL;
+		}
 
 		orig_ret_address = (unsigned long)ri->ret_addr;
 		recycle_rp_inst(ri);
Comment 6 Ananth Mavinakayanahalli 2006-10-12 11:41:43 UTC
Bibo's fix below is in the upstream kernel... closing.