Bug 3238 - deadlock when schedule and kfree are both probed by kretprobe
Summary: deadlock when schedule and kfree are both probed by kretprobe
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: kprobes
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: bibo,mao
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-21 08:37 UTC by bibo,mao
Modified: 2006-10-12 11:38 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
kretprobe deadlock patch (1.64 KB, patch)
2006-09-21 08:49 UTC, bibo,mao
kretprobe deadlock test case (943 bytes, application/octet-stream)
2006-09-21 08:54 UTC, bibo,mao
kretprobe spinlock deadlock patch (2.35 KB, application/octet-stream)
2006-09-25 07:40 UTC, bibo,mao
kretprobe spinlock deadlock test case (1.05 KB, application/octet-stream)
2006-09-25 07:44 UTC, bibo,mao
CodingStyle patch against 2.6.18-rc6-mm2 (3.36 KB, patch)
2006-09-25 09:54 UTC, Ananth Mavinakayanahalli

Description bibo,mao 2006-09-21 08:37:01 UTC
If the kernel functions schedule and kfree are both probed by kretprobes, the
system deadlocks.
Comment 1 bibo,mao 2006-09-21 08:49:00 UTC
Created attachment 1310 [details]
kretprobe deadlock patch

The patch has been tested on the IA32 and IA64 architectures; I will test it on
x86_64 next. I have no powerpc machine, so can anyone help verify it there?
Comment 2 bibo,mao 2006-09-21 08:54:14 UTC
Created attachment 1311 [details]
kretprobe deadlock test case

This attachment is the kretprobe deadlock test case. The bug can be reproduced
with the following steps:
  insmod kretprobe_schedule.ko
  insmod kretprobe_kfree.ko
## wait a few seconds
  rmmod kretprobe_schedule.ko
## wait a few seconds and the system will hang
Comment 3 Ananth Mavinakayanahalli 2006-09-22 06:31:03 UTC
Bibo, I am not able to recreate this problem on powerpc. But a quick look at
your patch suggests this could indeed be the reason for a possible deadlock.

If you intend to submit it upstream (I think you should, after testing it on a
couple more archs), please:
- Fix codingstyle issues in the patch (space between ")" and "{")
- Separate out the __kprobes tag for atomic_notifier_call_chain to a separate patch
Comment 4 bibo,mao 2006-09-22 13:42:33 UTC
Thanks for the good suggestions. I will fix the coding style and split the
patch into two parts. Since the Linux -mm tree supports kprobes on more
architectures, I will also refresh the patch against the -mm tree.
Comment 5 bibo,mao 2006-09-25 07:40:27 UTC
Created attachment 1321 [details]
kretprobe spinlock deadlock patch

The attachment is the new patch for the kretprobe spinlock deadlock, made
against the 2.6.18-mm1 tree. Following Ananth's suggestion, it is divided into
three parts: coding style cleanup, disallow kprobes_on_notifier_call_chain, and
the kretprobe spinlock deadlock patch.
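
For illustration only, here is a minimal userspace sketch of the kind of lock re-entry that could produce such a deadlock. It is not the attached patch, whose contents are not shown in this thread. It assumes that memory is freed while the lock guarding return-probe instances is held, so that a return probe on kfree re-takes the same lock. The names probe_lock, probed_free, release_under_lock and release_deferred are hypothetical, and a POSIX error-checking mutex stands in for the kernel spinlock so that the re-entry is reported as EDEADLK instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lock guarding return-probe instances. */
static pthread_mutex_t probe_lock;
static int probe_lock_ready;

/* Use an error-checking mutex so that a second lock attempt by the owner
 * returns EDEADLK instead of spinning forever like a kernel spinlock. */
static void probe_lock_init(void)
{
    pthread_mutexattr_t attr;

    if (probe_lock_ready)
        return;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&probe_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    probe_lock_ready = 1;
}

/* Stand-in for a probed kfree(): the (assumed) probe handler takes
 * probe_lock before the memory is released. Returns 0 or the lock error. */
static int probed_free(void *p)
{
    int err = pthread_mutex_lock(&probe_lock);

    if (err == 0)
        pthread_mutex_unlock(&probe_lock);
    free(p);
    return err;
}

/* Problem pattern: free an instance while still holding the lock. */
static int release_under_lock(void)
{
    void *inst = malloc(16);
    int err;

    probe_lock_init();
    pthread_mutex_lock(&probe_lock);
    err = probed_free(inst);          /* handler re-takes probe_lock */
    pthread_mutex_unlock(&probe_lock);
    return err;
}

/* Deferred pattern: only detach under the lock, free after unlocking. */
static int release_deferred(void)
{
    void *inst = malloc(16);
    void *to_free;

    probe_lock_init();
    pthread_mutex_lock(&probe_lock);
    to_free = inst;                   /* detach from shared state */
    pthread_mutex_unlock(&probe_lock);
    return probed_free(to_free);      /* lock is free again here */
}
```

In this sketch release_under_lock() returns EDEADLK, which corresponds to a hang with a real spinlock, while release_deferred() returns 0 because the free happens only after the lock has been dropped.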
Comment 6 bibo,mao 2006-09-25 07:44:51 UTC
Created attachment 1322 [details]
kretprobe spinlock deadlock test case
Comment 7 bibo,mao 2006-09-25 07:47:23 UTC
Ananth, would you like to review my patch and verify it on powerpc64?

thanks
bibo,mao
Comment 8 Ananth Mavinakayanahalli 2006-09-25 09:54:49 UTC
Created attachment 1323 [details]
CodingStyle patch against 2.6.18-rc6-mm2

Thanks Bibo for the patchset. I however have a more extensive CodingStyle
cleanup patch I did sometime back. Patch attached is against 2.6.18-rc6-mm2.
Please consider using this patch (feel free to include my signed-off, as
necessary).

I will try to test your patches, but we are having network trouble here
today. Will update with results later.
Comment 9 Ananth Mavinakayanahalli 2006-09-25 09:55:35 UTC
Jim, Bibo has a fix for the kfree retprobe deadlock. Request you to take a look.

Ananth
Comment 10 Ananth Mavinakayanahalli 2006-09-25 11:30:45 UTC
Bibo,

I have run some basic tests on ppc64 using the patchset - retprobes on
schedule() and kfree() - loading/unloading in different order, etc. Also
verified that atomic_notifier_call_chain() can't be probed.

Ananth
Comment 11 Ananth Mavinakayanahalli 2006-10-12 11:38:33 UTC
The fix from Bibo is now in 2.6.19-rc1.