If the kernel functions schedule() and kfree() are both probed by kretprobes, a deadlock occurs.
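The deadlock described above can be pictured with a small user-space model. This is only a sketch under assumptions that this report does not spell out: that the kretprobe code holds a non-recursive spinlock while it frees memory, and that the probe on kfree() needs the same lock. All names here (model_trylock, probed_free, free_under_lock, free_after_unlock) are hypothetical and are not kernel APIs. The second variant shows one common way to avoid this kind of re-entry, deferring the free until the lock is dropped; it is not meant as a description of the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in for a non-recursive kretprobe spinlock. */
static bool lock_held;
/* Number of times the lock was requested while it was already held. */
static int reentries;

/* Returns true if the lock was taken, false if it was already held. */
static bool model_trylock(void)
{
    if (lock_held)
        return false;
    lock_held = true;
    return true;
}

static void model_unlock(void)
{
    assert(lock_held);
    lock_held = false;
}

/* Models a probed kfree(): the probe path needs the lock before freeing. */
static void probed_free(void *p)
{
    if (model_trylock()) {
        /* Probe bookkeeping would run here. */
        model_unlock();
    } else {
        /* A real spinlock would spin forever at this point. */
        reentries++;
    }
    free(p);
}

/* Assumed buggy shape: objects are freed while the lock is held. */
static int free_under_lock(void **items, int n)
{
    reentries = 0;
    if (!model_trylock())
        return -1;
    for (int i = 0; i < n; i++) {
        probed_free(items[i]);
        items[i] = NULL;
    }
    model_unlock();
    return reentries;
}

/* Assumed fix shape: detach under the lock, free after dropping it. */
static int free_after_unlock(void **items, int n)
{
    void *pending[MAX_ITEMS];
    int count = 0;

    reentries = 0;
    if (!model_trylock())
        return -1;
    for (int i = 0; i < n && count < MAX_ITEMS; i++) {
        pending[count++] = items[i];
        items[i] = NULL;
    }
    model_unlock();
    for (int i = 0; i < count; i++)
        probed_free(pending[i]);
    return reentries;
}
```

In this model, free_under_lock() returns the number of re-entrant lock attempts (one per freed object), each of which would hang a real system, while free_after_unlock() returns zero because the lock is no longer held when the probed free runs.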
Created attachment 1310
kretprobe deadlock patch

The patch has been tested on the IA32 and IA64 architectures, and I will test it on x86_64. I have no powerpc machine; can anyone help verify it there?
Created attachment 1311
kretprobe deadlock test case

This attachment is a test case for the kretprobe deadlock. The bug can be reproduced with the following steps:

    insmod kretprobe_schedule.ko
    insmod kretprobe_kfree.ko
    ## wait a few seconds
    rmmod kretprobe_schedule.ko
    ## wait a few seconds and the system will hang
Bibo, I am not able to recreate this problem on powerpc. But a quick look at your patch suggests this could indeed be the reason for a possible deadlock. If you intend to submit it upstream (I think you should, after testing it on a couple more archs), please:
- Fix the coding style issues in the patch (space between ")" and "{")
- Move the __kprobes tag for atomic_notifier_call_chain into a separate patch
Thanks for your good suggestions. I will fix the coding style and divide the patch into two parts. Since the Linux -mm tree includes kprobes support for more architectures, I will also refresh the patch against the -mm tree.
Created attachment 1321
kretprobe spinlock deadlock patch

This attachment is a new patch for the kretprobe spinlock deadlock, made against the 2.6.18-mm1 tree. Following Ananth's suggestion, the patch is divided into three parts: coding style cleanup, disallow kprobes_on_notifier_call_chain, and the kretprobe spinlock deadlock patch.
Created attachment 1322
kretprobe spinlock deadlock test case
Ananth, could you review my patch and verify it on powerpc64?

Thanks,
bibo,mao
Created attachment 1323
CodingStyle patch against 2.6.18-rc6-mm2

Thanks, Bibo, for the patchset. However, I have a more extensive CodingStyle cleanup patch that I did some time back. The attached patch is against 2.6.18-rc6-mm2. Please consider using it (feel free to include my sign-off, as necessary). I will try to test your patches, but we are having network trouble here today. I will update with results later.
Jim, Bibo has a fix for the kfree retprobe deadlock. Please take a look.

Ananth
Bibo, I have run some basic tests on ppc64 using the patchset: retprobes on schedule() and kfree(), loading and unloading in different orders, etc. I also verified that atomic_notifier_call_chain() can't be probed.

Ananth
Fix from Bibo now in 2.6.19-rc1