I have two bare-bones C modules: the first places a kprobe, jprobe, and kretprobe on the "schedule" function; the second does the same for "do_IRQ". The handlers all simply increment counters (no printk's). Either module can be used on its own without problems; however, running both modules together eventually hangs the system. The results are sporadic: sometimes significant execution time is required before a problem appears, other times the system freezes immediately.
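For reference (the attachments themselves are not reproduced inline), a counter-only module of this shape looks roughly like the sketch below. Note that the .symbol_name convenience field is from later kprobes versions; on the 2.6.9-based kernel used here the probe address would have been filled in explicitly, and the jprobe is omitted for brevity.

        #include <linux/module.h>
        #include <linux/kernel.h>
        #include <linux/kprobes.h>

        static unsigned long entry_count, return_count;

        /* pre_handler: runs in the breakpoint trap, so keep it trivial -- no printk */
        static int sched_pre(struct kprobe *p, struct pt_regs *regs)
        {
                entry_count++;
                return 0;
        }

        /* kretprobe handler: runs when the probed function returns */
        static int sched_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
        {
                return_count++;
                return 0;
        }

        static struct kprobe kp = {
                .symbol_name = "schedule",      /* later kernels; older ones set .addr directly */
                .pre_handler = sched_pre,
        };

        static struct kretprobe krp = {
                .kp.symbol_name = "schedule",
                .handler        = sched_ret,
                .maxactive      = 20,
        };

        static int __init mon_sched_init(void)
        {
                int ret;

                ret = register_kprobe(&kp);
                if (ret < 0)
                        return ret;
                ret = register_kretprobe(&krp);
                if (ret < 0)
                        unregister_kprobe(&kp);
                return ret;
        }

        static void __exit mon_sched_exit(void)
        {
                unregister_kretprobe(&krp);
                unregister_kprobe(&kp);
                printk(KERN_INFO "schedule: %lu entries, %lu returns\n",
                       entry_count, return_count);
        }

        module_init(mon_sched_init);
        module_exit(mon_sched_exit);
        MODULE_LICENSE("GPL");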
Created attachment 606 [details] This is the module that monitors kernel calls to schedule.
Created attachment 607 [details] This is the module that monitors kernel calls to do_IRQ.
Created attachment 608 [details] this Makefile will build them...
Here is some information about the machine I used to test these modules:

Linux elm3b90 2.6.9-15.EL #1 SMP Fri Aug 5 18:57:49 EDT 2005 ppc64 ppc64 ppc64 GNU/Linux
Here is the xmon log from one recreate:

cpu 0x1: Vector: 700 (Program Check) at [c0000000009d73e0]
    pc: c000000000013c7c: .do_IRQ+0x0/0x100
    lr: c00000000000ae34: HardwareInterrupt_entry+0x8/0x54
    sp: c0000000009d7660
   msr: 8000000000029032
  current = 0xc0000000146468b0
  paca    = 0xc0000000003eec00
    pid   = 0, comm = swapper
enter ? for help
1:mon> t
[link register   ] c00000000000ae34 HardwareInterrupt_entry+0x8/0x54
[c0000000009d7660] c000000000047dac .kprobe_exceptions_notify+0x3bc/0x4b0 (unreliable)
--- Exception: 501 (Hardware Interrupt) at c00000000008da24 .get_kprobe+0x58/0x7c
[link register   ] c000000000047b88 .kprobe_exceptions_notify+0x198/0x4b0
[c0000000009d7950] c000000000047b7c .kprobe_exceptions_notify+0x18c/0x4b0 (unreliable)
[c0000000009d79f0] c000000000070984 .notifier_call_chain+0x68/0x98
[c0000000009d7a80] c0000000000127b0 .ProgramCheckException+0x80/0x17c
[c0000000009d7b20] c00000000000b048 ProgramCheck_common+0xc8/0x100
--- Exception: 700 (Program Check) at c0000000002f588c .schedule+0x0/0xaa8
[c0000000009d7e10] c00000000001479c .shared_idle+0xa8/0xe8 (unreliable)
[c0000000009d7e90] c000000000014854 .cpu_idle+0x38/0x50
[c0000000009d7f00] c00000000003f1fc .start_secondary+0x15c/0x178
[c0000000009d7f90] c00000000000bd84 .enable_64b_mode+0x0/0x28
1:mon> r
R00 = 0000000000000108   R16 = 0000000000000000
R01 = c0000000009d7660   R17 = 0000000000000000
R02 = c0000000004e8550   R18 = 0000000000000000
R03 = c0000000009d76d0   R19 = 0000000000000000
R04 = 0000000000000004   R20 = 0000000000000000
R05 = c0000000009d7af0   R21 = 0000000000000000
R06 = 0000000000000006   R22 = 0000000000000000
R07 = 0000000000000000   R23 = 0000000000000001
R08 = 0000000000000014   R24 = 0000000000000001
R09 = 0000000000000501   R25 = 0000000000000000
R10 = 0000000000000000   R26 = 0000000000000000
R11 = 7265677368657265   R27 = 0000000000000000
R12 = 8000000000009032   R28 = 0000000000000000
R13 = c0000000003eec00   R29 = c0000000009d7b90
R14 = 0000000000000000   R30 = c00000000043a420
R15 = 0000000007a8dcb0   R31 = c0000000002f588c
pc  = c000000000013c7c   .do_IRQ+0x0/0x100
lr  = c00000000000ae34   HardwareInterrupt_entry+0x8/0x54
msr = 8000000000029032   cr  = 28000088
ctr = c0000000000479f0   xer = 0000000000000004
trap = 700
1:mon>

The system is a two-processor Power5 LPAR. The problem occurs only if _both_ the schedule() and do_IRQ() probes are active at the same time. At this time, I am not able to explain why we got into xmon even though we have a kprobe notifier registered before the xmon call. Investigating...
This problem exists on x86_64 as well.
-------------------------------------------------------------------------------
[krstaffo@elm3b30 ~]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : Opteron MP w/ 1MB
stepping        : 0
cpu MHz         : 1597.929
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 3145.72
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : Opteron MP w/ 1MB
stepping        : 0
cpu MHz         : 1597.929
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 3186.68
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp
-------------------------------------------------------------------------------
I am unfamiliar with the xmon debugging output.
The crash is due to a hardware interrupt occurring between the time the pre_handler for the schedule probe is run (kprobe_status = KPROBE_HIT_SS) and the time the original instruction is single-stepped (kprobe_status = KPROBE_HIT_SSDONE). In this case, we ignore the do_IRQ() probe because of the hardware interrupt - snippet from kprobe_handler:

        p = get_kprobe(addr);
        if (p) {
                if (kprobe_status == KPROBE_HIT_SS) {
                        regs->msr &= ~MSR_SE;
                        regs->msr |= kprobe_saved_msr;
                        unlock_kprobes();
                        goto no_kprobe;
                }

It looks like the kprobes code is not geared to handle asynchronous interrupts, and this can probably be triggered on any arch that doesn't disable local irqs while processing the probe.
I get the following xmon trace when the above modules are inserted on a power4 LPAR. It looks like the hardware interrupt occurred while within kretprobe_trampoline().

llm16.in.ibm.com login:
cpu 0x3: Vector: 700 (Program Check) at [c000000004293880]
    pc: c00000000000f57c: .do_IRQ+0x0/0xe4
    lr: c00000000000ae34: hardware_interrupt_entry+0x8/0x54
    sp: c000000004293b00
   msr: 9000000000029032
  current = 0xc00000000ffc0030
  paca    = 0xc0000000005d0400
    pid   = 0, comm = swapper
enter ? for help
3:mon> t
[link register   ] c00000000000ae34 hardware_interrupt_entry+0x8/0x54
[c000000004293b00] c00000000ffc0330 (unreliable)
--- Exception: 501 (Hardware Interrupt) at d000000000012020
[c000000004293df0] c00000000003a3bc kretprobe_trampoline+0x0/0x8 (unreliable)
[c000000004293e90] c00000000000fa7c .cpu_idle+0x4c/0x60
[c000000004293f00] c000000000031890 .start_secondary+0x108/0x144
[c000000004293f90] c00000000000bd28 .enable_64b_mode+0x0/0x28
3:mon> r
R00 = 0000000000000001   R16 = 0000000000000000
R01 = c000000004293b00   R17 = 0000000000000000
R02 = c000000000727150   R18 = 0000000000000000
R03 = c000000004293b70   R19 = 0000000000000000
R04 = c00000000ffc0330   R20 = 0000000000000000
R05 = c0000000fc323ea0   R21 = 0000000000c00000
R06 = 0000000022ff9f82   R22 = 0000000000000000
R07 = c00000000001133c   R23 = 0000000000000001
R08 = c000000004290080   R24 = 0000000000000003
R09 = 0000000000000501   R25 = 0000000000000190
R10 = 0000000000000000   R26 = 0000000000000568
R11 = 7265677368657265   R27 = 0000000000000010
R12 = 9000000000009432   R28 = 0000000000000008
R13 = c0000000005d0400   R29 = c000000004290000
R14 = 0000000000000000   R30 = c000000004290000
R15 = 0000000000000000   R31 = c000000004290080
pc  = c00000000000f57c   .do_IRQ+0x0/0xe4
lr  = c00000000000ae34   hardware_interrupt_entry+0x8/0x54
msr = 9000000000029032   cr  = 28ff9f88
ctr = c0000000000c23e4   xer = 000000000000ff7f
trap = 700
Reproducible on IA64 too, and the problem happens _only_ when both modules are loaded. On IA64 the system freezes within a few seconds of the second module being loaded. Preliminary debugging showed that all CPUs are stuck in ia64_spinlock_contention. The test configuration was a Tiger4 with 4 CPUs running a RHEL4 U2 Beta kernel.
My understanding is that we run exception handlers with h/w interrupts enabled and hence this problem. However, I am not sure we can have a "completely sane" solution here, as there will still be a window (switching from task to exception context) in which a hardware interrupt can occur, where things can (and probably will) go haywire. Asynchronous non-software events are the culprit :-)
(In reply to comment #10)
> My understanding is that we run exception handlers with h/w interrupts enabled
> and hence this problem.

Yup, I agree with you. And with the current kprobes design, I guess it is very dangerous to insert probes on both ISR routines and non-ISR routines together. Having probes on just one type, however, is fine. Here is what happens if we have kprobes on both ISR and non-ISR routines together:

1) The non-ISR routine's probe gets hit and we take lock_kprobes().
2) Before we release the lock taken above, the hardware interrupt fires and we enter the exception handler again (since we have a probe on the ISR routine); here the CPU can never get the lock and hence we have a deadlock (which is what I am seeing on IA64).

> However, I am not sure if we can have a "completely
> sane" solution here as there will still be a window (switching from task to
> exception context) when a hardware interrupt can occur, where things can (and
> probably will) go haywire. Asynchronous non-software events are the culprit :-)

We need to redesign kprobes slightly so that we do not hold lock_kprobes() during exception handling. I guess it can be done. Thoughts?
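For readers without the source handy, the global locking being discussed here looked roughly like the sketch below in the 2.6.9-era kernel/kprobes.c (a paraphrase for illustration, not a verbatim quote); the next few comments argue about exactly where in this path the recursion goes wrong.

        static spinlock_t kprobe_lock = SPIN_LOCK_UNLOCKED;
        unsigned int kprobe_cpu = NR_CPUS;

        /* One global spinlock serializes all probe handling; the owning CPU
         * is recorded so the arch handlers can detect a recursive hit. */
        void lock_kprobes(void)
        {
                spin_lock(&kprobe_lock);
                kprobe_cpu = smp_processor_id();
        }

        void unlock_kprobes(void)
        {
                kprobe_cpu = NR_CPUS;
                spin_unlock(&kprobe_lock);
        }

        /* Used by the arch handlers to ask "is this CPU already in kprobes?" */
        static inline int kprobe_running(void)
        {
                return kprobe_cpu == smp_processor_id();
        }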
Well Anil, it's not a deadlock and it's not an issue with lock_kprobes(). The problem is that the h/w interrupt occurs in the window between the time you set KPROBE_HIT_SS in the pre_handler path and the time you actually single-step. The case you describe is fine, since the hardware interrupt entry isn't through the exception handlers (I presume it's the same on ia64), and hence you will spin till you get the lock. I can say this with certainty because, on ppc64 at least, we have a kernel debugger (xmon) shipped with upstream kernels. What I am seeing is that we enter the debugger (whose hook is after the kprobes call) due to a legitimate breakpoint (the kprobe on do_IRQ()), but it occurs in the window I described above, during which time we don't _expect_ any other kprobe hits. It's an atomic, software-only sequence (set the trap/step flag and single-step once) that is disrupted by the occurrence of a hardware interrupt, which in turn causes another trap. In my opinion, this issue needs handling at a higher (more generic) level than kprobes => we'll need to figure out a way to safely disable interrupts before _any_ exception code is run (arch/xxx/kernel/traps.c). That is easier said than done :) In any case, lock_kprobes() is going away with the locking change patches. I am able to reproduce the problem even without lock_kprobes().
Created attachment 624 [details] possible fix for issue with probing both do_irq and schedule Ok, here is a possible solution to the issue. I can't think of a bug that could be introduced by this change, but Prasanna will know better. However, I have been able to run the two probes (schedule and do_irq) simultaneously with this patch applied. Prasanna, Anil, Jim, what are your thoughts?
(In reply to comment #12)
> Well Anil, its not a deadlock and its not an issue with lock_kprobes(). The
> problem is the h/w interrupt occurs in the window between the time you set
> KPROBE_HIT_SS in the pre_handler and the time you actually single-step.

Ananth,
The issue, at least the one I am seeing, is a deadlock due to lock_kprobes() and a h/w interrupt. Say we hit a probe on a non-ISR routine, sys_mkdir(), and in the exception handler we take lock_kprobes(); before we get to release this lock, a h/w interrupt occurs on the very same CPU on which the lock was taken. As a result of the h/w interrupt, the ISR routine (say do_ISR()) gets executed. If we have a probe on the do_ISR() routine, then we enter the exception handler a second time (recursively), where we again try to take the same lock_kprobes() and spin there forever, since the lock is already taken by this very same CPU.

> In my opinion, this issue needs handling at a higher (more generic) level than
> kprobes => we'll need to figure a way to safely disable interrupts before _any_
> exception code is run (arch/xxx/kernel/traps.c). That is easier said than done :)

I don't disagree with you. That is the ideal fix for this bug :-)

> In any case, lock_kprobes() is going away with the locking change patches. I am
> able to reproduce the problem even without lock_kprobes()

True; this is because when we recurse due to a h/w interrupt occurring while we are in the middle of the kprobes exception, we might overwrite the per-CPU info.
Ananth,

> if (kprobe_status == KPROBE_HIT_SS) {
>         regs->msr &= ~MSR_SE;
>         regs->msr |= kprobe_saved_msr;
>         unlock_kprobes();
>         goto no_kprobe;
> }

AFAIK the above code was added to allow probes on "int3/trap" instructions and let them single-step inline. As a result of single-stepping the original instruction (i.e. the int3/trap), control was given back to the others registered on the exception notifier list. Moreover, on ppc64 the kprobe handlers are executed without disabling local interrupts... Can't we execute the kprobe handlers with local interrupts disabled? Are there any issues involved in doing that?

I have tested your patch on a ppc64 LPAR box; the system seems to hang with your patch on a 2.6.13-rc7 kernel. Below is the trace after I do a soft reset on the hung system.

cpu 0x0: Vector: 100 (System Reset) at [c0000000005afac0]
    pc: c00000000000f8d8: .default_idle+0x88/0x154
    lr: c00000000003a3bc: kretprobe_trampoline+0x0/0x8
    sp: c0000000005afd40
   msr: 9000000000009032
  current = 0xc0000000005e3ea0
  paca    = 0xc0000000005cec00
    pid   = 0, comm = swapper
cenntteerr ?? ff0V0ec (toSrys:t 10:00mo n(S> ystem Reset) at [c00000000428fb70]
    pc: c00000000000f8f4: .default_idle+0xa4/0x154
    lr: c00000000003a3bc: kretprobe_trampoline+0x0/0x8
    sp: c00000000428fdf0
   msr: 9000000000009032
  current = 0xc00000000ffc07a0
  paca    = 0xc0000000005cfc00
    pid   = 0, comm = swapper
cpu 0x3: Vector: 100 (System Reset) at [c000000004293b70]
    pc: c00000000000f8ec: .default_idle+0x9c/0x154
    lr: c00000000003a3bc: kretprobe_trampoline+0x0/0x8
    sp: c000000004293df0
   msr: 9000000000009032
  current = 0xc00000000ffc0030
  paca    = 0xc0000000005d0400
    pid   = 0, comm = swapper
cpu 0x1: Vector: 100 (System Reset) at [c00000000428bb70]
    pc: c00000000000f8f4: .default_idle+0xa4/0x154
    lr: c00000000003a3bc: kretprobe_trampoline+0x0/0x8
    sp: c00000000428bdf0
   msr: 9000000000009032
  current = 0xc00000000ffc4030
  paca    = 0xc0000000005cf400
    pid   = 0, comm = swapper
0:mon> t
[c0000000005afd40] c00000000003a3bc kretprobe_trampoline+0x0/0x8 (unreliable)
[c0000000005afde0] c00000000000fa7c .cpu_idle+0x4c/0x60
[c0000000005afe50] c00000000000c03c .rest_init+0x3c/0x54
[c0000000005afed0] c000000000555888 .start_kernel+0x294/0x310
[c0000000005aff90] c00000000000bfb4 .__setup_cpu_power3+0x0/0x4
0:mon> r
R00 = 0000000000000010   R16 = 0000000000000000
R01 = c0000000005afd40   R17 = 0000000000000000
R02 = c000000000727150   R18 = 0000000003a10000
R03 = c00000010381b240   R19 = 000000000291fa74
R04 = c0000000005e41a0   R20 = 0000000003f90340
R05 = 0000000000000004   R21 = 0000000000000038
R06 = 0000000022000082   R22 = bffffffffc5f0000
R07 = c00000000001133c   R23 = 0000000200000000
R08 = c0000000005ac080   R24 = c0000000005cec00
R09 = 0000000000000000   R25 = c000000000725208
R10 = c0000000005ac000   R26 = c0000000005a8b88
R11 = c0000000007288e8   R27 = 0000000000000010
R12 = 0000000048000028   R28 = 0000000000000008
R13 = c0000000005cec00   R29 = c0000000005ac000
R14 = 0000000000000000   R30 = c0000000005ac000
R15 = 0000000000000000   R31 = c0000000005ac080
pc  = c00000000000f8d8   .default_idle+0x88/0x154
lr  = c00000000003a3bc   kretprobe_trampoline+0x0/0x8
msr = 9000000000009032   cr  = 22000088
ctr = c0000000002257fc   xer = 000000002000ff7f
trap = 100
0:mon>

Thanks,
Prasanna
Ok, I now see the issue with the patch. However, wouldn't it be better to have a tighter if () condition than just the kprobe_status check? Something along the lines of:

        if ((kprobe_status == KPROBE_HIT_SS) &&
            (p->opcode == BREAKPOINT_INSTRUCTION))
                ....

Also, except for i386, all the other archs run the exception handlers with interrupts enabled, so this isn't an issue just with ppc64. I believe it's too big a penalty to disable interrupts just to handle a unique case such as the one causing the current problem. More than anything, I think disabling interrupts is a performance hit (I could see nmissed = 13 with the testcase running for just about 15 seconds). Also, I am not sure upstream maintainers would be OK with such a major change, given that we see this problem on ia64 and x86_64 too. I will now try to reproduce the problem with retprobes - the xmon trace above is clearly due to a system reset and doesn't look related to the current problem. We should try to resolve this issue within the kprobes framework as far as possible, since the overhead of disabling interrupts during exception handling would be unnecessarily high even during normal operation.
if ((kprobe_status == KPROBE_HIT_SS) && (p->opcode == BREAKPOINT_INSTRUCTION))

This looks OK, but the system hang still does not disappear on a Power4 LPAR.

On the i386 architecture, interrupts are disabled before you enter kprobe_handler(). I am not sure disabling interrupts on the other architectures would be a big issue if it really solves a major problem; please check that. I tried disabling interrupts on ppc64, which seems to work, of course with the penalty of missing a few probes during that period.

Thanks,
Prasanna
I applied Ananth's x86_64 kprobes patch to kernel 2.6.13-rc5. It survived a kernel build with the schedule and do_IRQ calls instrumented. Previously this scenario caused a system crash.
(In reply to comment #18)
> I applied Ananth's x86_64 kprobes patch to kernel 2.6.13-rc5. It survived a
> kernel build with the schedule and do_IRQ calls instrumented. Previously this
> scenario caused a system crash.

Ananth's patch has not solved the system freeze completely. I am still seeing the system freeze with Ananth's patch on IA64. After much deeper debugging, here is what I found to be causing the deadlock, due to the hardware interrupt and the way we are trying to hold the lock. The code flow below assumes that a probe has been registered on both schedule() and do_IRQ().

---code flow snip------

schedule() probe hit
  kprobe_exceptions_notify()
    pre_kprobes_handler()
      lock_kprobes()
      {
          spin_lock(&kprobe_lock);

          >>>>>>>> LOCAL INTERRUPT HAPPENS HERE >>>>>>
          >>>>>>>> processor starts executing do_IRQ()

          kprobe_cpu = smp_processor_id();
      }

do_IRQ() probe hit
  kprobe_exceptions_notify()
    pre_kprobes_handler()
      kprobe_running()
      {
          return kprobe_cpu == smp_processor_id();
          /* Returns FALSE here since the LOCAL INTERRUPT happened
           * before kprobe_cpu got set */
      }
      lock_kprobes()
      {
          spin_lock(&kprobe_lock);   /****** DEADLOCK HERE *******/
          kprobe_cpu = smp_processor_id();
      }

---code flow snip end------

Since kprobe_running() failed, we try to take the lock again and end up in a deadlock. The only way to address this deadlock is to change the lock to an irqsave lock, but that would cause a huge performance penalty, and since U2 is so close to code freeze, I suggest we document this as a known issue, i.e. that probes cannot be inserted on ISR routines. Comments?
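For reference, the irqsave variant being ruled out here would look roughly like the sketch below (names follow the global-lock design shown in the flow above; the invasive part is plumbing the saved flags through every caller of these helpers on every architecture):

        /* Sketch only: the rejected spin_lock_irqsave() approach. */
        static spinlock_t kprobe_lock = SPIN_LOCK_UNLOCKED;
        unsigned int kprobe_cpu = NR_CPUS;

        void lock_kprobes(unsigned long *flags)
        {
                /* With local interrupts off, a probed ISR can no longer fire
                 * between taking the lock and recording the owning CPU. */
                spin_lock_irqsave(&kprobe_lock, *flags);
                kprobe_cpu = smp_processor_id();
        }

        void unlock_kprobes(unsigned long *flags)
        {
                kprobe_cpu = NR_CPUS;
                spin_unlock_irqrestore(&kprobe_lock, *flags);
        }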
As Anil has now characterized, there are multiple ways this problem can manifest itself: I see a trap on ppc64, while it's a hang on ia64. I tend to agree with Anil that it's a little late in the game to introduce the irqsave approach; the changes required to share the irq flags across handlers are themselves quite involved. Given that this issue will be moot when the per-CPU changes are introduced, I vote for documenting this "feature" - probing irq handlers in conjunction with task-state routines is currently vulnerable.
Created attachment 628 [details] kprobe_cpu is updated while interrupts are disabled

Made a quick patch wherein kprobe_cpu is now updated while interrupts are disabled. This patch seems to work on IA64. Can you try it on ppc64?

Thanks,
-Anil
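The attachment itself isn't pasted into this thread, so the following is only a guess at the shape of the change: close the window shown in the flow above by making "lock taken" and "owner recorded" atomic with respect to local interrupts, so that a probed ISR firing on this CPU sees kprobe_running() return true and takes the recursion path instead of spinning on kprobe_lock.

        static inline void lock_kprobes(void)
        {
                unsigned long flags;

                /* Sketch, not the actual attachment 628: interrupts are off
                 * only across the lock/owner update, not the whole handler. */
                local_irq_save(flags);
                spin_lock(&kprobe_lock);
                kprobe_cpu = smp_processor_id();
                local_irq_restore(flags);
        }

        static inline void unlock_kprobes(void)
        {
                unsigned long flags;

                local_irq_save(flags);
                kprobe_cpu = NR_CPUS;
                spin_unlock(&kprobe_lock);
                local_irq_restore(flags);
        }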
Created attachment 629 [details] another attempt to fix the irq-sched issue

Here is the updated patch to handle the irq-sched issue as I see it on ppc64. This, in conjunction with Anil's fix above, allows my ppc64 LPAR to run without any issues. Anil, request you to make the changes for ia64 as appropriate.

Ananth
Hello,
I applied fix_1229.patch and x86_64 kp-sched-irq.patch to kernel 2.6.13-rc6. The machine survived 3 (concurrent) kernel builds with do_IRQ and schedule instrumented. Previously this scenario would freeze the system.

[root@elm3b30 kernels] uname -a
Linux elm3b30 2.6.13-rc6 #1 SMP Mon Aug 29 15:07:42 PDT 2005 x86_64 x86_64 x86_64 GNU/Linux
[root@elm3b30 kernels]# cat /etc/*-release
Red Hat Enterprise Linux ES release 4 (Nahant Update 2)
Combination of Anil's patch and Ananth's patch fixes the problem on my power4 lpar box. -Prasanna
Ananth, Anil,
The kernel crashed after running for almost 2 hours. The trace shows that a hardware interrupt occurred while in kretprobe_trampoline(), even with both fixes applied on the power4 LPAR box.

Thanks,
Prasanna

cpu 0x0: Vector: 700 (Program Check) at [c00000000474f510]
    pc: c00000000000f57c: .do_IRQ+0x0/0xe4
    lr: c00000000000ae34: hardware_interrupt_entry+0x8/0x54
    sp: c00000000474f790
   msr: 9000000000029032
  current = 0xc0000000ff97d7a0
  paca    = 0xc0000000005cec00
    pid   = 3694, comm = nifd
enter ? for help
0:mon> bt
type address
0:mon> t
[link register   ] c00000000000ae34 hardware_interrupt_entry+0x8/0x54
[c00000000474f790] 9000000000009032 (unreliable)
--- Exception: 501 (Hardware Interrupt) at d000000000012004
[c00000000474fa80] c00000000003a3bc kretprobe_trampoline+0x0/0x8 (unreliable)
[c00000000474fb60] c0000000000c27f0 .do_select+0x278/0x4c4
[c00000000474fcb0] c0000000000e1344 .compat_sys_select+0x390/0x690
[c00000000474fdc0] c000000000019770 .ppc32_select+0x14/0x28
[c00000000474fe30] c00000000000d600 syscall_exit+0x0/0x18
--- Exception: c01 (System Call) at 000000000fe29470
SP (ffd5b910) is in userspace
0:mon> r
R00 = c0000000004a37c4   R16 = 0000000000000005
R01 = c00000000474f790   R17 = c0000000046262e0
R02 = c000000000727150   R18 = c0000000046262e8
R03 = c00000000474f800   R19 = c0000000046262f0
R04 = 9000000000009032   R20 = 0000000000000000
R05 = c0000000000c27f0   R21 = c0000000046262c8
R06 = 0000000000000000   R22 = c0000000046262d0
R07 = c00000010381c580   R23 = c0000000046262d8
R08 = 0000000100691cb7   R24 = 0000000000000000
R09 = 0000000000000501   R25 = 0000000000000010
R10 = 0000000000000000   R26 = 0000000000000000
R11 = 7265677368657265   R27 = c00000000474faf0
R12 = 9000000000009432   R28 = c0000000005b4000
R13 = c0000000005cec00   R29 = 000000010069209e
R14 = 0000000000000000   R30 = c00000000060aa90
R15 = 0000000000000010   R31 = 00000000000003e8
pc  = c00000000000f57c   .do_IRQ+0x0/0xe4
lr  = c00000000000ae34   hardware_interrupt_entry+0x8/0x54
msr = 9000000000029032   cr  = 28444428
ctr = c0000000000c243c   xer = 0000000020000000
trap = 700
0:mon>
Hmm, I left the system to run overnight tests in a tight loop and it was still continuing without failures when I came in today.
Would a call to in_sched_function() come in handy someplace?
Nope. Theoretically, you could trigger this bug with a probe on do_IRQ in conjunction with a probe on any other task-state function. It's totally dependent on when the hardware interrupt occurs.
Prasanna, What is the testcase you are using to trigger the issue on a power4? Are they just the mon_* files or something more? Ananth
Created attachment 630 [details] changing preempt_enable to preempt_enable_no_resched

Prasanna,
On your Power4 box, can you apply yet another patch, along with the other *two* patches, to see if the crash you are seeing goes away? I hope the attached patch compiles for ppc64, as I have not tested it.

Thanks,
Anil
Anil,
Yes, using preempt_enable_no_resched() instead of preempt_enable() avoids the system crash. My Power4 LPAR system has now been running for 4 hours.

Thanks,
Prasanna
Created attachment 631 [details] use preempt_enable_no_resched() instead of preempt_enable() on ppc64

Anil,
Just took a look at your patch. We introduced the preempt_(dis/en)able pair in kprobe_exceptions_notify so that we avoid calling smp_processor_id() with preemption on, which would otherwise produce a flood of warnings if the kernel is compiled with DEBUG_SPINLOCK and DEBUG_SPINLOCK_SLEEP. So, here is a patch that addresses the issue with fewer changes, while also taking care of the above case. Prasanna, request you to please test with this patch instead.

Ananth
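To illustrate the pattern being discussed, here is a simplified sketch (not the actual ppc64 code or the attached patch; the handler names and die values are the conventional ones for this era and may differ in detail):

        static int kprobe_exceptions_notify(struct notifier_block *self,
                                            unsigned long val, void *data)
        {
                struct die_args *args = (struct die_args *)data;
                int ret = NOTIFY_DONE;

                /*
                 * Preemption stays off so smp_processor_id() is legal even
                 * with the spinlock/sleep debugging options enabled.
                 */
                preempt_disable();
                switch (val) {
                case DIE_BPT:
                        if (kprobe_handler(args->regs))
                                ret = NOTIFY_STOP;
                        break;
                case DIE_SSTEP:
                        if (post_kprobe_handler(args->regs))
                                ret = NOTIFY_STOP;
                        break;
                default:
                        break;
                }
                /*
                 * preempt_enable_no_resched() drops the preempt count without
                 * calling into the scheduler: rescheduling from this point can
                 * re-enter the probed schedule() while this CPU's kprobe state
                 * is still live, which is the crash Prasanna reported above.
                 */
                preempt_enable_no_resched();
                return ret;
        }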
I patched 2.6.13-rc6 with fix_1229.patch and kp-sched-irq.patch (not kp-ppc-preempt-noresched.patch) last night. These changes ran on the elm3b90 SMP ppc64 machine all night without a problem.
(In reply to comment #32)
> Created an attachment (id=631)
> use preempt_enable_no_resched() instead of preempt_enable() on ppc64

This simple change should be fine for now, though we end up disabling and enabling preemption one extra time for kprobes, which is not that big a deal. By the way, IA64 is happy too: I have not seen a system freeze even with heavy make -j's and ./please_crash_me running for the past 16 hours, and of course I have both mon_sched.ko and mon_irq.ko loaded :-)

-Anil
As Ananth requested: I applied the 3 patches to an SMP ppc64 machine running 2.6.13-rc6. I loaded both mon_irq.ko and mon_sched.ko, and then executed a script that does make -j16 / make clean in a tight loop on a kernel tree. That was 20 hours (90 builds) ago, and the machine is still alive. I think this one is looking good.
Created attachment 643 [details] Mother of all patches to fix the irq-sched issue on RHEL4 U2

Patch against the RHEL4 U2 beta to fix #1229 on i386, x86_64, ppc64, and IA64. This patch includes all the previous patches and applies cleanly to the RHEL4 U2 kernel. A patch against the latest kernel (2.6.13-mm1) has been posted to lkml and has been queued into -mm. Here is the posting to lkml:
http://lkml.org/lkml/2005/9/1/295
Created attachment 644 [details] Final - Mother of all patches to fix the irq-sched issue on RHEL4 U2

This version now includes comments around lock_kprobes() and unlock_kprobes(); again, the patch is against the RHEL4 U2 beta to fix #1229 on i386, x86_64, ppc64, and IA64. It includes all the previous patches and applies cleanly to the RHEL4 U2 kernel. A patch against the latest kernel (2.6.13-mm1) has been posted to lkml:
http://lkml.org/lkml/2005/9/1/295
http://lkml.org/lkml/2005/9/1/380
Please give this patch your final spin before we ask Red Hat to include it in their upcoming RHEL4 U2 release.
The patch Anil attached above is in -mm. Closing this bug.