My environment: systemtap-20070331 snapshot, kernel-2.6.21-rc3, elfutils-0.125, arch=ppc64.

Running current.stp with the systemtap-20070331 snapshot triggers spinlock lockup and softlockup bugs, and the system freezes.

======================================================
BUG: spinlock lockup on CPU#0, sshd/2380, c0000000007da608
Call Trace:
[C0000000076134C0] [C00000000000FAB0] .show_stack+0x68/0x1b0 (unreliable)
[C000000007613560] [C0000000001AF9E8] ._raw_spin_lock+0x140/0x17c
[C000000007613600] [C00000000036B40C] ._spin_lock+0x10/0x24
[C000000007613680] [C00000000006066C] .task_rq_lock+0x68/0xcc
[C000000007613720] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C000000007613810] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C0000000076138C0] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C000000007613970] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C000000007613AC0] [C000000000201358] .pty_write+0x74/0x90
[C000000007613B40] [C0000000001FE05C] .write_chan+0x378/0x448
[C000000007613C20] [C0000000001F9EA0] .tty_write+0x1a0/0x278
[C000000007613CF0] [C0000000000E6938] .vfs_write+0x120/0x1f0
[C000000007613D90] [C0000000000E7368] .sys_write+0x4c/0x8c
[C000000007613E30] [C0000000000086B4] syscall_exit+0x0/0x40
BUG: spinlock lockup on CPU#6, klogd/1887, c0000000007da608
Call Trace:
[C00000000FD173F0] [C00000000000FAB0] .show_stack+0x68/0x1b0 (unreliable)
[C00000000FD17490] [C0000000001AF9E8] ._raw_spin_lock+0x140/0x17c
[C00000000FD17530] [C00000000036B40C] ._spin_lock+0x10/0x24
[C00000000FD175B0] [C00000000006066C] .task_rq_lock+0x68/0xcc
[C00000000FD17650] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C00000000FD17740] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C00000000FD177F0] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C00000000FD178A0] [C00000000002B71C] .kretprobe_trampoline_holder+0x0/0x8
[C00000000FD17930] [C00000000035CF80] .unix_dgram_sendmsg+0x524/0x638
[C00000000FD17A30] [C0000000002C9CE4] .sock_aio_write+0x164/0x19c
[C00000000FD17B60] [C0000000000E5FF0] .do_sync_write+0xc4/0x124
[C00000000FD17CF0] [C0000000000E6954] .vfs_write+0x13c/0x1f0
[C00000000FD17D90] [C0000000000E7368] .sys_write+0x4c/0x8c
[C00000000FD17E30] [C0000000000086B4] syscall_exit+0x0/0x40
=============================================================

What I observed, though, is that last week systemtap gave no problems at all. Could the recent transport changes be the reason for this?
This is probably a side effect of the inline->function changes committed last week. That is, the script now probes inline functions that it previously did not. Chances are that both this test script and the probing blacklist need to be changed.
Another data point: I see this kind of crash (with tracebacks showing kretprobe_trampoline) on several platforms, but only with very recent kernels. Contrary to your apparent experience, it does not go away with a slightly older systemtap snapshot. It is as if something has changed for the worse in recent kernels.
No similar reports on ppc recently.