Panicking the system from SystemTap
Problem
Sometimes it's useful to cause the system to panic when a particular event happens. This can be used to obtain a vmcore file via netdump, diskdump or kdump in order to carry out post-mortem debugging using a tool like crash.
Scripts
# Include the header that declares panic()
%{
#include <linux/kernel.h>
%}

# Wrap panic() in stap
function panic(msg:string) %{
    panic("%s", THIS->msg);
%}

# Tell the user what we're doing
probe begin {
    printf("panic on OOM enabled\n")
}

probe end {
    printf("panic on OOM disabled\n")
}

# Just probe __oom_kill_task - it's after sysctl etc. checks in oom_kill
probe kernel.function("__oom_kill_task") {
    panic("__oom_kill_task called\n")
}
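SystemTap releases from about 1.8 onward read embedded-C function arguments through STAP_ARG_* macros rather than the THIS-> structure used in the script above. The following is a minimal sketch of the same wrapper in that newer style. The name panic_now is a hypothetical choice made here to avoid clashing with any tapset function that may already be called panic; everything else is assumed to behave as in the original script.

```systemtap
# Header that declares panic()
%{
#include <linux/kernel.h>
%}

# panic_now is a hypothetical name, not part of the original script.
# STAP_ARG_msg is the newer spelling of THIS->msg.
function panic_now(msg:string) %{
    panic("%s", STAP_ARG_msg);
%}
```

A probe body would then call panic_now("...") where the original calls panic("..."). Guru mode (-g) is still required because the function contains embedded C.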
Output
This script must be run in guru mode (-g), since it uses embedded C to access the kernel's panic() routine.
# stap -g panic-on-oom.stp
panic on OOM enabled

When an OOM kill occurs:

oom-killer: gfp_mask=0xd0
Mem-info:
[SNIP]
0 bounce buffer pages
Free swap: 0kB
523914 pages of RAM
294538 pages of HIGHMEM
5594 reserved pages
264 pages shared
0 pages swap cached
Kernel panic - not syncing: __oom_kill_task called
------------[ cut here ]------------
kernel BUG at kernel/panic.c:75!
invalid operand: 0000 [#1]
SMP
Modules linked in: netconsole netdump stap_a48a9d50ed21c03a01970dd07bd4b2f2_392(U) md5 ipv6 parport_pc lp parport autofs4 sunrpc loop dm_multipath usb_storage button battery ac uhci_hcd ehci_hcd hw_random snd_azx snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc tg3 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata sd_mod scsi_mod
CPU: 1
EIP: 0060:[<c0122106>] Not tainted VLI
EFLAGS: 00010086 (2.6.9-55.ELsmp)
EIP is at panic+0x47/0x147
eax: 00000043 ebx: f401a200 ecx: f60b8cf0 edx: c02e774b
esi: f60b8dc4 edi: f60b8dc4 ebp: c2022120 esp: f60b8cf8
ds: 007b es: 007b ss: 0068
Process sshd (pid: 4265, threadinfo=f60b8000 task=f66ca330)
Stack: f401a200 f8aa2596 f8aa332b f401a2b4 f8aa25df f8aa7120 f8aa26da 00000000
       00000000 00000000 cc9867cb 00000155 00000096 f401a200 c2022100 f8aa7120
       f60b8dc4 c2022120 c011947b f89d3da0 f60b8000 c0143427 00000000 c032ae3c
[SNIP]
Lessons
Sometimes it's useful to be able to panic a box when a particular event happens or some condition becomes true. Post-mortem debugging from a memory image can be a powerful way to understand a problem, but triggering a crash at just the right moment can be difficult or may require custom kernel patches. SystemTap allows this functionality to be added on the fly. Although this example hooks into the OOM killer routines, the same basic idea can be adapted to many different problems.
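As one illustration of panicking on a condition rather than on a single event, the sketch below panics only once a probed function has been hit a set number of times. The wrapper name panic_now, the global names, the threshold of 3 and the choice of oom_kill_process as the probe point are all assumptions made for illustration. The probe point must exist in the running kernel, and the STAP_ARG_msg syntax assumes a newer SystemTap (older releases use THIS->msg instead).

```systemtap
# Header that declares panic()
%{
#include <linux/kernel.h>
%}

# Hypothetical wrapper name; STAP_ARG_msg assumes a newer SystemTap
function panic_now(msg:string) %{
    panic("%s", STAP_ARG_msg);
%}

global hits            # number of times the probe has fired so far
global threshold = 3   # assumed value: panic on the third hit

# Assumed probe point; substitute any function present in your kernel
probe kernel.function("oom_kill_process") {
    hits++
    if (hits >= threshold)
        panic_now("OOM kill threshold reached\n")
}
```

Like the original script, this would need to be run with stap -g. Changing the probe point and the test inside the probe body is all that is needed to target a different event or condition.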