Differences between revisions 3 and 4
Revision 3 as of 2008-03-19 17:46:21
Size: 3516
Editor: BrynReeves
Revision 4 as of 2008-03-19 17:47:15
Size: 3492
Editor: BrynReeves
Deletions are prefixed with "-". Additions are prefixed with "+".
Line 35:
-         panic("__oom_kill_task called - panicking\n")
+         panic("__oom_kill_task called\n")
Line 57:
- Kernel panic - not syncing: __oom_kill_task called - panicking
+ Kernel panic - not syncing: __oom_kill_task called

Panicking the system from systemtap


Sometimes it's useful to cause the system to panic when a particular event happens. This can be used to obtain a vmcore file via netdump, diskdump or kdump in order to carry out post-mortem debugging using a tool like crash.


# Include the header that declares panic()
%{
#include <linux/kernel.h>
%}

# Wrap panic() in stap
function panic(msg:string) %{
        panic("%s", THIS->msg);
%}

# Tell the user what we're doing
probe begin {
        printf("panic on OOM enabled\n")
}

probe end {
        printf("panic on OOM disabled\n")
}

# Just probe __oom_kill_task - it's after sysctl etc. checks in oom_kill
probe kernel.function("__oom_kill_task") {
        panic("__oom_kill_task called\n")
}


This script must be run in guru mode (-g), since it uses embedded C to access the kernel's panic() routine.

# stap -g panic-on-oom.stp
panic on OOM enabled

When an OOM kill occurs:
oom-killer: gfp_mask=0xd0
0 bounce buffer pages
Free swap:            0kB
523914 pages of RAM
294538 pages of HIGHMEM
5594 reserved pages
264 pages shared
0 pages swap cached
Kernel panic - not syncing: __oom_kill_task called

------------[ cut here ]------------
kernel BUG at kernel/panic.c:75!
invalid operand: 0000 [#1]
Modules linked in: netconsole netdump stap_a48a9d50ed21c03a01970dd07bd4b2f2_392(U) md5 ipv6 parport_pc lp parport autofs4 sunrpc loop dm_multipath usb_storage button battery ac uhci_hcd ehci_hcd hw_random snd_azx snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc tg3 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c0122106>]    Not tainted VLI
EFLAGS: 00010086   (2.6.9-55.ELsmp) 
EIP is at panic+0x47/0x147
eax: 00000043   ebx: f401a200   ecx: f60b8cf0   edx: c02e774b
esi: f60b8dc4   edi: f60b8dc4   ebp: c2022120   esp: f60b8cf8
ds: 007b   es: 007b   ss: 0068
Process sshd (pid: 4265, threadinfo=f60b8000 task=f66ca330)
Stack: f401a200 f8aa2596 f8aa332b f401a2b4 f8aa25df f8aa7120 f8aa26da 00000000 
       00000000 00000000 cc9867cb 00000155 00000096 f401a200 c2022100 f8aa7120 
       f60b8dc4 c2022120 c011947b f89d3da0 f60b8000 c0143427 00000000 c032ae3c 


Sometimes it's useful to be able to panic a box when a particular event happens, or when some condition becomes true. Post-mortem debugging from a memory image can be a powerful way to understand a problem, but triggering a crash at just the right moment can be difficult, or may require custom kernel patches. SystemTap allows this functionality to be added on the fly. Although this example hooks into the OOM killer routines, the same basic idea can be adapted to many different problems.
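
As an illustration of adapting the idea, here is a minimal sketch that panics only when a condition is true rather than on every probe hit. It is not part of the original example: the process name "myapp" is hypothetical, and the choice of the syscall.unlink probe point is an assumption made purely for demonstration. It reuses the same panic() wrapper and, like the OOM script, needs guru mode (-g).

```systemtap
# Sketch only: "myapp" is a hypothetical process name used for illustration.
%{
#include <linux/kernel.h>
%}

# Same wrapper around the kernel's panic() as in the OOM example
function panic(msg:string) %{
        panic("%s", THIS->msg);
%}

# Panic only when the condition holds: here, when "myapp" calls unlink()
probe syscall.unlink {
        if (execname() == "myapp")
                panic("myapp called unlink\n")
}
```

The THIS->msg form matches the SystemTap releases contemporary with this page. Newer releases are understood to spell embedded-C argument access as STAP_ARG_msg, so the wrapper may need adjusting there.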


None: WSPanicOnOom (last edited 2008-03-19 17:47:15 by BrynReeves)