The kernel can panic when profiling stack traces on Fedora 16. window1 (sample workload): # while :; do tar cf /dev/null /usr; done window2 (profiling): # stap -D MAXTRACE=100 -D MAXSTRINGLEN=4096 -ve 'global s; probe timer.profile { s[backtrace()] <<< 1; } probe end { foreach (i in s+) { print_stack(i); printf("\t%d\n", @count(s[i])); } }' This was tried about a dozen times, and paniced on four of them. The system had kdump configured and collected a crash dump. Some details: # crash /var/crash/2012-02-19-23\:08/vmcore /usr/lib/debug/lib/modules/3.2.6-3.fc16.x86_64/vmlinux [...] KERNEL: /usr/lib/debug/lib/modules/3.2.6-3.fc16.x86_64/vmlinux DUMPFILE: /var/crash/2012-02-19-23:08/vmcore CPUS: 1 DATE: Sun Feb 19 15:04:50 2012 UPTIME: 00:05:41 LOAD AVERAGE: 1.39, 0.78, 0.32 TASKS: 72 NODENAME: 9d219ce8-cf52-409f-a14a-b210850f3231 RELEASE: 3.2.6-3.fc16.x86_64 VERSION: #1 SMP Mon Feb 13 20:35:42 UTC 2012 MACHINE: x86_64 (2393 Mhz) MEMORY: 1 GB PANIC: "[ 341.694043] Oops: 0000 [#1] SMP " (check log for details) PID: 3922 COMMAND: "tar" TASK: ffff880037878000 [THREAD_INFO: ffff88001783e000] CPU: 0 STATE: TASK_RUNNING (PANIC) [...] crash> bt -a PID: 3922 TASK: ffff880037878000 CPU: 0 COMMAND: "tar" #0 [ffff88003fc037b0] machine_kexec at ffffffff81038bda #1 [ffff88003fc03820] crash_kexec at ffffffff810b8482 #2 [ffff88003fc038f0] oops_end at ffffffff815e2b08 #3 [ffff88003fc03920] no_context at ffffffff815d6a05 #4 [ffff88003fc03980] __bad_area_nosemaphore at ffffffff815d6bdd #5 [ffff88003fc039e0] bad_area_nosemaphore at ffffffff815d6c0f #6 [ffff88003fc039f0] do_page_fault at ffffffff815e55e6 #7 [ffff88003fc03b00] page_fault at ffffffff815e20b5 [exception RIP: get_uleb128+83] RIP: ffffffffa00773f3 RSP: ffff88003fc03bb8 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000007 RSI: ffffffffffffffff RDI: ffff88003fc03bf0 RBP: ffff88003fc03bb8 R8: 0000000000000030 R9: 0000000000000007 R10: 0000000000000001 R11: 0000000000000040 R12: ffff88003aae1088 R13: ffff88001783ffd8 R14: 0000000000000000 R15: ffff88003aae1088 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 #8 [ffff88003fc03bc0] compute_expr at ffffffffa007c63f [stap_ddfda8cbbf4994e727c76bf533732ecf_3874] #9 [ffff88003fc03c70] unwind_frame at ffffffffa007df6c [stap_ddfda8cbbf4994e727c76bf533732ecf_3874] #10 [ffff88003fc03dc0] probe_1993 at ffffffffa007fcc6 [stap_ddfda8cbbf4994e727c76bf533732ecf_3874] #11 [ffff88003fc03e40] enter_profile_probes at ffffffffa007b2c5 [stap_ddfda8cbbf4994e727c76bf533732ecf_3874] #12 [ffff88003fc03e70] profile_tick at ffffffff81098ea9 #13 [ffff88003fc03e90] tick_sched_timer at ffffffff810a0e4e #14 [ffff88003fc03ec0] __run_hrtimer at ffffffff81093e90 #15 [ffff88003fc03f20] hrtimer_interrupt at ffffffff8109480b #16 [ffff88003fc03f90] smp_apic_timer_interrupt at ffffffff815ec589 #17 [ffff88003fc03fb0] apic_timer_interrupt at ffffffff815ea45e --- <IRQ stack> --- #18 [ffff88001783fed0] apic_timer_interrupt at ffffffff815ea45e RIP: ffffffffffffff10 RSP: 0000000000000203 RFLAGS: 00000010 RAX: ffff88003fc13780 RBX: 00000000008c2ed0 RCX: 0000000000000000 RDX: ffff88001783e000 RSI: 0000000000000008 RDI: ffff88001783ff78 RBP: ffffffff8119691f R8: 0000000000000001 R9: 0000000000000001 R10: 0000000000000001 R11: ffffffff811839f2 R12: ffff88001783fef8 R13: 00007fff549ffff0 R14: ffff88001783ff40 R15: ffffffff81196824 ORIG_RAX: 000000000000fe2e CS: ffffffff815e1e8d SS: ffff88001783ff88 bt: WARNING: possibly bogus exception frame crash> dis get_uleb128+83 0xffffffffa00773f3 <get_uleb128+83>: movzbl (%rdx),%r10d Version and system details: # stap -V Systemtap translator/driver (version 1.6/0.152 non-git sources) Copyright (C) 2005-2011 Red Hat, Inc. and others This is free software; see the source for copying conditions. enabled features: AVAHI LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS # uname -a Linux 9d219ce8-cf52-409f-a14a-b210850f3231 3.2.6-3.fc16.x86_64 #1 SMP Mon Feb 13 20:35:42 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux # cat /etc/redhat-release Fedora release 16 (Verne) This is Fedora as a guest running under KVM. SystemTap was installed via yum.
After some poking (also needed to increase MAXMAPENTRIES) I could finally replicate it on 3.3.0-0.rc1.git6.1.fc17.x86_64 with systemtap-devel-1.7-1.fc17.x86_64 [ 1305.976762] stap_e93c7971f27dbaecad41f45add08f2ea_2319: systemtap: 1.7/0.152, base: ffffffffa0662000, memory: 4291data/40text/86ctx/2058net/244893alloc kb, probes: 2 [ 1309.808090] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007 [ 1309.809029] IP: [<ffffffffa06620c4>] get_uleb128+0x54/0x80 [stap_e93c7971f27dbaecad41f45add08f2ea_2319]
Found the root cause of this issue. It happens when a DW_CFA operation that defines the CFA as dwarf expression is followed by a DW_CFA operation that (re)defined the CFA as register+offset. In that case we forgot the reset the REG_STATE.cfa_is_expr flag which made compute_expr() interpret the reg/offset as expr pointer (because they share their values in a union). While adding more sanity checks to make sure we catch such issues I found what looks like bad CFI in the x86_64 kernel in common_interrupt (arch/x86/kernel/entry_64.S) which defines CFI "by hand" and has a CFI_DEF_CFA_REGISTER following a def_cfa_expression, which is invalid.
Created attachment 6233 [details] Make sure REG_STATE.cfa_is_expr is always set correctly add more sanity checks Testing the following patch on the systemtap runtime side.
Created attachment 6234 [details] x86 entry_64.S cfi fixup This is the kernel patch that fixes the cfi for common_interrupt on x86_64.
commit 64b0cff3bee6b00cb4193ed887439c66055f85b4 Author: Mark Wielaard <mjw@redhat.com> Date: Tue Feb 21 15:08:58 2012 +0100 PR13714 - Make sure REG_STATE.cfa_is_expr is always set correctly. runtime/unwind.c (processCFI): Always set REG_STATE.cfa_is_expr and add new sanity checks to make sure the cfa definition rules are sane. Since the cfa expr pointer and cfa register/offset rule shared a union not setting REG_STATE.cfa_is_expr could result in compute_expr () wrongly being called and using the register/offset as expr pointer. Kernel patch send upstream: https://lkml.org/lkml/2012/2/21/154 Note that with the systemtap runtime patch any wrong DW_CFA_def sequence is detected. So you don't need an updated kernel. You do need the systemtap runtime patch however also with a kernel that has the kernel-cfi-fix patch applied. There could be other (correct) CFI sequences that did trigger the bug.
Great - thanks for the quick fix!
*** Bug 260998 has been marked as a duplicate of this bug. *** Seen from the domain http://volichat.com Page where seen: http://volichat.com/adult-chat-rooms Marked for reference. Resolved as fixed @bugzilla.