This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/12960] New: _stp_ctl_send tries to msleep when out of memory


http://sourceware.org/bugzilla/show_bug.cgi?id=12960

           Summary: _stp_ctl_send tries to msleep when out of memory
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
        AssignedTo: systemtap@sourceware.org
        ReportedBy: mjw@redhat.com


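_stp_ctl_send tries to msleep when it is out of memory (err -12, ENOMEM), which
seems to cause problems and eventually kernel crashes. In the trace below the
msleep is reached from _stp_ctl_work_callback running in timer softirq context,
hence the "scheduling while atomic" BUG. It isn't very reproducible, but the
following triggers it for me pretty often: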

/usr/local/install/systemtap/bin/stap \
  -d /usr/lib64/python2.7/lib-dynload/_ssl.so \
  -d /usr/lib64/libssl.so.1.0.0d \
  -d /lib64/libcrypto.so.1.0.0d \
  -DMAXTRACE=128 -d /usr/bin/gdb --ldd \
  -e 'probe syscall.open { if (pid() == target()) { log(filename); print_ubacktrace(); log("--"); } }' \
  -c 'gdb --version'

[ 1534.651888] stap_afa62ad505a7aaf8c957387db22ba031_16869: systemtap:
1.6/0.152, base: ffffffffa08a0000, memory: 6199data/35text/26ctx/13net/34alloc
kb, probes: 2
[ 1534.711077] ctl_send msleep because of err: -12
[ 1534.712955] BUG: scheduling while atomic: kworker/0:0/0/0x00000100
[ 1534.715360] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.731101] CPU 1 
[ 1534.731552] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.739225] 
[ 1534.739453] Pid: 0, comm: kworker/0:0 Not tainted 2.6.38.8-32.fc15.x86_64 #1
Bochs Bochs
[ 1534.740674] RIP: 0010:[<ffffffff8102a145>]  [<ffffffff8102a145>]
native_safe_halt+0xb/0xd
[ 1534.741835] RSP: 0018:ffff88007a81dee8  EFLAGS: 00000246
[ 1534.742524] RAX: 0000000000000000 RBX: ffffffff810b1374 RCX:
0000016553d82980
[ 1534.743451] RDX: 000000f800000000 RSI: 0000000000000001 RDI:
0000000000000001
[ 1534.744408] RBP: ffff88007a81dee8 R08: 0000000000000000 R09:
ffffffff81b3a320
[ 1534.745393] R10: 00000000008849e7 R11: ffff880078922a00 R12:
ffffffff8100a58e
[ 1534.746342] R13: ffff88007a81de68 R14: ffffffff81010150 R15:
ffff88007a81de48
[ 1534.747281] FS:  0000000000000000(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[ 1534.748333] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1534.749124] CR2: 0000003ceecab970 CR3: 0000000033c55000 CR4:
00000000000006e0
[ 1534.750072] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 1534.750995] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 1534.751956] Process kworker/0:0 (pid: 0, threadinfo ffff88007a81c000, task
ffff88007a820000)
[ 1534.753099] Stack:
[ 1534.753400]  ffff88007a81def8 ffffffff81010d36 ffff88007a81df28
ffffffff81008321
[ 1534.754561]  ffff88007a81df18 ab7ab4059d172ee8 0000000000000000
0000000000000000
[ 1534.755772]  ffff88007a81df48 ffffffff81464dba 0000000000000000
e5c548fc55f4519c
[ 1534.756920] Call Trace:
[ 1534.757304]  [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.758022]  [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.758700]  [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.759446] Code: 1f 44 00 00 57 9d 5d c3 55 48 89 e5 0f 1f 44 00 00 fa 5d
c3 55 48 89 e5 0f 1f 44 00 00 fb 5d c3 55 48 89 e5 0f 1f 44 00 00 fb f4 <5d> c3
55 48 89 e5 0f 1f 44 00 00 f4 5d c3 55 48 89 e5 0f 1f 44 
[ 1534.765194] Call Trace:
[ 1534.765541]  [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.766260]  [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.766921]  [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.767683] bad: scheduling from the idle thread!
[ 1534.768298] Pid: 0, comm: kworker/0:0 Not tainted 2.6.38.8-32.fc15.x86_64 #1
[ 1534.769245] Call Trace:
[ 1534.769593]  <IRQ>  [<ffffffff810425b3>] dequeue_task_idle+0x29/0x35
[ 1534.770489]  [<ffffffff81047e92>] dequeue_task+0x85/0x94
[ 1534.771203]  [<ffffffff81047ecb>] deactivate_task+0x2a/0x32
[ 1534.771965]  [<ffffffff81473f16>] schedule+0x22b/0x66a
[ 1534.772630]  [<ffffffff81474752>] schedule_timeout+0xa7/0xde
[ 1534.773420]  [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.774198]  [<ffffffff814747a7>] schedule_timeout_uninterruptible+0x1e/0x20
[ 1534.775158]  [<ffffffff810615dd>] msleep+0x1b/0x22
[ 1534.775814]  [<ffffffffa08a06b0>] _stp_ctl_send+0x3f/0x9c
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.777013]  [<ffffffffa08a1099>] _stp_ctl_work_callback+0x81/0xa6
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.778353]  [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.779145]  [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.779924]  [<ffffffffa08a1018>] ? _stp_ctl_work_callback+0x0/0xa6
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.781240]  [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.782165]  [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.783289]  [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.784387]  [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.785238]  [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.786143]  [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.787246]  [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.788754]  <EOI>  [<ffffffff810b1374>] ? rcu_needs_cpu+0x111/0x1c2
[ 1534.790608]  [<ffffffff8102a145>] ? native_safe_halt+0xb/0xd
[ 1534.792171]  [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.793133]  [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.793960]  [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.796124] BUG: unable to handle kernel NULL pointer dereference at        
  (null)
[ 1534.797096] IP: [<          (null)>]           (null)
[ 1534.797096] PGD 0 
[ 1534.797096] Oops: 0010 [#1] SMP 
[ 1534.797096] last sysfs file:
/sys/module/virtio_balloon/sections/__mcount_loc
[ 1534.797096] CPU 1 
[ 1534.797096] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.797096] 
[ 1534.797096] Pid: 686, comm: rs:main Q:Reg Not tainted
2.6.38.8-32.fc15.x86_64 #1 Bochs Bochs
[ 1534.797096] RIP: 0010:[<0000000000000000>]  [<          (null)>]          
(null)
[ 1534.797096] RSP: 0018:ffff8800789ff738  EFLAGS: 00010046
[ 1534.797096] RAX: ffffffff8160a4e0 RBX: ffff88007a820000 RCX:
ffff88007fc80000
[ 1534.797096] RDX: 0000000000000001 RSI: ffff88007a820000 RDI:
ffff88007fc93840
[ 1534.797096] RBP: ffff8800789ff760 R08: ffff88007fc8dbb0 R09:
000000000000024b
[ 1534.797096] R10: 0000000000000010 R11: ffff88007a820000 R12:
ffff88007fc93840
[ 1534.797096] R13: 0000000000000001 R14: 0000000000000001 R15:
0000000000000001
[ 1534.797096] FS:  00007ff1a617d700(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[ 1534.797096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1534.797096] CR2: 0000000000000000 CR3: 000000007a0a8000 CR4:
00000000000006e0
[ 1534.797096] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 1534.797096] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 1534.797096] Process rs:main Q:Reg (pid: 686, threadinfo ffff8800789fe000,
task ffff880037960000)
[ 1534.797096] Stack:
[ 1534.797096]  ffffffff81047f30 ffff8800789ff750 ffffffff00000001
ffff88007fc93840
[ 1534.797096]  ffff88007fc93840 ffff8800789ff780 ffffffff81047f69
ffff88007fc8dbb0
[ 1534.797096]  ffff88007a820000 ffff8800789ff7e0 ffffffff8104df47
ffffffff00000001
[ 1534.797096] Call Trace:
[ 1534.797096]  [<ffffffff81047f30>] ? enqueue_task+0x5d/0x6b
[ 1534.797096]  [<ffffffff81047f69>] activate_task+0x2b/0x33
[ 1534.797096]  [<ffffffff8104df47>] try_to_wake_up+0x1f7/0x226
[ 1534.797096]  [<ffffffff8102ac09>] ? pvclock_clocksource_read+0x48/0xb7
[ 1534.797096]  [<ffffffff8104df9f>] wake_up_process+0x15/0x17
[ 1534.797096]  [<ffffffff81060bc2>] process_timeout+0xe/0x10
[ 1534.797096]  [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.797096]  [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.797096]  [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.797096]  [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.797096]  [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.797096]  [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.797096]  [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.797096]  [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.797096]  [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.797096]  [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.797096]  [<ffffffff811f51ab>] ? avtab_search_node+0x69/0x7a
[ 1534.797096]  [<ffffffff811971b4>] ? ext4_mark_iloc_dirty+0x4db/0x543
[ 1534.797096]  [<ffffffff811fde0e>] ? cond_compute_av+0x26/0x8c
[ 1534.797096]  [<ffffffff811fa8af>] ? context_struct_compute_av+0x16f/0x257
[ 1534.797096]  [<ffffffff811fb4a9>] ? security_compute_av+0xf9/0x20d
[ 1534.797096]  [<ffffffff811e9d52>] ? avc_has_perm_noaudit+0x104/0x389
[ 1534.797096]  [<ffffffff811aaf93>] ? __ext4_journal_stop+0x76/0x7c
[ 1534.797096]  [<ffffffff811ea00a>] ? avc_has_perm+0x33/0x63
[ 1534.797096]  [<ffffffff811eb0eb>] ? inode_has_perm+0x76/0x8c
[ 1534.797096]  [<ffffffff8122c8bc>] ? radix_tree_lookup_slot+0xe/0x10
[ 1534.797096]  [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096]  [<ffffffff81474408>] ? _cond_resched+0xe/0x22
[ 1534.797096]  [<ffffffff810d9d64>] ? filemap_fault+0x20d/0x36c
[ 1534.797096]  [<ffffffff811ee75d>] ? selinux_inode_permission+0x82/0xa2
[ 1534.797096]  [<ffffffff811e7f4a>] ? security_inode_exec_permission+0x2a/0x2c
[ 1534.797096]  [<ffffffff81129ae2>] ? exec_permission+0x71/0x80
[ 1534.797096]  [<ffffffff8112b5e5>] ? link_path_walk+0x85/0x3b8
[ 1534.797096]  [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096]  [<ffffffff8112ac3d>] ? path_init_rcu+0x87/0x192
[ 1534.797096]  [<ffffffff812324e1>] ? might_fault+0x21/0x23
[ 1534.797096]  [<ffffffff8112bb4b>] ? do_path_lookup+0x4d/0xf6
[ 1534.797096]  [<ffffffff8112c810>] ? user_path_at+0x57/0x94
[ 1534.797096]  [<ffffffff811131d7>] ? __kmalloc_track_caller+0xf7/0x109
[ 1534.797096]  [<ffffffff810ec1ef>] ? kmemdup+0x20/0x35
[ 1534.797096]  [<ffffffff811ed04e>] ? selinux_cred_prepare+0x1c/0x32
[ 1534.797096]  [<ffffffff81074263>] ? override_creds+0x28/0x3d
[ 1534.797096]  [<ffffffff811205be>] ? sys_faccessat+0xa0/0x162
[ 1534.797096]  [<ffffffff81120698>] ? sys_access+0x18/0x1a
[ 1534.797096]  [<ffffffff81009bc2>] ? system_call_fastpath+0x16/0x1b
[ 1534.797096] Code:  Bad RIP value.
[ 1534.797096] RIP  [<          (null)>]           (null)
[ 1534.797096]  RSP <ffff8800789ff738>
[ 1534.797096] CR2: 0000000000000000
[ 1534.797096] ---[ end trace 1b2381b9c932a61a ]---
[ 1534.797096] Kernel panic - not syncing: Fatal exception in interrupt
[ 1534.797096] Pid: 686, comm: rs:main Q:Reg Tainted: G      D    
2.6.38.8-32.fc15.x86_64 #1
[ 1534.797096] Call Trace:
[ 1534.797096]  [<ffffffff8146c6e6>] panic+0x91/0x19c
[ 1534.797096]  [<ffffffff81476cc6>] oops_end+0xb4/0xc5
[ 1534.797096]  [<ffffffff8146c06e>] no_context+0x203/0x212
[ 1534.797096]  [<ffffffff8146c211>] __bad_area_nosemaphore+0x194/0x1b7
[ 1534.797096]  [<ffffffff810d1e17>] ? __perf_event_task_sched_out+0x27/0x2c
[ 1534.797096]  [<ffffffff8146c247>] bad_area_nosemaphore+0x13/0x15
[ 1534.797096]  [<ffffffff81478d9d>] do_page_fault+0x1c5/0x37a
[ 1534.797096]  [<ffffffff814761d5>] page_fault+0x25/0x30
[ 1534.797096]  [<ffffffff81047f30>] ? enqueue_task+0x5d/0x6b
[ 1534.797096]  [<ffffffff81047f69>] activate_task+0x2b/0x33
[ 1534.797096]  [<ffffffff8104df47>] try_to_wake_up+0x1f7/0x226
[ 1534.797096]  [<ffffffff8102ac09>] ? pvclock_clocksource_read+0x48/0xb7
[ 1534.797096]  [<ffffffff8104df9f>] wake_up_process+0x15/0x17
[ 1534.797096]  [<ffffffff81060bc2>] process_timeout+0xe/0x10
[ 1534.797096]  [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.797096]  [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.797096]  [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.797096]  [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.797096]  [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.797096]  [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.797096]  [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.797096]  [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.797096]  [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.797096]  [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.797096]  [<ffffffff811f51ab>] ? avtab_search_node+0x69/0x7a
[ 1534.797096]  [<ffffffff811971b4>] ? ext4_mark_iloc_dirty+0x4db/0x543
[ 1534.797096]  [<ffffffff811fde0e>] ? cond_compute_av+0x26/0x8c
[ 1534.797096]  [<ffffffff811fa8af>] ? context_struct_compute_av+0x16f/0x257
[ 1534.797096]  [<ffffffff811fb4a9>] ? security_compute_av+0xf9/0x20d
[ 1534.797096]  [<ffffffff811e9d52>] ? avc_has_perm_noaudit+0x104/0x389
[ 1534.797096]  [<ffffffff811aaf93>] ? __ext4_journal_stop+0x76/0x7c
[ 1534.797096]  [<ffffffff811ea00a>] ? avc_has_perm+0x33/0x63
[ 1534.797096]  [<ffffffff811eb0eb>] ? inode_has_perm+0x76/0x8c
[ 1534.797096]  [<ffffffff8122c8bc>] ? radix_tree_lookup_slot+0xe/0x10
[ 1534.797096]  [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096]  [<ffffffff81474408>] ? _cond_resched+0xe/0x22
[ 1534.797096]  [<ffffffff810d9d64>] ? filemap_fault+0x20d/0x36c
[ 1534.797096]  [<ffffffff811ee75d>] ? selinux_inode_permission+0x82/0xa2
[ 1534.797096]  [<ffffffff811e7f4a>] ? security_inode_exec_permission+0x2a/0x2c
[ 1534.797096]  [<ffffffff81129ae2>] ? exec_permission+0x71/0x80
[ 1534.797096]  [<ffffffff8112b5e5>] ? link_path_walk+0x85/0x3b8
[ 1534.797096]  [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096]  [<ffffffff8112ac3d>] ? path_init_rcu+0x87/0x192
[ 1534.797096]  [<ffffffff812324e1>] ? might_fault+0x21/0x23
[ 1534.797096]  [<ffffffff8112bb4b>] ? do_path_lookup+0x4d/0xf6
[ 1534.797096]  [<ffffffff8112c810>] ? user_path_at+0x57/0x94
[ 1534.797096]  [<ffffffff811131d7>] ? __kmalloc_track_caller+0xf7/0x109
[ 1534.797096]  [<ffffffff810ec1ef>] ? kmemdup+0x20/0x35
[ 1534.797096]  [<ffffffff811ed04e>] ? selinux_cred_prepare+0x1c/0x32
[ 1534.797096]  [<ffffffff81074263>] ? override_creds+0x28/0x3d
[ 1534.797096]  [<ffffffff811205be>] ? sys_faccessat+0xa0/0x162
[ 1534.797096]  [<ffffffff81120698>] ? sys_access+0x18/0x1a
[ 1534.797096]  [<ffffffff81009bc2>] ? system_call_fastpath+0x16/0x1b

This is on f15 (2.6.38.8-32.fc15.x86_64), but I have also seen it happen on f14
(2.6.35.13-92.fc14.x86_64).

I am using the following workaround at the moment, but it only seems to work
because we never run out of buffers now (at least in my environment):

diff --git a/runtime/transport/debugfs.c b/runtime/transport/debugfs.c
index 6bbef53..0897fe5 100644
--- a/runtime/transport/debugfs.c
+++ b/runtime/transport/debugfs.c
@@ -12,7 +12,7 @@
 #include <linux/debugfs.h>
 #include "transport.h"

-#define STP_DEFAULT_BUFFERS 50
+#define STP_DEFAULT_BUFFERS 1024

 inline static int _stp_ctl_write_fs(int type, void *data, unsigned len)
 {
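
To illustrate why a bigger pool only hides the problem, here is a minimal
userspace sketch, not the actual runtime code. The names buf_pool,
pool_reserve, ctl_send_nosleep and send_burst are hypothetical, and the model
assumes a fixed pool of pre-allocated buffers with no reader draining it during
the burst. It shows a send path that returns -ENOMEM straight away instead of
sleeping, which is the behaviour that would be safe from timer/softirq context,
and counts how many sends fail for a given pool size.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the runtime's pool of pre-allocated buffers. */
struct buf_pool {
    int free_bufs;
};

/* Reserve one buffer; fails with -ENOMEM (-12) when the pool is empty. */
static int pool_reserve(struct buf_pool *pool)
{
    assert(pool->free_bufs >= 0);
    if (pool->free_bufs == 0)
        return -ENOMEM;
    pool->free_bufs--;
    return 0;
}

/*
 * Non-sleeping send (hypothetical): on -ENOMEM it returns the error at once
 * instead of calling msleep(), so the caller can retry on a later tick.
 */
static int ctl_send_nosleep(struct buf_pool *pool)
{
    return pool_reserve(pool);
}

/*
 * Send a burst of `count` messages while nothing drains the pool.
 * Returns how many sends failed and would have to be retried later.
 */
int send_burst(int pool_size, int count)
{
    struct buf_pool pool = { .free_bufs = pool_size };
    int failed = 0;

    for (int i = 0; i < count; i++) {
        if (ctl_send_nosleep(&pool) == -ENOMEM)
            failed++;
    }
    return failed;
}
```

With an illustrative burst of 200 messages, send_burst(50, 200) reports 150
failed sends while send_burst(1024, 200) reports none. A large enough burst
would still exhaust the bigger pool, so the msleep path is avoided rather than
fixed.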


