Bug 20735 - "soft lockup" bug on RHEL7 ppc64
Summary: "soft lockup" bug on RHEL7 ppc64
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-24 19:50 UTC by David Smith
Modified: 2023-05-08 12:43 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
Project(s) to access:
ssh public key:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David Smith 2016-10-24 19:50:57 UTC
I'm seeing the following "soft lockup" bug fairly consistently when running the testsuite in parallel mode on RHEL7 ppc64 (3.10.0-327.el7.ppc64.debug):

====
Oct 22 14:43:55 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: BUG: soft lockup - CPU#11 stuck for 31s! [migration/11:94]
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Modules linked in: stap_0ea83aa31510554f18d088ec7736640f__5685(OE+) stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+) stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE) stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_a35e54245d1be418047481fe89e36ea1__1925]
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: irq event stamp: 121576
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last  enabled at (121575): [<c000000000a1a148>] ._raw_spin_unlock_irqrestore+0x48/0xc0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last disabled at (121576): [<c00000000000290c>] decrementer_common+0x10c/0x180
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last  enabled at (120802): [<c0000000000e27d0>] .__do_softirq+0x220/0x5b0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last disabled at (120797): [<c00000000002703c>] .call_do_softirq+0x14/0x24
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CPU: 11 PID: 94 Comm: migration/11 Tainted: G           OE  ------------   3.10.0-327.el7.ppc64.debug #1
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: task: c0000005f4ec7880 ti: c0000005f4fd8000 task.ti: c0000005f4fd8000
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP: c000000000010c50 LR: c000000000010c50 CTR: c000000000020940
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: REGS: c0000005f4fdb810 TRAP: 0901   Tainted: G           OE  ------------    (3.10.0-327.el7.ppc64.debug)
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: MSR: 8000000100009032 <SF,EE,ME,IR,DR,RI>  CR: 24000024  XER: 00000000
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CFAR: c000000000020970 SOFTE: 1
                                                             GPR00: c000000000a1a154 c0000005f4fdba90 c000000001a1c730 0000000000000900
                                                             GPR04: 0000000000000001 0000000000000001 0350280000000000 00000000040b4432
                                                             GPR08: 0000000000000000 000000000000119a 000fe0d0d288f434 d00000000d9fa1b0
                                                             GPR12: 0000000022000824 00003fff964c8da0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP [c000000000010c50] .arch_local_irq_restore+0x100/0x150
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: LR [c000000000010c50] .arch_local_irq_restore+0x100/0x150
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb00] [c000000000a1a154] ._raw_spin_unlock_irqrestore+0x54/0xc0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb80] [c0000000001d77ac] .cpu_stop_should_run+0x5c/0x80
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbc00] [c0000000001313d4] .smpboot_thread_fn+0x244/0x340
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbcb0] [c00000000011f468] .kthread+0xe8/0xf0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbe30] [c00000000000a47c] .ret_from_kernel_thread+0x58/0xdc
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Instruction dump:
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: e8690010 786307e2 3063ffff 7c631910 7863b7a6 78635624 38630e80 4bffff6c
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: 7c0802a6 f8010010 f821ff91 4bff1769 <60000000> 38210070 e8010010 7c0803a6
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task kworker/11:2:381 blocked for more than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: kworker/11:2    D 0000000000000000  8976   381      2 0x00000800
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Workqueue: events .module_refresher [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb520] [d00000000dfb0000] 0xd00000000dfb0000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb6f0] [c00000000001835c] .__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb7a0] [c000000000a164a4] .__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb8c0] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb940] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeba50] [d00000000d924970] .systemtap_module_refresh+0x160/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebb00] [c00000000011035c] .process_one_work+0x25c/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebbe0] [c000000000110ad0] .worker_thread+0x170/0x490
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebcb0] [c00000000011f468] .kthread+0xe8/0xf0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebe30] [c00000000000a47c] .ret_from_kernel_thread+0x58/0xdc
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 4 locks held by kworker/11:2/381:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0:  (events){++++.+}, at: [<c0000000001102c8>] .process_one_work+0x1c8/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #1:  ((&module_refresher_work)){+.+...}, at: [<c0000000001102c8>] .process_one_work+0x1c8/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #2:  (module_refresh_mutex){+.+...}, at: [<d00000000d92485c>] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #3:  (&ut->inode_lock#2){+.+.+.}, at: [<d00000000d924970>] .systemtap_module_refresh+0x160/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task echo:5102 blocked for more than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: echo            D 00003fff93f43cc0  8240  5102  19520 0x0000a082
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3440] [d00000000dfb0000] 0xd00000000dfb0000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3610] [c00000000001835c] .__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e36c0] [c000000000a164a4] .__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e37e0] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3860] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3970] [d00000000d927fa8] .stapiu_process_found+0x208/0x660 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3a30] [d00000000d9268a4] .__stp_tf_quiesce_worker+0x154/0x230 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3ad0] [c000000000119fc4] .task_work_run+0x104/0x160
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3b70] [c0000000000fc74c] .get_signal_to_deliver+0xc9c/0xdd0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3c70] [c00000000001a234] .do_signal+0x54/0x320
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3db0] [c00000000001a65c] .do_notify_resume+0x8c/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 1 lock held by echo/5102:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0:  (&ut->inode_lock#2){+.+.+.}, at: [<d00000000d927fa8>] .stapiu_process_found+0x208/0x660 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task staprun:5108 blocked for more than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: staprun         D 00003fff8d298318  7824  5108  29619 0x00008080
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb3c0] [d00000000de70000] 0xd00000000de70000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb590] [c00000000001835c] .__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb640] [c000000000a164a4] .__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb760] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb7e0] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb8f0] [d00000000d92485c] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb9a0] [d00000000d92521c] ._stp_module_notifier+0x6c/0x170 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fba30] [c000000000a20c18] .notifier_call_chain+0xb8/0x200
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbae0] [c00000000012bffc] .blocking_notifier_call_chain+0x7c/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbb70] [c0000000001b1f50] .load_module+0x1840/0x20d0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbd40] [c0000000001b28d0] .SyS_init_module+0xf0/0x170
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbe30] [c00000000000a188] system_call+0x38/0xb4
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 2 locks held by staprun/5108:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0:  ((module_notify_list).rwsem){++++.+}, at: [<c00000000012bfe0>] .blocking_notifier_call_chain+0x60/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #1:  (module_refresh_mutex){+.+...}, at: [<d00000000d92485c>] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: INFO: rcu_sched detected stalls on CPUs/tasks: { 11} (detected by 1, t=6503 jiffies, g=7870, c=7869, q=0)
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Task dump for CPU 11:
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: swapper/11      R  running task    10560     0      1 0x00000804
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: [c0000005f4a539a0] [0000000000000058] 0x58 (unreliable)
Oct 22 14:46:55 ibm-p8-01-lp7 kernel: BUG: soft lockup - CPU#11 stuck for 65s! [swapper/11:0]
Oct 22 14:46:55 ibm-p8-01-lp7 kernel: Modules linked in: stap_011c8d5b3f3c577d0cebde949ae1e2fe_8_6219(OE+) stap_0ea83aa31510554f18d088ec7736640f__5685(OE+) stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+) stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE) stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_a35e54245d1be418047481fe89e36ea1__1925]
=====
Comment 1 David Smith 2016-10-25 21:17:14 UTC
I've been looking at this lockup, and my suspicion is that the problem lies with the systemtap runtime grabbing the module_mutex. The module_mutex is exported by the kernel, and the systemtap runtime grabs that mutex when calling kallsyms_on_each_symbol(). In the lockup below, there are at least 5 systemtap modules loaded simultaneously. When the last systemtap module got loaded, that triggered the other 4 to get notified that a new module was loaded. That causes the other 4 to try to grab the module_mutex to update module symbols (via stapkp_refresh() in runtime/linux/kprobes.c). In addition, the last one loaded could be calling stapkp_init() (runtime/linux/kprobes.c) which also tries to grab the module mutex.

So, my suspicion is that all 5 of those modules are trying to grab that same mutex at the same time. Plus, kallsyms_on_each_symbol() isn't quick, since it calls a function on every kernel symbol.

Here's the comment before the module_mutex definition in kernel/module.c:

/*
 * Mutex protects:
 * 1) List of modules (also safely readable with preempt_disable),
 * 2) module_use links,
 * 3) module_addr_min/module_addr_max.
 * (delete uses stop_machine/add uses RCU list operations). */

Based on that comment, I'm going to try to see what happens if we disable preemption before calling kallsyms_on_each_symbol() and enabling preemption afterwards instead of using module_mutex.
Comment 2 David Smith 2016-11-03 14:55:43 UTC
Unfortunately, this patch causes the at_var.exp test to get hung, so I'll need to do more work here.
Comment 3 David Smith 2016-11-10 16:02:18 UTC
(In reply to David Smith from comment #2)
> Unfortunately, this patch causes the at_var.exp test to get hung, so I'll
> need to do more work here.

That comment was supposed to go on PR20742.

This "soft lockup" bug is fixed in commit 58c9bd1 by using the solution mentioned in comment #1 of avoiding the module mutex and disabling/enable preemption around kallsyms_on_each_symbol() calls instead.
Comment 4 agentzh 2023-05-06 21:40:25 UTC
David, seems like your patch causes CPU lockups with the 5.11 kernel and lock assertion failures on 6.1/6.2 kernels. See https://sourceware.org/bugzilla/show_bug.cgi?id=30405

I wonder why you wanted to turn off preemption for the stapkp_init() call in your patch (commit 58c9bd156)?
Comment 5 David Smith 2023-05-08 12:43:40 UTC
(In reply to agentzh from comment #4)
> David, seems like your patch causes CPU lockups with the 5.11 kernel and
> lock assertion failures on 6.1/6.2 kernels. See
> https://sourceware.org/bugzilla/show_bug.cgi?id=30405
> 
> I wonder why you wanted to turn off preemption for the stapkp_init() call in
> your patch (commit 58c9bd156)?

Since this patch was committed almost 7 years ago, I really don't have any memories of it other than what is written here in this bug (see comment #1).

Sorry I can't be of more help.