Bug 4052 - frysk2130/strace-clone-exec test causes kernel BUG report and process hang
Status: NEW
Alias: None
Product: frysk
Classification: Unclassified
Component: general
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-15 16:12 UTC by Kris Van Hees
Modified: 2019-11-15 15:28 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Kris Van Hees 2007-02-15 16:12:49 UTC
FC6 with 2.6.19-1.2911.fc6 kernel on x86_64:

BUGging on (tsk->exit_state != EXIT_ZOMBIE)
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/utrace.c:193
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 1
Modules linked in: i915 drm autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror
dm_multipath dm_mod video sbs i2c_ec button battery asus_acpi ac ipv6 lp joydev
snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sg snd_mixer_oss snd_pcm
snd_timer snd soundcore pcspkr snd_page_alloc tg3 ide_cd shpchp iTCO_wdt
i2c_i801 serio_raw i2c_core parport_pc cdrom parport ata_piix libata sd_mod
scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 25672, comm: threadexec Not tainted 2.6.19-1.2911.fc6 #1
RIP: 0010:[<ffffffff802bb506>]  [<ffffffff802bb506>] check_dead_utrace+0x6c/0x146
RSP: 0018:ffff8100b3bb7cf8  EFLAGS: 00010286
RAX: 000000000000002f RBX: ffff810099b45080 RCX: ffffffff80575598
RDX: ffffffff80575598 RSI: 0000000000000000 RDI: ffffffff80575580
RBP: 0000000000000000 R08: ffffffff80575598 R09: 0000000000000001
R10: 0000000000000000 R11: ffffffff80277353 R12: ffff8100ae8dd440
R13: 0000000000000000 R14: ffff810099b45080 R15: 0000000000000028
FS:  0000000000000000(0000) GS:ffff810037e25bc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000412460 CR3: 000000009f9c4000 CR4: 00000000000006e0
Process threadexec (pid: 25672, threadinfo ffff8100b3bb6000, task ffff810099b45080)
Stack:  ffff810099b45080 ffff8100ae8dd450 ffff8100ae8dd440 0000000000000000
 0000000000000000 ffffffff802bb653 ffff8100ce544840 0000000000000010
 ffff8100ae8dd440 ffff810099b45080 ffff810099b451a8 ffff8100b3bb7ef8
Call Trace:
 [<ffffffff802bb653>] remove_detached+0x73/0x86
 [<ffffffff80215854>] do_exit+0x817/0x8f7
 [<ffffffff802475d9>] cpuset_exit+0x0/0x6c
 [<ffffffff8022b508>] get_signal_to_deliver+0x3b6/0x3e5
 [<ffffffff802596b4>] do_notify_resume+0x9c/0x727
 [<ffffffff8025c346>] int_signal+0x12/0x17
 [<00000034d2007585>]


Code: 0f 0b 68 7e 68 49 80 c2 c1 00 83 bb f4 00 00 00 ff 75 39 b8
RIP  [<ffffffff802bb506>] check_dead_utrace+0x6c/0x146
 RSP <ffff8100b3bb7cf8>
 <1>Fixing recursive fault but reboot is needed!
BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026999a>] show_trace+0x34/0x47
 [<ffffffff802699bf>] dump_stack+0x12/0x17
 [<ffffffff802b6ced>] softlockup_tick+0xdb/0xf6
 [<ffffffff80293c2f>] update_process_times+0x42/0x68
 [<ffffffff802749d9>] smp_local_timer_interrupt+0x34/0x55
 [<ffffffff8027508d>] smp_apic_timer_interrupt+0x51/0x69
 [<ffffffff8025ccf6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff80207808>] _raw_spin_lock+0x78/0xe5
 [<ffffffff802bb443>] utrace_release_task+0x43/0x78
 [<ffffffff80217ad2>] release_task+0x1c/0x345
 [<ffffffff8022c197>] flush_old_exec+0x361/0xa22
 [<ffffffff8021838a>] load_elf_binary+0x454/0x17a1
 [<ffffffff8023f340>] search_binary_handler+0xc7/0x2b2
 [<ffffffff8023e86c>] do_execve+0x18c/0x242
 [<ffffffff80252be7>] sys_execve+0x36/0x4c
 [<ffffffff8025c4f7>] stub_execve+0x67/0xb0
 [<ffff810037e25bc0>]
DWARF2 unwinder stuck at 0xffff810037e25bc0

Leftover inexact backtrace:


BUG: spinlock lockup on CPU#0, threadexec/25672, ffff8100ae8dd460 (Not tainted)

Call Trace:
 [<ffffffff8026999a>] show_trace+0x34/0x47
 [<ffffffff802699bf>] dump_stack+0x12/0x17
 [<ffffffff80207854>] _raw_spin_lock+0xc4/0xe5
 [<ffffffff802bb443>] utrace_release_task+0x43/0x78
 [<ffffffff80217ad2>] release_task+0x1c/0x345
 [<ffffffff8022c197>] flush_old_exec+0x361/0xa22
 [<ffffffff8021838a>] load_elf_binary+0x454/0x17a1
 [<ffffffff8023f340>] search_binary_handler+0xc7/0x2b2
 [<ffffffff8023e86c>] do_execve+0x18c/0x242
 [<ffffffff80252be7>] sys_execve+0x36/0x4c
 [<ffffffff8025c4f7>] stub_execve+0x67/0xb0

The test run itself can be killed, but the threadexec process cannot.  According
to top, it continuously consumes 100% CPU, and only a reboot clears up the
situation.
Comment 1 Kris Van Hees 2007-02-15 16:37:24 UTC
New occurrence, this time without a CPU lockup requiring a power cycle on the box:

BUGging on (tsk->exit_state != EXIT_ZOMBIE)
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/utrace.c:193
invalid opcode: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 0 
Modules linked in: i915 drm autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror
dm_multipath dm_mod video sbs i2c_ec button battery asus_acpi ac ipv6 lp joydev
sg snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm
ide_cd cdrom snd_timer snd soundcore snd_page_alloc i2c_i801 parport_pc i2c_core
tg3 parport pcspkr iTCO_wdt shpchp serio_raw ata_piix libata sd_mod scsi_mod
ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 4830, comm: threadexec Not tainted 2.6.19-1.2911.fc6 #1
RIP: 0010:[<ffffffff802bb506>]  [<ffffffff802bb506>] check_dead_utrace+0x6c/0x146
RSP: 0018:ffff8100b6b53cf8  EFLAGS: 00010286
RAX: 000000000000002f RBX: ffff8100d8694040 RCX: ffffffff80575598
RDX: ffffffff80575598 RSI: 0000000000000000 RDI: ffffffff80575580
RBP: 0000000000000000 R08: ffffffff80575598 R09: 0000000000000001
R10: 0000000000000000 R11: ffffffff80277353 R12: ffff8100b6c71500
R13: 0000000000000000 R14: ffff8100d8694040 R15: 0000000000000028
FS:  0000000000000000(0000) GS:ffffffff8064e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000412460 CR3: 00000000b8b79000 CR4: 00000000000006e0
Process threadexec (pid: 4830, threadinfo ffff8100b6b52000, task ffff8100d8694040)
Stack:  ffff8100d8694040 ffff8100b6c71510 ffff8100b6c71500 0000000000000000
 0000000000000000 ffffffff802bb653 ffff8100d93d4840 0000000000000010
 ffff8100b6c71500 ffff8100d8694040 ffff8100d8694168 ffff8100b6b53ef8
Call Trace:
 [<ffffffff802bb653>] remove_detached+0x73/0x86
 [<ffffffff80215854>] do_exit+0x817/0x8f7
 [<ffffffff802475d9>] cpuset_exit+0x0/0x6c
 [<ffffffff8022b508>] get_signal_to_deliver+0x3b6/0x3e5
 [<ffffffff802596b4>] do_notify_resume+0x9c/0x727
 [<ffffffff8025c708>] retint_signal+0x3d/0x85


Code: 0f 0b 68 7e 68 49 80 c2 c1 00 83 bb f4 00 00 00 ff 75 39 b8 
RIP  [<ffffffff802bb506>] check_dead_utrace+0x6c/0x146
 RSP <ffff8100b6b53cf8>
 <1>Fixing recursive fault but reboot is needed!
Comment 2 Kris Van Hees 2007-03-07 16:27:13 UTC
This problem is still occurring on FC6 with the 2.6.19-1.2911.6.5.fc6 kernel.  It
has been confirmed to reoccur on x86.  So far, x86_64 has not shown this problem
again.
Comment 3 Kris Van Hees 2007-03-09 23:03:37 UTC
Confirmed to occur on x86 and x86_64 FC6 with 2.6.19-1.2911.6.5.fc6 kernel.

I am adding code to the test script to skip running the test on FC6 kernels
known to cause a problem, because it causes system hangs, often requiring
power cycling the machine.
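
A skip check of the kind described above could be sketched roughly as follows.
This is only an illustration, not the actual frysk test-script change: the
names KNOWN_BAD_KERNELS and should_skip are hypothetical, the list holds only
the two kernel releases named in this report, and exact matching against the
kernel release string is an assumption.

```python
import platform

# Kernel releases named in this report as triggering the BUG.
# Hypothetical skip list; the real test script may track others.
KNOWN_BAD_KERNELS = frozenset({
    "2.6.19-1.2911.fc6",
    "2.6.19-1.2911.6.5.fc6",
})


def should_skip(release=None):
    """Return True if the given kernel release is on the known-bad list.

    When no release is passed, the running kernel's release string
    (the same value `uname -r` prints) is used.
    """
    if release is None:
        release = platform.release()
    # Assumption: an exact string match is enough to identify a bad kernel.
    return release in KNOWN_BAD_KERNELS


if __name__ == "__main__":
    # Report the decision; a real harness would skip the test instead.
    print("skip" if should_skip() else "run")
```

Matching on the exact release string keeps the check conservative: a kernel
that is not explicitly listed still runs the test, so a fixed kernel is not
skipped by accident.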