Bug 3820 - kernel BUG at kernel/utrace.c:201! running make check in frysk-imports
Status: SUSPENDED
Alias: None
Product: frysk
Classification: Unclassified
Component: general
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2007-01-02 09:52 UTC by Mark Wielaard
Modified: 2011-03-16 21:19 UTC
Description Mark Wielaard 2007-01-02 09:52:51 UTC
Running make check in frysk-imports generates the following kernel message on my
FC6 machine (Linux hermans.wildebeest.org 2.6.18-1.2868.fc6 #1 SMP Fri Dec 15
17:32:54 EST 2006 i686 i686 i386 GNU/Linux):

Jan  2 10:43:48 hermans kernel: ------------[ cut here ]------------
Jan  2 10:43:48 hermans kernel: kernel BUG at kernel/utrace.c:201!
Jan  2 10:43:48 hermans kernel: invalid opcode: 0000 [#1]
Jan  2 10:43:48 hermans kernel: SMP
Jan  2 10:43:48 hermans kernel: last sysfs file:
/class/net/sit0/statistics/collisions
Jan  2 10:43:48 hermans kernel: Modules linked in: i915 drm autofs4
cpufreq_ondemand video sbs ibm_acpi i2c_ec dock button battery asus_acpi ac ipv6
parport_pc
lp parport snd_hda_intel snd_hda_codec joydev snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device sg snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer e1000 i2c_i801 serio_raw snd i2c_core ide_cd pcspkr cdrom soundcore
snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod
ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Jan  2 10:43:48 hermans kernel: CPU:    1
Jan  2 10:43:48 hermans kernel: EIP:    0060:[<c045164b>]    Not tainted VLI
Jan  2 10:43:48 hermans kernel: EFLAGS: 00210202   (2.6.18-1.2868.fc6 #1)
Jan  2 10:43:48 hermans kernel: EIP is at check_dead_utrace+0x3b/0xe0
Jan  2 10:43:48 hermans kernel: eax: 00000020   ebx: f733f200   ecx: 00000000
edx: 00000008
Jan  2 10:43:48 hermans kernel: esi: f16d3440   edi: 00000000   ebp: f16d3440
esp: f1a6fe14
Jan  2 10:43:48 hermans kernel: ds: 007b   es: 007b   ss: 0068
Jan  2 10:43:48 hermans kernel: Process threadexec (pid: 2956, ti=f1a6f000
task=f733f200 task.ti=f1a6f000)
Jan  2 10:43:48 hermans kernel: Stack: f16d3448 f16d3448 00000000 f16d3440
c0451741 f1a6fe5c f733f200 f16d3448
Jan  2 10:43:48 hermans kernel:        00000020 00000020 f733f200 c0452817
00000020 00000000 00200246 f1b21878
Jan  2 10:43:48 hermans kernel:        f1b21840 f16d2440 f16d3440 00000010
f16d3440 f733f200 f6c283f0 c0427d19
Jan  2 10:43:48 hermans kernel: Call Trace:
Jan  2 10:43:48 hermans kernel:  [<c0451741>] remove_detached+0x51/0x74
Jan  2 10:43:48 hermans kernel:  [<c0452817>] utrace_report_death+0x1d0/0x205
Jan  2 10:43:48 hermans kernel:  [<c0427d19>] do_exit+0x6fc/0x776
Jan  2 10:43:48 hermans kernel:  [<c0427e09>] sys_exit_group+0x0/0xd
Jan  2 10:43:48 hermans kernel:  [<f1a6ff94>] 0xf1a6ff94
Jan  2 10:43:48 hermans kernel: DWARF2 unwinder stuck at 0xf1a6ff94
Jan  2 10:43:48 hermans kernel: Leftover inexact backtrace:
Jan  2 10:43:48 hermans kernel:  [<c061542d>] do_page_fault+0x0/0x4db
Jan  2 10:43:48 hermans kernel:  [<c0430808>] get_signal_to_deliver+0x38b/0x3b3
Jan  2 10:43:48 hermans kernel:  [<c0430005>] specific_send_sig_info+0x8e/0x99
Jan  2 10:43:48 hermans kernel:  [<c061542d>] do_page_fault+0x0/0x4db
Jan  2 10:43:48 hermans kernel:  [<c0403626>] do_notify_resume+0x7e/0x6c1
Jan  2 10:43:48 hermans kernel:  [<c041c9ca>] force_sig_info_fault+0x24/0x28
Jan  2 10:43:48 hermans kernel:  [<c0614616>] _spin_unlock_irq+0x5/0x7
Jan  2 10:43:48 hermans kernel:  [<c0451945>] utrace_quiescent+0xc2/0x1d5
Jan  2 10:43:48 hermans kernel:  [<c0452ed1>] utrace_report_syscall+0x171/0x190
Jan  2 10:43:48 hermans kernel:  [<c0615900>] do_page_fault+0x4d3/0x4db
Jan  2 10:43:48 hermans kernel:  [<c061542d>] do_page_fault+0x0/0x4db
Jan  2 10:43:48 hermans kernel:  [<c04040a2>] work_notifysig+0x13/0x19
Jan  2 10:43:48 hermans kernel:  =======================
Jan  2 10:43:48 hermans kernel: Code: 7a 04 00 8b 93 a4 04 00 00 0f 45 f8 89 f8
83 e2 08 f7 d0 85 d0 74 69 8b 83 90 00 00 00 85 c0 74 5f 83 3e 00 75 5a 83 f8 10
74 08 <0f> 0b c9 00 70 7f 63 c0 c7 06 01 00 00 00 83 bb 98 00 00 00 ff
Jan  2 10:43:48 hermans kernel: EIP: [<c045164b>] check_dead_utrace+0x3b/0xe0
SS:ESP 0068:f1a6fe14
Jan  2 10:43:48 hermans kernel:  <1>Fixing recursive fault but reboot is needed!
Comment 1 Mark Wielaard 2007-01-03 05:00:38 UTC
This also happens with the 2.6.18-1.2869.fc6 kernel.
Comment 2 Mark Wielaard 2007-01-19 11:40:34 UTC
With the 2.6.19-1.2895.fc6 kernel, the log shows:
kernel: BUGging on (tsk->exit_state != EXIT_ZOMBIE)
and then the machine seems to hang.
Comment 3 Mark Wielaard 2007-01-19 11:57:34 UTC
A second make check run (after reboot) with 2.6.19-1.2895.fc6 gave the following:

Jan 19 12:53:45 hermans kernel: BUGging on (tsk->exit_state != EXIT_ZOMBIE)
Jan 19 12:53:45 hermans kernel: ------------[ cut here ]------------
Jan 19 12:53:45 hermans kernel: kernel BUG at kernel/utrace.c:193!
Jan 19 12:53:45 hermans kernel: invalid opcode: 0000 [#1]
Jan 19 12:53:45 hermans kernel: SMP
Jan 19 12:53:45 hermans kernel: last sysfs file:
/class/net/eth1/statistics/collisions
Jan 19 12:53:45 hermans kernel: Modules linked in: i915 drm autofs4
cpufreq_ondemand video sbs ibm_acpi i2c_ec dock button battery asus_acpi ac ipv6
parport_pc lp parport joydev snd_hda_intel snd_hda_codec snd_seq_dummy sg
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
snd_pcm snd_timer serio_raw ata_piix snd soundcore nsc_ircc e1000 snd_page_alloc
irda crc_ccitt i2c_i801 iTCO_wdt i2c_core ide_cd cdrom pcspkr dm_snapshot
dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd
uhci_hcd
Jan 19 12:53:45 hermans kernel: CPU:    0
Jan 19 12:53:45 hermans kernel: EIP:    0060:[<c0456467>]    Not tainted VLI
Jan 19 12:53:45 hermans kernel: EFLAGS: 00210296   (2.6.19-1.2895.fc6 #1)
Jan 19 12:53:45 hermans kernel: EIP is at check_dead_utrace+0x55/0x110
Jan 19 12:53:45 hermans kernel: eax: 0000002f   ebx: de725670   ecx: c0697ed0
edx: 00200082
Jan 19 12:53:45 hermans kernel: esi: 00000000   edi: d02e53a0   ebp: d02e53a0
esp: dd8dfdf8
Jan 19 12:53:45 hermans kernel: ds: 007b   es: 007b   ss: 0068
Jan 19 12:53:45 hermans kernel: Process threadexec (pid: 20057, ti=dd8df000
task=de725670 task.ti=dd8df000)
Jan 19 12:53:45 hermans kernel: Stack: c063e591 c064c238 d02e53a8 00000000
d02e53a8 d02e53a0 c045657c 00000000
Jan 19 12:53:45 hermans kernel:        de725670 d02e53a8 00000020 de725670
d02e53a0 c04578ee 00000000 00000000
Jan 19 12:53:45 hermans kernel:        dabc7778 00200246 de725b18 dabc7740
e4c313a0 d02e53b0 00000010 d02e53a0
Jan 19 12:53:45 hermans kernel: Call Trace:
Jan 19 12:53:45 hermans kernel:  [<c045657c>] remove_detached+0x5a/0x6d
Jan 19 12:53:45 hermans kernel:  [<c04578ee>] utrace_report_death+0x221/0x229
Jan 19 12:53:45 hermans kernel:  [<c042a65a>] do_exit+0x6e1/0x787
Jan 19 12:53:45 hermans kernel:  [<c042a78d>] sys_exit_group+0x0/0xd
Jan 19 12:53:45 hermans kernel:  [<c0433185>] get_signal_to_deliver+0x38b/0x3b3
Jan 19 12:53:45 hermans kernel:  [<c040365b>] do_notify_resume+0x83/0x6c6
Jan 19 12:53:45 hermans kernel:  [<c04040da>] work_notifysig+0x13/0x19
Jan 19 12:53:45 hermans kernel:  =======================
Jan 19 12:53:45 hermans kernel: Code: f0 f7 d0 83 e2 08 85 d0 0f 84 80 00 00 00
85 c9 74 7c 83 f9 10 74 1c c7 44 24 04 38 c2 64 c0 c7 04 24 91 e5 63 c0 e8 6a 16
fd ff <0f> 0b c1 00 0a c2 64 c0 83 bb 98 00 00 00 ff 75 33 b8 20 00 00
Jan 19 12:53:45 hermans kernel: EIP: [<c0456467>] check_dead_utrace+0x55/0x110
SS:ESP 0068:dd8dfdf8
Jan 19 12:53:45 hermans kernel:  <1>Fixing recursive fault but reboot is needed!
Comment 4 Kris Van Hees 2007-02-01 20:31:52 UTC
Seeing this problem as well.  As far as I can determine right now, the test
frysk2130/strace-clone-exec.sh is causing it.  It is mostly reproducible, though
occasionally a full 'make check' run on my frysk build does not trigger it.
Comment 5 Kris Van Hees 2007-02-01 20:40:09 UTC
Quick follow-up - did a reboot and executed just the strace-clone-exec.sh test
from the command line.  This also triggered the kernel bug, on FC6 with the
2.6.19-1.2895.fc6 kernel.

Comment 6 Kris Van Hees 2007-02-01 21:45:56 UTC
Anyone out there with the latest RHEL5 beta installed who could verify whether
this problem occurs there also?
Comment 7 Kris Van Hees 2007-02-03 02:34:39 UTC
This is actually a kernel bug (as stated in the original bug that prompted this
testcase to be added - #2130).  However, note that the testcase does *not* detect
the kernel bug: it passes happily even though running it often (but not always)
triggers the kernel bug.

So, noteworthy info:
- This is definitely a kernel bug
- The test case passes even if the kernel bug is triggered
- The kernel bug is not 100% reproducible - it sometimes takes multiple attempts
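
Since the test's own result says nothing about the kernel bug and several
attempts may be needed, one way to reproduce it is to rerun the test and look
at the kernel log after each run.  The following is only a minimal sketch under
stated assumptions: the function names are hypothetical, it assumes `dmesg` is
readable by the current user, and it assumes the "kernel BUG at" line format
shown in the logs above.  It cannot help when the machine hangs outright.

```python
import re
import subprocess

# Matches markers such as "kernel BUG at kernel/utrace.c:201!" (format taken
# from the logs in this report).
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def find_kernel_bugs(log_text):
    """Return a (file, line) pair for every 'kernel BUG at' marker in log_text."""
    return [(m.group("file"), int(m.group("line")))
            for m in BUG_RE.finditer(log_text)]


def read_kernel_log():
    """Return the kernel ring buffer as text (assumption: dmesg is readable)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout


def run_until_bug(test_cmd, max_attempts=10):
    """Run test_cmd up to max_attempts times.

    Returns the attempt number after which a new BUG marker showed up in the
    kernel log, or None if none appeared.  The exit status of the test is
    ignored on purpose, because the test passes even when the bug fires.
    """
    seen = len(find_kernel_bugs(read_kernel_log()))
    for attempt in range(1, max_attempts + 1):
        subprocess.run(test_cmd, check=False)
        if len(find_kernel_bugs(read_kernel_log())) > seen:
            return attempt
    return None
```

A call such as `run_until_bug(["sh", "frysk2130/strace-clone-exec.sh"])` would
drive the test named in comment 4; the path is assumed to be relative to the
build directory.  Comparing marker counts is a simplification, since the ring
buffer can wrap and drop older entries.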
Comment 8 Kris Van Hees 2007-02-08 22:27:51 UTC

*** This bug has been marked as a duplicate of 2130 ***
Comment 9 Kris Van Hees 2007-02-09 01:09:38 UTC
Reopening after talking to Mark.  We believe this is not actually the same
problem as 2130, but rather a similar case (occurring in a slightly different place).
Comment 10 Kris Van Hees 2007-02-09 01:21:32 UTC
Opened a bug against the FC6 kernel:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227952

Suspending this ticket to track progress of the FC bug.