This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Rapidly running systemtap causing hangs or oops


On Wed, Jun 22, 2011 at 04:52:08PM -0700, Josh Stone wrote:
> On 06/22/2011 04:00 PM, Richard W.M. Jones wrote:
> > Me again.  I can get something involving systemtap, ext2, the loop
> > device, Linux 3.0 to oops very easily.  I'm not quite sure exactly
> > what factor causes it, but here's an easy reproducer:
> > 
> > $ mkdir /tmp/mnt
> > 
> > $ truncate -s 1G /tmp/fs
> > $ mkfs.ext2 -F /tmp/fs
> > 
> > $ cat > /tmp/test.sh 
> > #!/bin/sh -
> > echo mount
> > mount -o loop /tmp/fs /tmp/mnt
> > echo unmount
> > umount /tmp/mnt
> > 
> > $ chmod +x /tmp/test.sh
> > 
> > $ while sudo stap -e 'probe module("ext2").statement ("*@*.c:*") { printf ("%s\n", pp()); }' -c /tmp/test.sh ; do : ; done
> > 
> > The final command usually either hangs the machine, or produces a long
> > oops like the one attached, after just a few iterations.  It takes
> > just a few seconds on my VM to get a hang or oops.
> 
> Can you try running stap with "-D STP_ALIBI"?  This alibi mode compiles
> out most of stap's code, so each probe handler is reduced to just an
> atomic increment, then a final hit count is reported on exit.

Adding -D STP_ALIBI to the command line changed the stap output
slightly; I now see lines that look like this:

module("ext2").statement("ext2_xattr_put_super@fs/ext2/xattr.c:817"), (<input>:1:1), hits: 1, from: module("ext2").statement("*@*.c:*")

However it did not change the behaviour.  The mount process crashed
quickly with the oops below:

[  159.454020]  [<ffffffffa00d0a3b>] ext2_fill_super+0x9b5/0xc3b [ext2]
[  159.454020]  [<ffffffff8113a0df>] mount_bdev+0x155/0x1b7
[  159.454020]  [<ffffffffa00d0086>] ? ext2_error+0x112/0x112 [ext2]
[  159.454020]  [<ffffffffa00cedb5>] ext2_mount+0x15/0x17 [ext2]
[  159.454020]  [<ffffffff8113a844>] mount_fs+0x69/0x155
[  159.454020]  [<ffffffff81103f94>] ? __alloc_percpu+0x10/0x12
[  159.454020]  [<ffffffff8114f902>] vfs_kern_mount+0x63/0xa0
[  159.454020]  [<ffffffff811505d6>] do_kern_mount+0x4d/0xdf
[  159.454020]  [<ffffffff81151c6c>] do_mount+0x63c/0x69f
[  159.454020]  [<ffffffff810ffbd9>] ? memdup_user+0x42/0x6a
[  159.454020]  [<ffffffff810ffc3c>] ? strndup_user+0x3b/0x51
[  159.454020]  [<ffffffff81151f50>] sys_mount+0x88/0xc2
[  159.454020]  [<ffffffff814fa142>] system_call_fastpath+0x16/0x1b

> Another test might be to move the loop inside test.sh, so stap is left
> running the whole time, and we might tell if the issue is timed around
> stap's probe registration or unregistration.

I've left this running for 10 minutes with no crash.
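
A minimal sketch of what such a looping test.sh could look like is
below.  It keeps the mount/umount pair from the reproducer and repeats
it under a single stap session.  The mount_cycle and run functions, the
iteration-count argument, and the FS, MNT and DRY_RUN variables are
illustrative assumptions, not part of the script that was actually run.

```shell
#!/bin/sh -
# Sketch only: the mount/umount loop moved inside test.sh, so that one
# stap session stays attached across every iteration.
# FS, MNT, DRY_RUN, run and mount_cycle are hypothetical names.

FS=${FS:-/tmp/fs}      # image file created with truncate + mkfs.ext2
MNT=${MNT:-/tmp/mnt}   # mount point

# Run a command, or just print it when DRY_RUN is set (no root needed).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount and unmount the loop-backed filesystem $1 times.
mount_cycle() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo mount
        run mount -o loop "$FS" "$MNT" || return 1
        echo unmount
        run umount "$MNT" || return 1
        i=$((i + 1))
    done
}

# Only loop when an iteration count is given, e.g. "/tmp/test.sh 100".
if [ "$#" -gt 0 ]; then
    mount_cycle "$1"
fi
```

With this variant the outer "while sudo stap ... ; do : ; done" loop is
dropped and stap is started once, with something like
-c '/tmp/test.sh 100' in place of -c /tmp/test.sh.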

Unfortunately, for the real program I'm writing, I really do need a way
to box stap around each test.  The problem I had before was quite a
long delay between my test running and the stap probes firing (or at
least, before I saw any stap output).  I need the stap output from one
test to be clearly distinct from the stap output from the next test.
If there were a way to run the test and then tell stap "now flush all
your output" before running the next test, that would be acceptable.

I thought about using the process ID, but ideally my tests will all
run as the same pid.

> > [  342.037017]  [<ffffffff8100b0ce>] show_registers+0xbd/0x206
> > [  342.037017]  [<ffffffff814f6cba>] ? atomic_notifier_call_chain+0x14/0x16
> > [  342.037017]  [<ffffffff814f4941>] __die+0x97/0xd8
> > [  342.037017]  [<ffffffff8100be1c>] die+0x47/0x63
> > [  342.037017]  [<ffffffff81009d79>] do_double_fault+0x65/0x67
> > [  342.037017]  [<ffffffff814fb1aa>] double_fault+0x2a/0x30
> > [  342.037017]  [<ffffffffa00ca6a6>] ? ext2_get_inode+0x6d/0x130 [ext2]
> 
> Is the Oops always this minimal?  Does it always (questionably) point to
> the same ext2_get_inode location?

Different places, but all within the ext2 mounting / superblock code.

> I'll play with this tomorrow and see if I can reproduce it myself...

Thanks for looking at this.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top

