This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: Rapidly running systemtap causing hangs or oops
On Wed, Jun 22, 2011 at 04:52:08PM -0700, Josh Stone wrote:
> On 06/22/2011 04:00 PM, Richard W.M. Jones wrote:
> > Me again. I can get something involving systemtap, ext2, the loop
> > device, Linux 3.0 to oops very easily. I'm not quite sure exactly
> > what factor causes it, but here's an easy reproducer:
> >
> > $ mkdir /tmp/mnt
> >
> > $ truncate -s 1G /tmp/fs
> > $ mkfs.ext2 -F /tmp/fs
> >
> > $ cat > /tmp/test.sh
> > #!/bin/sh -
> > echo mount
> > mount -o loop /tmp/fs /tmp/mnt
> > echo unmount
> > umount /tmp/mnt
> >
> > $ chmod +x /tmp/test.sh
> >
> > $ while sudo stap -e 'probe module("ext2").statement ("*@*.c:*") { printf ("%s\n", pp()); }' -c /tmp/test.sh ; do : ; done
> >
> > The final command usually either hangs the machine, or produces a long
> > oops like the one attached, after just a few iterations. It takes
> > just a few seconds on my VM to get a hang or oops.
>
> Can you try running stap with "-D STP_ALIBI"? This alibi mode compiles
> out most of stap's code, so each probe handler is reduced to just an
> atomic increment, then a final hit count is reported on exit.
Adding -D STP_ALIBI to the command line changed the stap output
slightly; I now see lines like this:
module("ext2").statement("ext2_xattr_put_super@fs/ext2/xattr.c:817"), (<input>:1:1), hits: 1, from: module("ext2").statement("*@*.c:*")
However, it did not change the behaviour: the mount process still
crashed quickly, with the oops below:
[ 159.454020] [<ffffffffa00d0a3b>] ext2_fill_super+0x9b5/0xc3b [ext2]
[ 159.454020] [<ffffffff8113a0df>] mount_bdev+0x155/0x1b7
[ 159.454020] [<ffffffffa00d0086>] ? ext2_error+0x112/0x112 [ext2]
[ 159.454020] [<ffffffffa00cedb5>] ext2_mount+0x15/0x17 [ext2]
[ 159.454020] [<ffffffff8113a844>] mount_fs+0x69/0x155
[ 159.454020] [<ffffffff81103f94>] ? __alloc_percpu+0x10/0x12
[ 159.454020] [<ffffffff8114f902>] vfs_kern_mount+0x63/0xa0
[ 159.454020] [<ffffffff811505d6>] do_kern_mount+0x4d/0xdf
[ 159.454020] [<ffffffff81151c6c>] do_mount+0x63c/0x69f
[ 159.454020] [<ffffffff810ffbd9>] ? memdup_user+0x42/0x6a
[ 159.454020] [<ffffffff810ffc3c>] ? strndup_user+0x3b/0x51
[ 159.454020] [<ffffffff81151f50>] sys_mount+0x88/0xc2
[ 159.454020] [<ffffffff814fa142>] system_call_fastpath+0x16/0x1b
> Another test might be to move the loop inside test.sh, so stap is left
> running the whole time, and we might tell if the issue is timed around
> stap's probe registration or unregistration.
I've left this running for 10 minutes with no crash.
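A minimal sketch of what moving the loop inside test.sh might look like, so that a single stap session stays attached across every mount/unmount cycle. The names ITERATIONS, FS_IMAGE, MNT_DIR and run_mount_loop are illustrative assumptions, not taken from the thread; the paths default to the ones in the reproducer.

```shell
#!/bin/sh -
# Sketch only: test.sh with the mount/unmount loop inside the script.
# ITERATIONS, FS_IMAGE, MNT_DIR and run_mount_loop are hypothetical names.
ITERATIONS="${ITERATIONS:-100}"
FS_IMAGE="${FS_IMAGE:-/tmp/fs}"
MNT_DIR="${MNT_DIR:-/tmp/mnt}"

run_mount_loop() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        echo mount
        mount -o loop "$FS_IMAGE" "$MNT_DIR" || return 1
        echo unmount
        umount "$MNT_DIR" || return 1
        i=$((i + 1))
    done
}
```

To use it, add a final `run_mount_loop` line to the script and start stap once with the same `-c /tmp/test.sh` argument as in the reproducer, without the outer `while` loop. Probe registration and unregistration then happen only once per run instead of once per iteration.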
Unfortunately, for the real program I'm writing, I really do need a
way to box stap around each test. The problem I had before was quite
a long delay between my test running and the stap probes firing (or
at least, my seeing the stap output). I need the stap output from one
test to be clearly distinct from the stap output of the next test.
If there were a way to run a test and then tell stap "now flush all
your output" before running the next test, that would be acceptable.
I thought about distinguishing tests by process ID, but ideally my
tests will all run under the same pid.
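The boxing described above could be sketched as a wrapper that starts one stap session per test and sends each session's output to its own file via stap's -o option. This is only an illustration under assumptions: STAP, OUT_DIR, PROBE and run_boxed are hypothetical names, and starting stap once per test is exactly the pattern that triggers the oops in the reproducer, so it does not work around the crash.

```shell
# Sketch only: one stap session per test, one output file per test.
# STAP, OUT_DIR, PROBE and run_boxed are hypothetical names.
STAP="${STAP:-sudo stap}"
OUT_DIR="${OUT_DIR:-/tmp/stap-out}"
PROBE='probe module("ext2").statement ("*@*.c:*") { printf ("%s\n", pp()); }'

# run_boxed NAME COMMAND [ARGS...]
# Runs COMMAND under stap and writes the probe output to $OUT_DIR/NAME.log.
run_boxed() {
    name=$1
    shift
    mkdir -p "$OUT_DIR" || return 1
    # -o sends the script's output to a file; -c runs the command and
    # ends the stap session when that command exits.
    $STAP -e "$PROBE" -o "$OUT_DIR/$name.log" -c "$*"
}
```

For example, `run_boxed mount-test /tmp/test.sh` would leave the probe output for that run in /tmp/stap-out/mount-test.log, separate from any other test's output.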
> > [ 342.037017] [<ffffffff8100b0ce>] show_registers+0xbd/0x206
> > [ 342.037017] [<ffffffff814f6cba>] ? atomic_notifier_call_chain+0x14/0x16
> > [ 342.037017] [<ffffffff814f4941>] __die+0x97/0xd8
> > [ 342.037017] [<ffffffff8100be1c>] die+0x47/0x63
> > [ 342.037017] [<ffffffff81009d79>] do_double_fault+0x65/0x67
> > [ 342.037017] [<ffffffff814fb1aa>] double_fault+0x2a/0x30
> > [ 342.037017] [<ffffffffa00ca6a6>] ? ext2_get_inode+0x6d/0x130 [ext2]
>
> Is the Oops always this minimal? Does it always (questionably) point to
> the same ext2_get_inode location?
Different places, but all within the ext2 mounting / superblock code.
> I'll play with this tomorrow and see if I can reproduce it myself...
Thanks for looking at this.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top