Bug 14348

Summary:	Systemtap systemtap.base/buildid.exp hangs on F17 machines
Product:	systemtap	Reporter:	William Cohen <wcohen>
Component:	runtime	Assignee:	Unassigned <systemtap>
Status:	RESOLVED INVALID
Severity:	critical	CC:	dsmith, fche
Priority:	P2
Version:	unspecified
Target Milestone:	---
See Also:	https://bugzilla.redhat.com/show_bug.cgi?id=843640
Host:		Target:
Build:		Last reconfirmed:
Attachments:	sysrq-t output when machine is hung with systemtap-1.8-4 rpm sysrq-l output for buildid.exp hang on fedora 17 soft lockup info from console patching out test that hangs allows buildid.exp to complete Pared down buildid2.exp that replicates the problem excerpt from /var/log/messages from a debug kernel An attempt to replicate the problem using perf Replicator for uprobes issue

Description William Cohen 2012-07-10 21:23:25 UTC

When trying out the Fedora 17 systemtap-1.8 rpms on x86_64 machine the systemtap.base/buildid.exp test would hang and the cpu use would be pegged.  This has been observed both on a physical f17 machine and virtual image.

kernel-3.4.4-5.fc17.x86_64
systemtap-1.8-1.fc17.x86_64
systemtap-testsuite-1.8-1.fc17.x86_64

To reproduce.

As root:

cd /usr/share/systemtap/testsuite/
make installcheck RUNTESTFLAGS="--debug buildid.exp"

The test will start running but never finish. Will also see the cpu load be pegged on the machine. 

Note that this problem has not been observed in the upstream git verions of systemtap. It is suspected that that the following patch is fixing the problem

http://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commit;h=50bd2c22a62981ae8b7b379d48574671228f1446

Comment 1 William Cohen 2012-07-17 16:14:07 UTC

Created attachment 6533 [details]
sysrq-t output when machine is hung with systemtap-1.8-4 rpm

This is the sysrq-t output of what is running on the system when the machine gets hung with:

make installcheck RUNTESTFLAGS="systemtap.base/buildid.exp"

It looks like buildid.x pid 1645 is getting stuck running.

Comment 2 William Cohen 2012-07-17 16:22:53 UTC

Created attachment 6534 [details]
sysrq-l output for buildid.exp hang on fedora 17

Backtrace of the problem process.

Comment 3 Frank Ch. Eigler 2012-07-17 19:18:24 UTC

From that last attachment:

[  283.666010]  <EOI> 
[  283.666010]  [<ffffffff812c6b48>] ? __rb_rotate_right+0x8/0x90
[  283.666010]  [<ffffffff812c6ce0>] ? rb_insert_color+0x110/0x150
[  283.666010]  [<ffffffff8114b140>] __vma_link_rb+0x30/0x40
[  283.666010]  [<ffffffff8114b1c8>] vma_link+0x78/0xe0
[  283.666010]  [<ffffffff8114ce2d>] mmap_region+0x56d/0x630
[  283.666010]  [<ffffffff8114d23d>] do_mmap_pgoff+0x34d/0x3a0
[  283.666010]  [<ffffffff8114d4ed>] do_mmap+0x2d/0x40
[  283.666010]  [<ffffffff815ed6aa>] elf_map+0x119/0x14a
[  283.666010]  [<ffffffff811d04a9>] load_elf_binary+0x5e9/0x18f0
[  283.666010]  [<ffffffff81186eb5>] ? copy_strings.isra.24+0x2b5/0x300
[  283.666010]  [<ffffffff811cfec0>] ? notesize.isra.6+0x30/0x30
[  283.666010]  [<ffffffff811871f7>] search_binary_handler+0x117/0x320
[  283.666010]  [<ffffffff81188734>] do_execve_common.isra.28+0x434/0x500
[  283.666010]  [<ffffffff8118881b>] do_execve+0x1b/0x20
[  283.666010]  [<ffffffff8101c3c7>] sys_execve+0x47/0x70
[  283.666010]  [<ffffffff815fcdec>] stub_execve+0x6c/0xc0

This looks very familiar; it resembles the kinds of failures I was wrestling with just before releasing 1.8, when testing linux-next kernels.  The apparent hang appeared to be corruption of the rbtree data structure, to give it loops where there ought be none.

Comment 4 William Cohen 2012-07-19 14:11:58 UTC

The mainline linux kernel (3.5.0-rc7+) seems to suffer from the same problem.

Comment 5 William Cohen 2012-07-23 19:11:49 UTC

Created attachment 6549 [details]
soft lockup info from console

On linux-3.5.0+ kernel see the same problem.  This is the output from the console showing a soft lockup which looks similar to the previous backtraces.

Comment 6 William Cohen 2012-07-23 21:46:29 UTC

Created attachment 6550 [details]
patching out test that hangs allows buildid.exp to complete

This patch comments out a one of the buildid.exp tests. When this one test is commented out the buildid.exp runs and completes.

Comment 7 William Cohen 2012-07-24 14:49:52 UTC

Created attachment 6553 [details]
Pared down buildid2.exp that replicates the problem

This example placed in the /usr/share/systemtap/testsuite/systemtap.base consistently triggers the hange on fedora 17 with:

 make installcheck RUNTESTFLAGS="--debug --verbose systemtap.base/buildid2.exp"


Note that it in buildid.exp a similar stap test that passes follows the problem test:

spawn stap -m bid -p4 -DSTP_NO_BUILDID_CHECK $bid_stp $bid_exepath
catch {close}; catch {wait}
spawn staprun [pwd]/bid.ko -c $bid_exepath

Comment 8 David Smith 2012-07-24 21:48:43 UTC

Created attachment 6554 [details]
excerpt from /var/log/messages from a debug kernel

With the debug kernel (3.4.6-2.fc17.x86_64.debug), I'm seeing a gpf that might be a bit earlier in the timeline.  Here's the call trace:

====
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023] Call Trace:
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff812c8beb>] ? file_map_prot_check+0x7b/0xf0
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff81182a3d>] do_mmap_pgoff+0x35d/0x3b0
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff81182cfd>] do_mmap+0x2d/0x40
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff816a5099>] elf_map+0x11c/0x150
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff812183a9>] load_elf_binary+0x5e9/0x1900
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff810a4825>] ? sched_clock_local+0x25/0xa0
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff81217dc0>] ? padzero+0x40/0x40
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff81217dc0>] ? padzero+0x40/0x40
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811c7a44>] search_binary_handler+0x194/0x4f0
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811c7911>] ? search_binary_handler+0x61/0x4f0
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811c885b>] do_execve_common.isra.30+0x4eb/0x590
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811c847d>] ? do_execve_common.isra.30+0x10d/0x590
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811cbedb>] ? getname_flags+0x3b/0x250
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff811c891b>] do_execve+0x1b/0x20
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff81023377>] sys_execve+0x47/0x70
Jul 24 16:06:55 kvm-f17-64-1 kernel: [ 1245.120023]  [<ffffffff816b792c>] stub_execve+0x6c/0xc0
====

Comment 9 David Smith 2012-07-26 16:22:19 UTC

Here's some more info on this bug.

The buildid2.exp testcase modifies the buildid.x executable.  Here's the original executable:

===
# eu-readelf -n ./buildid.x
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x21c:
  Owner          Data size  Type
  GNU                   16  VERSION
    OS: Linux, ABI: 2.6.32

Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x23c:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: c62eb8e35b2b8729e42291c813235bc94a20ab68
===

After the testcase uses objcopy to modify it:

===
# eu-readelf -n ./buildid.x

Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x21c:
  Owner          Data size  Type
  GNU                   16  VERSION
    OS: Linux, ABI: 2.6.32

Note section [ 3] '.note.gnu.build-id' of 73 bytes at offset 0x23c:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: f9e1c2e8b7ddf2ee5b73eb58f03654be99c945c2
  GNU                   20  GNU_BUILD_ID
    Build ID: 57eec9e035f81d85c685c88f31d10f31f6768b8c
eu-readelf: cannot get content of note section: offset out of range
===

So, that's a bit scary - the fact that the offset is out of range and the fact that the .note.gnu.build-id section has doubled in size.

When stap is run on the original executable, we get:

===
# stap -p2 ../../src/testsuite/systemtap.base/buildid.stp ./buildid.x
# functions
_dwarf_tvar_get_x_0:long ()
# probes
process("/es/scratch/dsmith/systemtap/f17-64/testsuite/1/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5") /* pc=.absolute+0x547 */ /* <- process("/es/scratch/dsmith/systemtap/f17-64/testsuite/1/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5").label("label") = process("/es/scratch/dsmith/systemtap/f17-64/testsuite/1/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5") <- process("./buildid.x").function("foo").label("label") */
===

So, the probe is at 0x547.

When run on the modified executable, we get:

===
stap -p2 ../../src/testsuite/systemtap.base/buildid.stp ./buildid.x
# functions
_dwarf_tvar_get_x_0:long ()
# probes
process("/es/scratch/dsmith/systemtap/f17-64/testsuite/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5") /* pc=.absolute+0x567 */ /* <- process("/es/scratch/dsmith/systemtap/f17-64/testsuite/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5").label("label") = process("/es/scratch/dsmith/systemtap/f17-64/testsuite/buildid.x").function("foo@../../src/testsuite/systemtap.base/buildid.c:5") <- process("./buildid.x").function("foo").label("label") */
===

On the new executable, the probe should be at 0x567, 0x20 bytes past the original address.

The hang here might be related to how inode-uprobes work.  With the old utrace-based uprobes, when we find the target process we check the buildid, then install the probe(s).  With the new inode-based uprobes, we actually install the probe(s) at module startup, then when we find the target process we check the buildid (if I'm understanding the code correctly).

I'm wondering if the bad address we're installing the probe at is causing this hang (possibly the bad probe location isn't on an instruction boundary for instance).

Comment 10 William Cohen 2012-07-26 17:18:23 UTC

Created attachment 6558 [details]
An attempt to replicate the problem using perf

This script is an attempt to replicate the problem using perf rather than systemtap.  However, perf doesn't cause the machine to hang.  It gets one sample for the first "perf record", but it doesn't get any samples for the second "perf record ..."

Comment 11 William Cohen 2012-07-26 20:01:27 UTC

Created attachment 6559 [details]
Replicator for uprobes issue

This tarball contains everything to replicate the problem on fedor17 on x86_64. The addresses might be different on i686, so this might not fail (or work) in the same manner as x86_64.

Unpack the tarball

tar xvfz pr14348.tar.gz; cd pr14348

Run the pr14348_perf.sh as root. This demonstrates the problem using the raw perf infrastructure.

Running the pr14348.sh as root demonstrates the problem using systemtap-1.8-4.

Comment 12 Frank Ch. Eigler 2012-07-30 15:15:05 UTC

Found to be kernel bug, corrected by https://lkml.org/lkml/2012/7/28/59

Comment 13 David Smith 2012-08-02 15:59:08 UTC

The systemtap side of this problem, bug #14409 (Delay inode-uprobes registration until after buildid checks) has been fixed.