Bug 18143 - target_set tapset does not track threads created with clone()
Summary: target_set tapset does not track threads created with clone()
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: tapsets (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-18 20:13 UTC by Max Timchenko
Modified: 2015-03-20 18:01 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
tentative patch (490 bytes, text/plain)
2015-03-18 20:13 UTC, Max Timchenko
Details
Proposed fixes (734 bytes, text/plain)
2015-03-20 13:24 UTC, David Smith
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Max Timchenko 2015-03-18 20:13:27 UTC
Created attachment 8197 [details]
tentative patch

I'm using systemtap to track system calls done by a specific process and its descendants (e.g., sshd). Using ancient systemtap, the script works perfectly. Using 2.6/0.159, no child process activity is picked up. After a little debugging, it seems child processes are not being added to the target set.

Example probes:
probe syscall.execve {
    if (target_set_pid(pid())) {
        printf("t=%d p=%d (%s) execve (%s)\n", getjc(), pid(), execname(), argstr)
    }
}

probe syscall.fork.return {
    if (target_set_pid(pid())) {
        printf("t=%d p=%d (%s) fork %s into %s\n", getjc(), pid(), execname(), name, retstr)
    }
}

Expected output (original pid of sshd is 609, spawning threads 7336 and 7339):
$ stap -x 609 example.stp
t=49957 p=7336 (sshd) execve ("/usr/sbin/sshd" "-D" "-R")
t=50033 p=7339 (sshd) execve ("/bin/bash" )

Actual output:
none

Guess at the cause:
Looking at bug 16667 [1]: "The syscall.fork probe alias was broken up into syscall.fork, syscall.vfork, and syscall.clone aliases." The target_set tapset hooks syscall.fork which, reading comment 2 to the bug, tracked fork, fork, and clone calls. But the change that closed the bug separated the three calls, so now the tapset code tracks fork but not vfork, nor clone.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=16667

A tentative patch is attached.
Comment 1 David Smith 2015-03-19 18:28:01 UTC
Hmm, I'm confused. The following commit (present in stap 2.6) changed target_set.stp to use process.begin and process.end, instead of syscall.fork. So, in theory you shouldn't be seeing the problem you are seeing.

====
commit a7160c2becb70d91da0cf3f7ac08614ee0950f71
Author: Josh Stone <jistone@redhat.com>
Date:   Thu May 16 16:12:02 2013 -0700

    Improve the kernel target_set.stp and add a stapdyn version
    
    * tapset/linux/target_set.stp: Prefer process.begin/end for better
      tracking *process* lifetime. (syscall.exit covers thread exits too!)
    * tapset/dyninst/target_set.stp: Implement stapdyn target_set.
    * testsuite/systemtap.base/target_set_thread.*: Test that thread exits
      don't cause target_set to lose sight of a process.
    * main.cxx (run_sdt_benchmark): Save the timing info for verbose runs.
====

Does your version of stap 2.6 not have this patch?
Comment 2 Josh Stone 2015-03-19 23:48:25 UTC
It only *prefers* those (e.g. process.begin!, ...) but when that fails it still falls on syscall probes.  It may be that Max's kernel doesn't have utrace nor satisfy the requirements for stp_utrace.
Comment 3 David Smith 2015-03-20 13:24:34 UTC
Created attachment 8200 [details]
Proposed fixes

Here's a new version of the tapset (untested) that might fix your problem. It adds in syscall.vfork.return and syscall.clone.return probes.

Can you let me know if it helps?
Comment 4 Max Timchenko 2015-03-20 15:18:49 UTC
Thanks everyone for responses, that was a lot to process. I'll address in order. New theory as to the source of the problem below.

> ...changed target_set.stp to use process.begin...Does your version of stap 2.6 not have this patch?

David: it has the patch. Checked by looking at the tapset. The exact version is (version 2.6/0.159, Debian version 2.6-0.2)

> It may be that Max's kernel doesn't have utrace nor satisfy the requirements for stp_utrace.

Josh: looks like uprobe support is there: I'm running 
"Linux jessie 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01)", 
$ grep UPRO /boot/config-3.16.0-4-amd64 
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_UPROBES=y
CONFIG_UPROBE_EVENT=y

New theory: the probe process.begin is silently skipped when running in target thread mode (-x). This might be the reason the target_set tracking probe fails to run even though the patch is present and uprobe support is there, since I always point stap at a particular root process of interest.

Evidence:

testfile1.stp
---
probe process.begin {
	printf("Process begin, pid=%d\n", pid())
}

probe process.thread.begin {
	printf("Thread begin, pid=%d\n", pid())
}
---
$ stap testfile1.stp
Process begin, pid=1
Process begin, pid=560
Process begin, pid=573
...

$ stap -x 609 testfile1.stp
semantic error: while resolving probe point: identifier 'process' at test.stp:1:7
        source: probe process.begin {
                      ^

semantic error: invalid pid

Pass 2: analysis failed.  [man error::pass2]
Number of similar error messages suppressed: 1.
Rerun with -v to see them.
Tip: /usr/share/doc/systemtap/README.Debian should help you get started.

testfile2.stp
---
probe process.begin! {
	printf("Process begin, pid=%d\n", pid())
}

probe process.thread.begin! {
	printf("Thread begin, pid=%d\n", pid())
}

probe syscall.open {
	printf("Let's add one working probe")
}
---
$ stap -x 609 testfile2.stp
Makefile:10: *** mixed implicit and normal rules: deprecated syntax
Let's add one working probe
Let's add one working probe
...
Comment 5 Max Timchenko 2015-03-20 15:34:08 UTC
David: I tested the proposed fixes version and it works fine on my machine. Thanks!
Comment 6 David Smith 2015-03-20 15:47:06 UTC
I checked in the new version of the target_set tapset as commit 87859a5.
Comment 7 Josh Stone 2015-03-20 17:02:57 UTC
(In reply to Max Timchenko from comment #4)
> Josh: looks like uprobe support is there: I'm running 
> "Linux jessie 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01)", 
> $ grep UPRO /boot/config-3.16.0-4-amd64 
> CONFIG_ARCH_SUPPORTS_UPROBES=y
> CONFIG_UPROBES=y
> CONFIG_UPROBE_EVENT=y

Ok - it's not actually uprobes that matters in this exact case, but anyway your later test shows it working.

> New theory: the probe process.begin is silently skipped when running in
> target thread mode (-x). This might be the reason the target_set tracking
> probe fails to run even though the patch is present and uprobe support is
> there, since I always point stap at a particular root process of interest.

Ugh, I think we need a new bug for this.  I believe you're running into the code that turns "process" into "process(PID)"/"process("FILE") for -x/-c.  This is helpful for things like "process.function" which actually need to parse target information.

But in "process.begin" as used by target_set, we want to see all child processes too, not just the initial target.  AFAICT it is still doing that ok, when I try a test with -x on something spawning children, but for some reason it's validating the pid before it accepts the probe.

> testfile2.stp
> ---
> probe process.begin! {
> 	printf("Process begin, pid=%d\n", pid())
> }
> 
> probe process.thread.begin! {
> 	printf("Thread begin, pid=%d\n", pid())
> }
> 
> probe syscall.open {
> 	printf("Let's add one working probe")
> }
> ---
> $ stap -x 609 testfile2.stp
> Makefile:10: *** mixed implicit and normal rules: deprecated syntax
> Let's add one working probe
> Let's add one working probe
> ...

Strange that "!" lets these continue -- without those you'll still get the "cannot probe pid" error.  It's supposed to mark a probe sufficient, skipping any alternates listed, but it appears to be working like the "?" optional marker too.  I don't think that's intended -- it's only "optional" as long as something else in the chain resolves (or is marked truly "?" optional).
Comment 8 David Smith 2015-03-20 17:42:37 UTC
(In reply to Josh Stone from comment #7)

> Strange that "!" lets these continue -- without those you'll still get the
> "cannot probe pid" error.  It's supposed to mark a probe sufficient,
> skipping any alternates listed, but it appears to be working like the "?"
> optional marker too.  I don't think that's intended -- it's only "optional"
> as long as something else in the chain resolves (or is marked truly "?"
> optional).

The '!' marker means "optional and sufficient", not just "sufficient". Here's some text from the stapprobes man page:

====
       However, a probe point may be followed by a "?" character, to  indicate  
       that it is optional, and that no error should result if it fails to re-  
       solve.  Optionalness passes down through all levels  of  alias/wildcard  
       expansion.  Alternately, a probe point may be followed by a "!" charac-  
       ter, to indicate that it  is  both  optional  and  sufficient.
====
Comment 9 Josh Stone 2015-03-20 18:01:24 UTC
(In reply to David Smith from comment #8)
> The '!' marker means "optional and sufficient", not just "sufficient".

Ok.  It's kind of weird, because with '!' on the final (or sole) probepoint, the "sufficient" part doesn't really mean anything, so it's exactly like '?' in that case.  Oh well.