This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: User-space probes: Plan B+

From: Jim Keniston <jkenisto at us dot ibm dot com>
To: James Dickens <jamesd dot wi at gmail dot com>
Cc: SystemTAP <systemtap at sources dot redhat dot com>
Date: 25 Aug 2006 11:22:51 -0700
Subject: Re: User-space probes: Plan B+
Organization:
References: <1156468404.2848.5.camel@dyn9047018079.beaverton.ibm.com> <cd09bdd10608250111p4601aea2r20bca9d2fbc3df22@mail.gmail.com>

On Fri, 2006-08-25 at 01:11, James Dickens wrote:
> On 24 Aug 2006 18:13:24 -0700, Jim Keniston <jkenisto@us.ibm.com> wrote:
...
> >
> > I tried an approach based on ptrace, with no kernel enhancements, but
> > it lacked certain necessary features (e.g., #2-5 below), probe overhead
> > was 12-15x worse than Prasanna's approach, and I couldn't get it to
> > work when probing multiple processes.  (Frank Eigler independently
> > suggested this approach and termed it "Plan B from outer space.")
> >
> 
> is 12-15x worse than the current solution used in strace?

Slightly worse.  When just counting the occurrences of 1 system call, I
clocked strace at about 10 usec/hit.  See
http://sourceware.org/ml/systemtap/2006-q2/msg00572.html
And some folks reportedly consider strace too slow.
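
For readers who want to reproduce a per-hit figure like this one, a
minimal sketch of the arithmetic follows. It assumes the usual method of
timing the same workload with and without the tracer attached and
dividing the difference by the number of hits; that is not necessarily
how the numbers quoted here were obtained. The function name and the
sample timings are hypothetical.

```python
def per_hit_overhead_usec(traced_seconds: float,
                          untraced_seconds: float,
                          hits: int) -> float:
    """Average extra cost per probe hit, in microseconds.

    traced_seconds   -- wall-clock time of the workload under the tracer
    untraced_seconds -- wall-clock time of the same workload untraced
    hits             -- number of probe (or system call) hits recorded
    """
    if hits <= 0:
        raise ValueError("hits must be positive")
    return (traced_seconds - untraced_seconds) * 1_000_000 / hits


# Hypothetical timings, chosen only to illustrate the arithmetic:
# 100,000 hits that add 1.0 s of run time cost 10 usec each.
print(per_hit_overhead_usec(1.5, 0.5, 100_000))  # prints 10.0
```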

...
> > 1. Instrumentation can be coded entirely as a user-space app...
> 
> sounds like a nightmare waiting to happen, if I want to trace
> something from userland into the kernel and back, I start writing
> userland code, then into kernel code, and quite possibly having kernel
> code access variables and statistics stored in userland, meaning lots
> of checks that the user remembers to call the routines that safely
> move data back and forth between the two?

Well, sure, users could get confused and do things wrong.  And your
scenario below where you migrate a piece of instrumentation from user
space to kernel space would have to be managed carefully, just like any
other design change.

But I think it's better to provide a feature for which a need has been
identified -- even if the feature requires careful use and a few minutes
to understand -- than to withhold the feature to protect people from
failing.  (I consider asm statements in gcc an extreme example of this
philosophy. :-))

> 
> how is this better than just enhancing a debugger such as gdb?

Among other things, gdb -batch is relatively slow (I measured 111 usec
per hit just to count breakpoint hits) and has no facility for
interacting with kernel-space instrumentation.
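
To put the per-hit figures in this message side by side, here is a rough
sketch that scales them to a total. The two constants are the numbers
quoted in this message (about 10 usec/hit for strace, 111 usec/hit for
gdb -batch); the hit count is a hypothetical workload. The two figures
come from different measurements (counting a system call versus counting
breakpoint hits), so treat the comparison as an order-of-magnitude guide
only.

```python
STRACE_USEC_PER_HIT = 10       # "about 10 usec/hit", quoted in this message
GDB_BATCH_USEC_PER_HIT = 111   # "111 usec per hit", quoted in this message


def added_seconds(hits: int, usec_per_hit: float) -> float:
    """Total run time added by tracing, given a per-hit cost."""
    return hits * usec_per_hit / 1_000_000


hits = 1_000_000  # hypothetical number of probe hits in a run
for name, cost in (("strace", STRACE_USEC_PER_HIT),
                   ("gdb -batch", GDB_BATCH_USEC_PER_HIT)):
    print(f"{name}: {added_seconds(hits, cost):.0f} s added")
```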

> how are
> stacks dealt with, since you quite possibly have one process
> investigating another, if you don't get everything perfect the program
> being watched can corrupt the data of the second?

Well, somebody with root privileges could register a handler that
scribbles just about anywhere, as is the case currently with kprobes. 
But there's no reason to expect that there's any danger of the
particular problems you mention.

> 
> >
> > 2. ... but in situations where performance is critical, uprobes can
> > run a named kernel handler without waking up the tracer process.
> >
> now if we start out coding our script to only work in userland, then
> all of a sudden we decide we need better performance, we have to go
> back and recode parts to work in kernel land and quite possibly break
> our algorithms that were talking to kernel land, or probes in the
> kernel that accessed userland data that just moved back into the
> kernel?

See above.

> 
> > 3. A user-mode tracer can invoke a previously registered kernel-mode
> > handler, so we have simple and efficient communication between user-
> > and kernel-mode instrumentation.
> 
> how do you keep a userland program from exploiting systemtap's
> architecture and executing kernel probes from other active systemtap
> scripts, isn't this a huge back door for rootkits especially once
> people start using systemtap's methods for monitoring systems
> continuously?

I've certainly thought about the potential for abuse via
uprobe_run_khandler().  If you had the connivance of somebody with root
privileges who installed a pernicious handler, you could do all sorts of
bad stuff (and make it relatively hard to track).  That's a big if,
though.  If a bad guy has root privileges, you're toast anyway.

And if you're worried about the handler reading/writing the wrong
process's address space, you can specify when you register the handler
that it can apply only to the process in the caller-provided uprobe
object -- and only when the caller has permission to trace that process.
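
The rule in the previous paragraph can be pictured with a toy model.
This is only an illustration of the check as described here, not the
uprobes interface: every name below (Process, KHandler, restricted,
may_trace, may_run) is hypothetical, and may_trace() is a deliberately
simplified stand-in for the kernel's real tracing-permission check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    pid: int
    uid: int


@dataclass(frozen=True)
class KHandler:
    name: str
    restricted: bool  # hypothetical flag chosen at registration time


def may_trace(caller: Process, target: Process) -> bool:
    # Simplified assumption: root, or the same user, may trace the target.
    return caller.uid == 0 or caller.uid == target.uid


def may_run(handler: KHandler, caller: Process,
            probe_target: Process, touched: Process) -> bool:
    """Decide whether the handler may act on `touched`.

    probe_target is the process named in the caller-provided uprobe
    object; touched is the process whose address space the handler
    would read or write.
    """
    if not handler.restricted:
        return True  # unrestricted handlers are not limited by this model
    return touched.pid == probe_target.pid and may_trace(caller, probe_target)


alice = Process(pid=100, uid=1000)
alice_app = Process(pid=101, uid=1000)
bob_app = Process(pid=200, uid=2000)
h = KHandler("count_hits", restricted=True)

print(may_run(h, alice, alice_app, alice_app))  # own process: allowed
print(may_run(h, alice, alice_app, bob_app))    # other address space: refused
print(may_run(h, alice, bob_app, bob_app))      # no permission to trace: refused
```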

...
> >
> > 8. Handlers run in process context -- the tracee's context (see
> > requirement 2) or the tracer's context while the tracee is stopped
> > (see requirement 3).
> >
> 
> stack corruption, or even slight stack placement differences, would
> severely limit the usefulness of the solution,

Well, yes, both we and the user will have to be careful.  That's the
nature of programming.

> it will have the same
> effect as debugging an app in gdb, the app only breaks when the
> userland debugger is not running.

That (minimizing probe overhead) is one of the points of being able to
avoid unnecessary context switches, by just running a handler in the
kernel.  (See requirement #2.)

> 
> 
> James Dickens
> uadmin.blogspot.com

Thanks.
Jim

Follow-Ups:
- Re: User-space probes: Plan B+
  - From: James Dickens

References:
- User-space probes: Plan B+
  - From: Jim Keniston
- Re: User-space probes: Plan B+
  - From: James Dickens
