This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: stptracer-20060828 has released.

From: Li Guanglei <guanglei at cn dot ibm dot com>
To: "Frank Ch. Eigler" <fche at redhat dot com>
Cc: systemtap at sources dot redhat dot com
Date: Tue, 05 Sep 2006 10:30:12 +0800
Subject: Re: stptracer-20060828 has released.
Organization: IBM CSTL
References: <44F58A51.6060806@hitachi.com> <44F69E3C.10308@cn.ibm.com> <y0mr6ywddx7.fsf@ton.toronto.redhat.com> <44F7A3EE.1060101@hitachi.com> <44F7AD39.1010707@cn.ibm.com> <20060901203741.GG23264@redhat.com>

Frank Ch. Eigler wrote:

Hi -

On Fri, Sep 01, 2006 at 11:47:05AM +0800, Li Guanglei wrote:

[...]  But I have some questions about compiled approach. stap will
treat all integer data as 64-bit, but most binary trace integer data
items need only 1 or 2 bytes. [...] And will it use a fix length for
all string items? Some string trace data items only have a few
chars.


The translator could apply obvious policies to this question.  If the
formatting string includes "%1b", it would write just the low-order
byte of the integer.  If the formatting string includes "%64s", this
would result in a fixed-width substring field.  For plain "%s", it
would be a dynamic-width field, which alone would not be ideal for
binary format streams.

The idea here is to make this compiled tracing a pure optimization:
not to change the script input nor data output, but just to produce it
quicker.

Thanks. I see.

[...] But in some situations I will put the print/trace statement in
embedded c codes to avoid the calls to function__dwarf_tvar_get*.


If you don't call those tvar_get functions, how are you safely
extracting target data?

One example is the struct scsi_cmnd in scsi trace hooks in LKET. I
need to retrieve 10 arguments from this struct. In the embedded c
codes they are only 10 assignments while in stap script they will be
extended to 10 function calls.


In exchange for those function calls, you're getting safe
dereferencing of all those values.  How much slowdown is
that worth?

I run a series of tests to figure it out. I use the same single-thread application named app_getsid with will call getsid() in a loop, the results are:

<1> without probe

cpu 0: loops=5000000, average=392 ns

<2> probe using LKET(2.stp): log 4 integers

cpu 0: loops=5000000, average=2280 ns

<3> probe using LKET(3.stp): log 4 integers + visit 5 target variables

cpu 0: loops=5000000, average=2445 ns

<4> probe using LKST(4.stp): log 4 integers

cpu 0: loops=5000000, average=1710 ns

<5> probe using LKST(5.stp): log 4 integers + visit 5 target variables

cpu 0: loops=5000000, average=1859 ns

The testing shows that visiting 5 target variables will cost ~150ns, and LKST is ~570ns faster than LKET if only log 4 integers. I think the compiled approach should have almost the same speed as <5>, so it will roughly be at least ~420ns faster than LKET.

But it's not common that the trace handler will visit a number of data members of a single struct. In most cases it will just visit different target variables and in such case LKET will also need the same number of tvar_get functions as the compiled approach does in script level.

Most target variables can be visited safely unless there is a bug in Kernel. Those target variables that could potentially have invalid values need explicitly checking before use them, and LKET needs a piece of c codes to check them. But we can't make sure every one doesn't miss some checking codes. So considering the benefit of tvar_get() and its cost, I think it is worth to use tvar_get().

The following is the stp scripts.

2.stp:

  function foo()
  %{
     _lket_trace(1, 1, "%8b%8b%8b%8b", 111,22,333,444);
  %}

  probe syscall.getsid {
    foo()
  }

3.stp:

 function foo(a:long, b:long, c:long, d:long, e:long)
  %{
      _lket_trace(1, 1, "%8b%8b%8b%8b", 111,22,333,444);
  %}

  probe syscall.getsid {
     foo($pid, $pid, $pid, $pid, $pid)
  }

4.stp:

  probe syscall.getsid {
     lkst_trace_n (4, 1, 12, 222, 333, 4444)
  }

5.stp:

  function foo:long(a:long, b:long, c:long, d:long, e:long)
  %{
     THIS->__retvalue = 0;
  %}

  probe syscall.getsid {
     foo($pid, $pid, $pid, $pid, $pid)
     lkst_trace_n (4, 1, 12, 222, 333, 4444)
  }

- Guanglei

Follow-Ups:
- Re: stptracer-20060828 has released.
  - From: Li Guanglei

References:
- stptracer-20060828 has released.
  - From: Masami Hiramatsu
- Re: stptracer-20060828 has released.
  - From: Li Guanglei
- Re: stptracer-20060828 has released.
  - From: Frank Ch. Eigler
- Re: stptracer-20060828 has released.
  - From: Masami Hiramatsu
- Re: stptracer-20060828 has released.
  - From: Li Guanglei
- Re: stptracer-20060828 has released.
  - From: Frank Ch. Eigler

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]