This is the mail archive of the mailing list for the systemtap project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: stptracer-20060828 has released.

Masami Hiramatsu wrote:
> Hi,
> I released the latest STPTracer scripts including a logfile merging tool.
> You can download it from this URL:
> This tracer-script covers about a half of events which LKST provides.
> And LKST Log Tools can treat the logfiles recorded by this script.
> This version of STPTracer does not support the flight-recorder feature yet.
> I think we must fix the format of the temporary per-cpu files before that.
> By the way, in this version, I still use gBTI which I proposed before,
> because it is good enough for simple transfer of binary data.
> Best regards.

The first thing I want to figure out about STPTracer is how it
performs compared with LKET. STPTracer uses an interface named gBTI
which could only print fix number integers into a pre-reserved relayfs
buffer, while LKET uses the _stp_printf with binary print support.

The following are the LKET script and STPTracer script I used in my
tests. _lket_trace() is a wrapper around _stp_printf() and
lkst_trace_n() is a wrapper around _stp_binary_write():


function foo()
        _lket_trace(1, 1, "%8b%8b%8b%8b", 111,22,333,444);

probe syscall.getsid {


probe syscall.getsid {
        lkst_trace_n (4, 1, 12, 222, 333, 4444)

Then I use a single-thread application named app_getsid with will call
getsid() in a loop. Here is the testing data:

<1> run app_getsid with being probed:
cpu 0: loops=5000000, average=442 ns

<2> run app_getsid with an stp script with empty probe handlers, i.e.,
       probe syscall.getsid {}
cpu 0: loops=5000000, average=1523 ns

<3> probe using lket_getsid.stp:
cpu 0: loops=5000000, average=3079 ns

<4> probe using lkst_getsid.stp:
cpu 0: loops=5000000, average=2341 ns

Here you can see that gBTI is around 23% faster than _stp_printf(). In
fact a long time ago I used oprofile to find out where _stp_printf()
cost cpu time, and it showed that a lot time is spent on _stp_vsnprintf().

So we use _stp_printf() for its fancy printing format in trade of its
slower speed compared with gBTI.  But the interface like gBTI imposes
too much restriction on trace data format and the number of data items
to be traced. Maybe we should find some places inside _stp_printf()
for further performance improvement while still have the capability to
print data freely.

- Guanglei

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]