This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Missing appname in lket output
- From: Li Guanglei <guanglei at cn dot ibm dot com>
- To: david_list at boreham dot org
- Cc: SystemTap <systemtap at sources dot redhat dot com>
- Date: Thu, 09 Nov 2006 08:46:08 +0800
- Subject: Re: Missing appname in lket output
- Organization: IBM CSTL
- References: <45526739.8090605@boreham.org>
Hi David,
Thanks for trying. Here are my answers:
David Boreham wrote:
The first is that sometimes the appname is missing.
It's not there in either the text output or the SQL database
after processing by lket_b2a. I am using the latest code
from CVS.
My command is:
stap -bM -e "probe addevent.iosyscall.* {}" -v
After LKET starts, process_snapshot() will be called, but
addevent.process.fork and addevent.process.execve won't be turned on
by default. So for any new process forked/execved after LKET starts
up, you will lose the appname info. This is why you see the appname
listed as NULL.
So you can run:
stap -bMv -e "probe addevent.iosyscall {} probe addevent.process {}"
to capture the newly created processes.
I once thought of turning on fork/execve events by default, but in
the end I left it to the users to decide whether they want to capture
such events. Do you think we should turn on fork/execve capture by
default, since their overhead is negligible?
I also noticed that the event name is no longer written to
the ascii output file. This appears to be deliberate since
there is no code to do it now (but the man page shows
this data in the output so presumably it was at one time).
I don't remember LKET ever writing event names into lket.out; only
the group id and hook id are written. Can you paste a snippet of the
output here?
BTW dumping to SQL is either a really cool idea or
really, really, annoying : I'm not sure which yet ;)
I feel like the developers that will understand enough about
SQL to data mine the dump may not overlap with the
set of developers that are interested in kernel tracing
in the first place. I keep wanting to write a C or
at least Python program to crawl the trace files and
extract the stuff I'm looking for. I'm particularly interested
in the cluster of syscalls related to each file : correlating
the open/read/write/fsync/close calls relating to the same file.
I know that I could write some hideous join statement
in SQL to do the same but my SQL expertise is not
quite at that level yet. e.g. I want to find all system calls
that took > 10ms to execute : the data is there in the table
but ugh, I need to go read a SQL book to figure out the
query. If that information were in the ascii output file
I could write a simple script to grok the records,
compute the delta between the entry and return
timestamps and print out any that exceeded 10ms.
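The simple script described above can be sketched in Python. This is a hypothetical illustration, not actual LKET tooling: it assumes each trace record carries the hookid/usec/pid/cpu/sysname fields of the syscall table discussed later in this thread, with hookid=1 marking syscall entry and hookid=2 the return.

```python
# Sketch: pair syscall entry/return records and report slow calls.
# Hypothetical record layout: (hookid, usec, pid, cpu, sysname),
# where hookid 1 = syscall entry and 2 = syscall return.

THRESHOLD_USEC = 10_000  # 10 ms

def slow_syscalls(records, threshold=THRESHOLD_USEC):
    pending = {}  # (pid, cpu) -> (sysname, entry timestamp in usec)
    slow = []
    for hookid, usec, pid, cpu, sysname in records:
        if hookid == 1:              # entry: remember the timestamp
            pending[(pid, cpu)] = (sysname, usec)
        elif hookid == 2:            # return: compute the delta
            entry = pending.pop((pid, cpu), None)
            if entry and entry[0] == sysname:
                delta = usec - entry[1]
                if delta > threshold:
                    slow.append((sysname, pid, delta))
    return slow

# Made-up trace data in the assumed layout:
trace = [
    (1, 112080, 2942, 6, "rt_sigaction"),
    (2, 112089, 2942, 6, "rt_sigaction"),
    (1, 112097, 2942, 6, "fsync"),
    (2, 135000, 2942, 6, "fsync"),   # 22903 us, exceeds 10 ms
]
for name, pid, delta in slow_syscalls(trace):
    print(f"{name} (pid {pid}) took {delta} us")
```

A real version would of course have to parse the ascii output format instead of taking tuples, but the pairing logic would be the same.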
A couple of reasons for postprocessing in MySQL:
1. faster speed, capable of handling large volumes of data
2. standard interface; SQL is very good at querying
3. can be used as a backend for different user-specific applications,
such as those written in PHP, Java, etc.
4. easier to correlate different events (tables) to dig out useful info
5. avoids reading the whole trace data again and again to find
different information
But I agree with you that it requires SQL knowledge, especially when
you want to optimize the queries. I have thought of writing a
collection of stored procedures to be shipped with lket-b2a. Just like
application libraries such as glibc, which provide the desired
functionality while encapsulating the implementation details, the goal
is that all the user needs to do is pick a stored procedure from the
database and call it, and the stored procedure will return the desired
result. So the user doesn't need to know much about the database table
structure, nor much about relational databases. Here I paste an
example I wrote earlier; a more complicated example would take
arguments to make it more flexible:
This example generates the top 10 most frequently called syscalls.
The table used to store syscall trace data is defined as:
mysql> describe syscall;
+---------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+-------+
| hookid | tinyint(3) unsigned | NO | | NULL | |
| usec | int(10) unsigned | NO | MUL | NULL | |
| pid | int(10) unsigned | NO | | NULL | |
| cpu | tinyint(3) unsigned | NO | | NULL | |
| sysname | varchar(24) | NO | | NULL | |
+---------+---------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
The data stored in syscall table looks like:
mysql> select * from syscall limit 6;
+--------+--------+------+-----+--------------+
| hookid | usec | pid | cpu | sysname |
+--------+--------+------+-----+--------------+
| 1 | 112080 | 2942 | 6 | rt_sigaction |
| 2 | 112089 | 2942 | 6 | rt_sigaction |
| 1 | 112097 | 2942 | 6 | alarm |
| 2 | 112102 | 2942 | 6 | alarm |
| 1 | 112107 | 2942 | 6 | rt_sigaction |
| 2 | 112110 | 2942 | 6 | rt_sigaction |
+--------+--------+------+-----+--------------+
6 rows in set (0.00 sec)
Here hookid=1 means entry into a syscall and hookid=2 means return
from a syscall.
The stored procedure to generate the top 10 most frequently called
syscalls is defined as:
mysql> delimiter //
mysql> create procedure top10_syscall()
-> begin
-> select sysname, count(*) from syscall
-> where hookid=1
-> group by sysname
-> order by count(*) DESC
-> limit 10;
-> end
-> //
Query OK, 0 rows affected (0.01 sec)
Now the user can just call the procedure to get the top 10 syscalls:
mysql> delimiter ;
mysql> call top10_syscall();
+--------------------+----------+
| sysname | count(*) |
+--------------------+----------+
| kill | 2646443 |
| getdents | 1382303 |
| stat | 1160844 |
| fstat | 1139168 |
| close | 1028697 |
| pread64 | 717298 |
| fcntl | 694464 |
| compat_sys_fcntl64 | 694463 |
| pwrite64 | 230360 |
| unlink | 92181 |
+--------------------+----------+
10 rows in set (17.84 sec)
Query OK, 0 rows affected (17.84 sec)
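The parameterized variant mentioned earlier can be sketched as follows. This illustration uses SQLite from Python purely so the example is self-contained; the real trace data lives in MySQL, and the table contents below are made up.

```python
import sqlite3

# Illustrative sketch: a parameterized "top N syscalls" query, run
# against an in-memory SQLite database instead of the real MySQL
# trace tables. Schema mirrors the syscall table shown above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE syscall (
    hookid INTEGER, usec INTEGER, pid INTEGER,
    cpu INTEGER, sysname TEXT)""")
sample = ([(1, 100, 1, 0, "read")] * 3
          + [(1, 200, 1, 0, "write")] * 2
          + [(1, 300, 1, 0, "open")])
conn.executemany("INSERT INTO syscall VALUES (?, ?, ?, ?, ?)", sample)

def top_n_syscalls(conn, n):
    """Return the n most frequently entered syscalls (hookid=1)."""
    return conn.execute(
        """SELECT sysname, COUNT(*) FROM syscall
           WHERE hookid = 1
           GROUP BY sysname
           ORDER BY COUNT(*) DESC
           LIMIT ?""", (n,)).fetchall()

print(top_n_syscalls(conn, 2))   # [('read', 3), ('write', 2)]
```

In MySQL the same thing would be a stored procedure taking an IN parameter for the limit, so the user calls e.g. top_syscalls(10) without writing any SQL themselves.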
Anyway, LKET is cool, so enough complaining.