This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Better handling of arguments/literals from the command-line?
Hi Ted,
On Sun, 2008-10-05 at 23:05 -0400, Theodore Tso wrote:
> On Sun, Oct 05, 2008 at 10:47:09AM +0200, Mark Wielaard wrote:
> >
> > It is less than 10 seconds on my machine, but still some seconds. Which
> > does indeed seem somewhat long if we can find the cached files
> > eventually.
>
> Really? I wonder why it takes so long on my system. (An X61s laptop,
> with an 1.6Ghz Intel Core 2 Duo, 2gigs memory, 7200rpm hard drive)
> What sort of system do you have?
Well, less than 10, but still 8 seconds (it felt less yesterday).
Workstation: 3.2Ghz, 2GB, 10.000rpm
> Is there some way the initial
> compilation can be sped up? I considered it rather disturbing that it
> was taking so long; at greater than 10-15 seconds, usability design
> really demands a progress bar, and some indication of when the
> systemtap script is actually executing as opposed to compiling.
stap -v gives you at least the timings of Pass 1-5. So you know where
the time is being spend.
This is almost completely Pass 2 ():
Pass 2: analyzed script: 792 probe(s), 118 function(s), 14 embed(s),
1 global(s) in 7810usr/480sys/8309real ms.
Pass 3 and 4 used cached files after the first run and so don't take
much time.
The analyzing translates the syscall.*[.return] probes into almost 800
individual kernel.function("sys_*")[.return] probes (with -vv you can
see which). And then for all these individual probes it calculates how
to get the arguments from the dwarf info to create the argstr and
returnstr used in the script. I am somewhat surprised this takes so
long, but this seems indeed what it is mainly doing (opreport -l):
samples % image name symbol name
41080 13.1230 stap query_module(Dwfl_Module*, void**, char const*, unsigned long, void*)
29392 9.3893 libdw.so.1 __libdw_find_attr
17950 5.7341 libdw.so.1 __libdw_form_val_len
17344 5.5406 libdw.so.1 lookup
11379 3.6350 libc-2.8.so strlen
10429 3.3316 libstdc++.so.6.0.10 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
9987 3.1904 stap std::tr1::__detail::_Map_base<std::string, std::pair[.truncated.]>::operator[](std::string const&)
8064 2.5761 libc-2.8.so _int_malloc
6993 2.2339 stap dwflpp::iterate_over_functions(int (*)(Dwarf_Die*, void*), void*)
6429 2.0538 libc-2.8.so memcpy
6059 1.9356 libdw.so.1 dwarf_siblingof
5818 1.8586 libstdc++.so.6.0.10 std::string::assign(std::string const&)
One contributing factor here is that I have 1715 kernel modules
installed (standard Fedora it seems) and that for each probe point (800
in this script) query_module() is called for each of them (so that is
1.372.000 calls). Although query_module() does seem to cache its results
for each module, that is still a lot. Of course in this case all the
probes are for sys_* functions which are never in a module, but
systemtap doesn't know that.
It would be nice if we could simply cache the results of Pass 2 also
when the .stp and debuginfo files used haven't changed. Then rerunning a
script would be almost instant.
Cheers,
Mark