Josh Stone [Tue, 27 Oct 2009 19:15:29 +0000 (12:15 -0700)]
PR10854: Use a mutex around transport startup/shutdown
We had a race where the probe setup could be called during/after the
probe shutdown in abnormal circumstances, which leads to kernel
callbacks still registered after module unload. (BOOM)
Now the setup/shutdown activities and related flags are guarded by a
mutex, so we should have strict ordering.
* runtime/transport/transport.c (_stp_transport_mutex): New.
(_stp_handle_start): Grab the mutex, and make sure we're not exiting.
(_stp_cleanup_and_exit): Grab the mutex.
(_stp_lock_inode, _stp_unlock_inode): Use kernel version for checking
inode locking type.
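The ordering guarantee described above can be sketched as follows. This is a minimal Python model, not the kernel C in runtime/transport/transport.c: the Transport class, its fields, and the return values are hypothetical, and threading.Lock only stands in for _stp_transport_mutex.

```python
import threading


class Transport:
    """Toy model of a transport whose start and shutdown must not interleave."""

    def __init__(self):
        self._mutex = threading.Lock()  # stands in for _stp_transport_mutex
        self.exiting = False
        self.probes_registered = False

    def handle_start(self):
        # Hold the lock for the whole setup, and refuse to start once
        # shutdown has begun.
        with self._mutex:
            if self.exiting:
                return False
            self.probes_registered = True
            return True

    def cleanup_and_exit(self):
        # The same lock orders shutdown strictly before or after any setup.
        with self._mutex:
            self.exiting = True
            self.probes_registered = False


transport = Transport()
transport.cleanup_and_exit()
late_start = transport.handle_start()  # a start request after shutdown is rejected
```

Because both paths take the same lock and the start path rechecks the exiting flag under it, a late start can no longer leave callbacks registered after shutdown.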
Tim Moore [Tue, 27 Oct 2009 11:50:25 +0000 (12:50 +0100)]
Kill off child processes correctly on exit.
* grapher/grapher.cxx (ChildDeathReader::reap): New function.
(StapLauncher): Keep a list of instantiated parsers.
(StapLauncher::cleanup): Kill off all launched stap processes.
Tim Moore [Wed, 21 Oct 2009 15:05:59 +0000 (17:05 +0200)]
More refactoring for multiple stap processes.
* grapher/StapParser.hxx (StapParser): Change _win and _widget from
references to pointers.
* grapher/StapParser.cxx (ioCallback): Ditto.
* grapher/grapher.cxx (StapLauncher, GraphicalStapLauncher): Rewrite
to make GraphicalStapLauncher a derived class of StapLauncher.
(main): Accept graphing data from stdin with a "-" argument.
Tim Moore [Tue, 20 Oct 2009 20:14:00 +0000 (22:14 +0200)]
Change stap parser to use an input file descriptor other than stdin
* grapher/StapParser.hxx (_inFd, getInFd, setInFd): New member and functions.
* grapher/StapParser.cxx (ioCallback): Use _inFd variable
instead of stdin.
* grapher/grapher.cxx (StapLauncher::launch): Don't read input from stap on
stdin; use the read end of the pipe.
Josh Stone [Thu, 22 Oct 2009 21:37:05 +0000 (14:37 -0700)]
Enable Kbuild-like quiet builds
This enables much cleaner build output from automake. To re-enable the
verbose commands, pass --disable-silent-rules to configure, or use V=1
at make time.
* configure.ac: Enable AM_SILENT_RULES by default.
Josh Stone [Thu, 22 Oct 2009 02:27:17 +0000 (19:27 -0700)]
Correct the safety-net escape WRT locking
Within a probe body, the "out" label starts the normal exit path,
including unlocking whatever globals are used in that probe. Since the
unprivileged safety-net checks are before the locks are ever grabbed, we
should bypass the unlock on the way out.
* elaborate.cxx (derived_probe::emit_process_owner_assertion): Use
"return" instead of "goto out".
Josh Stone [Wed, 21 Oct 2009 23:15:58 +0000 (16:15 -0700)]
Refactor probe locking into shared functions
For scripts with thousands of probes, we save a fair amount of code-gen
time in pass-4 by having the common locking code extracted into shared
functions.
* runtime/probe_lock.h (stp_lock_probe, stp_unlock_probe): New.
* translate.cxx (c_unparser::emit_lock_decls): New, emits a static
const array of locks needed for each probe.
(c_unparser::emit_locks): Just call stp_lock_probe.
(c_unparser::emit_unlocks): Just call stp_unlock_probe.
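The idea of a per-probe lock table consumed by shared lock and unlock functions can be sketched like this. It is an illustration in Python under stated assumptions: the table layout, the names LOCKS, PROBE_LOCKS, lock_probe and unlock_probe, and the timeout are all hypothetical and do not reproduce runtime/probe_lock.h.

```python
import threading

# Hypothetical script globals, each protected by its own lock.
LOCKS = {"count": threading.Lock(), "names": threading.Lock()}

# Per-probe constant tables naming the locks each probe needs (assumed layout).
PROBE_LOCKS = {
    "probe_0": ("count",),
    "probe_1": ("count", "names"),
}


def unlock_probe(held):
    """Release locks in reverse order of acquisition."""
    for name in reversed(held):
        LOCKS[name].release()


def lock_probe(needed):
    """Acquire every lock in table order; on failure release those already held."""
    held = []
    for name in needed:
        if not LOCKS[name].acquire(timeout=0.1):
            unlock_probe(held)
            return False
        held.append(name)
    return True
```

Each probe then only needs a small data table plus two calls, rather than its own copy of the locking loop, which is where the pass-4 saving would come from.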
Josh Stone [Sat, 10 Oct 2009 00:32:26 +0000 (17:32 -0700)]
PR10750: Enforce a reasonable limit on # of varargs
If we leave the number of args unbounded, then an excessively-sized
printf could cause a kernel stack overflow. I've arbitrarily chosen 32
as our new maximum.
* translate.cxx (c_unparser::visit_print_format): Throw if >32 args.
* testsuite/transko/varargs.stp: Assert that 33 args aren't allowed.
* testsuite/transok/varargs.stp: Assert that 32 args are ok.
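A minimal sketch of the kind of check described, in Python rather than the translator's C++. The limit of 32 comes from the commit message; the function name, exception type, and message text are illustrative assumptions.

```python
MAX_PRINT_ARGS = 32  # the limit chosen in the commit message


def check_print_args(args):
    """Raise if a print-style call passes more arguments than allowed."""
    if len(args) > MAX_PRINT_ARGS:
        # Illustrative message only; the real translator wording may differ.
        raise ValueError(
            f"too many arguments to print ({len(args)} > {MAX_PRINT_ARGS})"
        )
    return len(args)
```

This mirrors the two test cases: 32 arguments pass, 33 are rejected at translation time instead of risking a kernel stack overflow at run time.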
Stan Cox [Tue, 20 Oct 2009 17:42:01 +0000 (13:42 -0400)]
Added testsuite to test xulrunner sdt markers.
xulrunner.exp: New testsuite, modelled after mysql.exp.
mysql.exp (stap-mysql.sh): Use installed stap.
postgres.exp (stap-mysql.sh): Use installed stap.
tcl.exp (stap-mysql.sh): Use installed stap.
Mark Wielaard [Tue, 20 Oct 2009 11:55:15 +0000 (13:55 +0200)]
Be paranoid about table size resolving cie_for_fde and fde_pointer_type.
* runtime/unwind.c (cie_for_fde): Take table and table_len into account.
(fde_pointer_type): Likewise.
* runtime/unwind/unwind.h: Adjust function prototypes.
Frank Ch. Eigler [Mon, 19 Oct 2009 15:33:24 +0000 (11:33 -0400)]
PR10799: warn on possibly unintended local-vs-global namespace collision
* elaborate.cxx (find_var): Take extra token parameter.
Look for cross-file global variable resolution, signal
a warning.
* testsuite/systemtap.examples/io/traceio2.stp: Fix it.
* testsuite/systemtap.syscall/sys.stp: Fix it.
* NEWS: Document it.
uprobe_fork_uproc() runs with parent_uproc->rwsem locked.
However, uprobe_mk_process(), which is called within uprobe_fork_uproc(),
also locks child_uproc->rwsem after initializing it.
Lockdep mistakes this for acquiring a lock that is already held and
suggests using subclasses.
The alternatives we have are:
1. Use class levels to distinguish different uproc structures.
2. Unlock parent_uproc->rwsem before we call uprobe_fork_uproc().
3. Don't try locking child_uproc->rwsem, since we are protected by
uproc_mutex as well as parent_uproc->rwsem.
Mark Wielaard [Thu, 15 Oct 2009 19:55:16 +0000 (21:55 +0200)]
Fix transok/tval-opt.stp testcase. Pick different function and non-empty block.
This testcase succeeded just because the value being set couldn't be
found. So the error message being compared was the same. Set -o pipefail
to catch that case. On vta compiled kernels it failed because the optimizer
turned { statement } into statement. So pick a function and argument whose
location can always be found and add an extra 'next' statement so the
block isn't folded.
* testsuite/transok/tval-opt.stp: Set -o pipefail. Add 'next' to make
sure block isn't empty. Use "do_filp_open" and "mode".
Josh Stone [Wed, 14 Oct 2009 23:12:49 +0000 (16:12 -0700)]
Fix $$targets in dwarf probes
My print_format refactoring in d5e178c1 missed an improperly-named
token, an sprint that should be sprintf. Since the token value is now
significant, that name needs to be correct.
Frank Ch. Eigler [Wed, 14 Oct 2009 21:02:47 +0000 (17:02 -0400)]
PR10331: improve nss error message handling
* stapsslerr.h: New file containing NSS* error number to string mappings.
Originally from mozilla NSS documentation, also seen in other GPLv2
software.
* nsscommon.c (nssError): Print error number, and text from <stapsslerr.h>.
* stap-{client,server}-connect.c (errWarn): Standardize on nssError().
* Makefile.am (nss binaries): Also link in nsscommon.c.
Tim Moore [Wed, 14 Oct 2009 15:46:31 +0000 (17:46 +0200)]
cleanup of graph data parser, using Boost functions where useful
* grapher/StapParser.cxx (commaSplit): Use Boost string split function
(findTaggedValue): Return bool instead of position
(ioCallback): Avoid using hard-coded string lengths
Josh Stone [Tue, 13 Oct 2009 23:57:38 +0000 (16:57 -0700)]
Consolidate print_format creation
We almost had a factory in print_format::parse_print, so let's take that
the rest of the way. This way we don't have so much duplication in
initializing the print flags.
* staptree.cxx (print_format::parse_print): Replaced with...
(print_format::create): New factory to parse and create print_formats.
* elaborate.cxx (add_global_var_display): Use this factory.
* parse.cxx (parser::parse_symbol): Ditto.
* tapset-mark.cxx
(mark_var_expanding_visitor::visit_target_symbol_context): Ditto.
* tapset-utrace.cxx
(utrace_var_expanding_visitor::visit_target_symbol_arg): Ditto.
* tapsets.cxx
(dwarf_var_expanding_visitor::visit_target_symbol_context): Ditto.
(tracepoint_var_expanding_visitor::visit_target_symbol_context): Ditto.
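As a rough illustration of a name-driven factory, the sketch below derives the print flags from the function name in one place. It is Python, not the staptree.cxx code: the PrintFlags fields and the accepted name pattern are assumptions based on the SystemTap print family (print, println, printd, printdln, printf and their s-prefixed forms).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintFlags:
    to_string: bool     # sprint* variants return a string
    with_format: bool   # printf/sprintf take a format string
    with_delim: bool    # printd* variants take a delimiter
    with_newline: bool  # *ln variants append a newline


# Optional "s" prefix, then "print", then either "f" or an optional "d"
# followed by an optional "ln".
_NAME = re.compile(r"^(s?)print(f|d?(?:ln)?)$")


def create(name):
    """Return the flags for a print-family name, or None if it is not one."""
    match = _NAME.match(name)
    if match is None:
        return None
    suffix = match.group(2)
    return PrintFlags(
        to_string=(match.group(1) == "s"),
        with_format=(suffix == "f"),
        with_delim=suffix.startswith("d"),
        with_newline=suffix.endswith("ln"),
    )
```

With the flags derived from the name, passing "sprint" where "sprintf" is meant yields different flags, so the name given to the factory has to be exact.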
* transport/control.c (*_cmd): Return -Ecodes rather than "-1" from
file_operations callbacks.
* staprun/ctl.c (init_ctl_channel): Return distinct error codes.
* staprun/staprun.c (remove_module): Skip connection attempt to .ctl
file; just do delete_module() with O_NONBLOCK.
Josh Stone [Tue, 13 Oct 2009 21:10:08 +0000 (14:10 -0700)]
Refactor some of the histogram printing
* runtime/stat-common.c (reprint_buf): Removed.
(_stp_stat_print_histogram_buf): Use a local HIST_PRINTF macro to
abstract the buffer management. Also convert reprint_buf calls to
either %* formats or simple for-loops.
* parse.cxx (parser::parse_symbol): Add sprint[ln] to @hist_* hack.
* runtime/stat-common.c: Replace reprint with new reprint_buf, add more
generic _stp_stat_print_histogram_buf and call it from the older one.
Also correct some formatting issues.
* translate.cxx (c_unparser::visit_print_format): Add sprint case.
Use proper $vars according to CONFIG_NFSD and CONFIG_COMPAT in
syscall.nfsservctl and mask it out along with return probe if
CONFIG_NFSD != "[ym]" && CONFIG_COMPAT != "y".
* tapset/syscalls2.stp (syscall.nfsservctl): Fix it.
David Smith [Tue, 13 Oct 2009 13:55:57 +0000 (08:55 -0500)]
PR 10575. Improves running target commands.
* runtime/staprun/mainloop.c (signal_usr1): Renamed from signal_dontcare.
Sets a new variable, usr1_interrupt.
(start_cmd): Avoids pause() race condition by switching to blocking
SIGUSR1, then waiting on SIGUSR1 with sigsuspend().
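The race-free wait can be sketched as below. Python has no sigsuspend(), so this sketch swaps in the related POSIX pattern of blocking the signal and then calling sigwait(); it is an analogue of the approach, not the staprun code, and the function names are hypothetical. It needs a POSIX system.

```python
import signal
import threading


def wait_for_usr1(trigger):
    """Block SIGUSR1, run trigger(), then wait for the signal without a race."""
    # Block first: a SIGUSR1 that arrives early stays pending instead of
    # being delivered (and lost) before we start waiting.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        trigger()
        # sigwait() consumes the pending (or next) SIGUSR1 and returns it.
        return signal.sigwait({signal.SIGUSR1})
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def send_usr1_to_self():
    # Stand-in for a peer that signals before we have started waiting.
    signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
```

With a plain pause(), a signal delivered between the readiness check and the pause() call would be missed and the wait could hang; blocking the signal first closes that window.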
Inline functions do not have an identifiable return point and require
a kernel built using a VTA-enabled gcc to get tracking of variables. OTOH
syscall functions are very rarely inlined (depending on the compiler
mood), therefore filtering probes to include only non-inlined functions
ensures consistent behavior between different kernels.
This removes the problem of inaccessible variables in inlined syscalls
that is described in comments #6-9 to PR5890 and gives us the status quo
w.r.t. syscall probing, because before the commit solving PR10572
(b7478964) inline instances were masked anyway by non-inline ones.
You can check whether you have inlined syscalls using the following command:
$ stap -l 'kernel.function("sys_*"),kernel.function("compat_sys_*")' \
2>&1 -vvv | awk '/^selected inline/{print $5}'
* tapset/syscalls.stp: Add .call to all entry probes.
* tapset/syscalls2.stp: Ditto.
Mark Wielaard [Fri, 9 Oct 2009 21:23:12 +0000 (23:23 +0200)]
Add task_time tapset, functions to query time resource usage of current task.
* tapset/task_time.stp: New tapset.
* testsuite/buildok/task_test.stp: Add new task_time functions.
* doc/SystemTap_Tapset_Reference/tapsets.tmpl: Add new section on
Task Time Tapset. Include tapset/task_time.stp.
Dave Brolley [Fri, 9 Oct 2009 15:09:12 +0000 (11:09 -0400)]
Generate safety net assertions in probe function not authorized for unprivileged users.
2009-10-08 Dave Brolley <brolley@redhat.com>
* elaborate.h (emit_unprivileged_assertion): New virtual method of
derived_probe.
(emit_process_owner_assertion): New static method of derived_probe.
(check_unprivileged): New virtual method of derived_probe_builder.
(match_node::unprivileged_ok): Removed.
(match_node::allow_unprivileged): Removed.
(match_node::unprivileged_allowed): Removed.
* elaborate.cxx (translate.h): #include it.
(emit_unprivileged_assertion): New virtual method of derived_probe.
(emit_process_owner_assertion): New static method of derived_probe.
(check_unprivileged): New virtual method of derived_probe_builder.
(match_node::unprivileged_ok): Removed.
(match_node::allow_unprivileged): Removed.
(match_node::unprivileged_allowed): Removed.
(find_and_build): Don't check for unprivileged restrictions here. Call the
builder's check_unprivileged method.
(alias_expansion_builder::check_unprivileged): New virtual method.
* tapset-been.cxx (be_derived_probe::emit_unprivileged_assertion): New virtual
method.
(be_builder::check_unprivileged): Likewise.
(never_derived_probe::emit_unprivileged_assertion): Likewise.
(never_builder::check_unprivileged): Likewise.
(register_tapset_been): Don't call allow_unprivileged.
David J. Wilder [Thu, 8 Oct 2009 18:00:20 +0000 (11:00 -0700)]
This script (tcp_trace) can be used to trace tcp connection parameters and
state changes. This work was originally inspired by Stephen Hemminger's TCP
cwnd snooper (net/ipv4/tcp_probe.c). Tcp_trace is a helpful tool for
troubleshooting connection performance issues.
PR10702: preprocessor conditional for kernel CONFIG_foo
* session.h (kernel_config[]): New session field.
* main.cxx (parse_kernel_config): Populate it.
* parse.cxx (eval_comparison): Use it.
* testsuite/buildok/utrace.stp, testsuite/parseok/kconfig.stp: New tests.
* NEWS, stap.1.in, doc/langref.tex: Mention it.
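A minimal sketch of the two steps named above: populating a table from a kernel config file, then comparing an option against a value. It is written in Python for illustration; parse_kernel_config is named after the entry in main.cxx but this body is not that code, and config_equals, the sample text, and the treatment of unset options as an empty string are assumptions.

```python
def parse_kernel_config(text):
    """Collect CONFIG_* assignments from kernel .config style text."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.startswith("CONFIG_"):
            config[key] = value.strip('"')
    return config


def config_equals(config, name, expected):
    # An option absent from the file compares as the empty string (assumption).
    return config.get(name, "") == expected


SAMPLE = (
    "CONFIG_NFSD=m\n"
    "# CONFIG_COMPAT is not set\n"
    'CONFIG_LOCALVERSION="-rt"\n'
)
config = parse_kernel_config(SAMPLE)
```

A conditional in a script can then be resolved before translation, by looking up the option and comparing it with the string given in the script.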
Pending advice from Frank and Dave, changed check_permission to return void and
renamed it to assert_permission. assert_permission simply returns if
permissions are okay, and calls exit(-1) if there are any permissions errors.
Mark Wielaard [Tue, 6 Oct 2009 13:22:21 +0000 (15:22 +0200)]
PR10739 testcase. Split const_value test in two. Absolute const addr fails.
* testsuite/systemtap.base/const_value.exp: Handle both const_value blocks
and address separately. XFAIL second test as PR10739.
* testsuite/systemtap.base/const_value.stp: Only query baz const value.
* testsuite/systemtap.base/const_value_func.c: New test for bar address.
* testsuite/systemtap.base/const_value_func.stp: Likewise.
Josh Stone [Tue, 6 Oct 2009 00:41:30 +0000 (17:41 -0700)]
PR10726: Get the correct scope for statement(NUM)
The problem in this bug is that our statement(NUM) lookup was only
searching for the outermost function (not inlined) which contains the PC
in question. When that PC happens to be the beginning of the function
and also the beginning of an inline, the caching was using the wrong
variable scope.
The function/statement(NUM) lookup has been rewritten to bypass all of
the CU and function iteration, and just go straight to a getscopes(pc)
lookup, so it will now always use the innermost containing die for the
variable scope.
* tapsets.cxx (query_addr): New, short-circuit for numeric probes.
(dwarf_query::query_module_dwarf): Route num probes to query_addr.
(query_label): Assume now that we only need to handle _str probes.
(query_dwarf_inline_instance): Ditto.
(query_dwarf_func): Ditto.
(query_cu): Ditto.
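The difference between the outermost and the innermost containing scope can be shown with a small sketch. This is a Python illustration of the selection rule only, not the libdw getscopes() interface; the scope tuples, names, and addresses are invented for the example.

```python
def innermost_scope(scopes, pc):
    """Return the name of the smallest [low, high) range that contains pc."""
    containing = [s for s in scopes if s[1] <= pc < s[2]]
    if not containing:
        return None
    return min(containing, key=lambda s: s[2] - s[1])[0]


# Hypothetical layout: an inlined helper starts at the same address as the
# function that contains it.
SCOPES = [
    ("outer_func", 0x1000, 0x1100),
    ("inlined_helper", 0x1000, 0x1020),
]
```

At 0x1000 both ranges match, which is the case from the bug report; choosing the innermost range gives the inline instance's variable scope rather than the outer function's.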
Mark Wielaard [Mon, 5 Oct 2009 07:11:59 +0000 (09:11 +0200)]
Handle DW_AT_const_value as alternative to location description.
* dwflpp.cxx (translate_location): Call c_translate_constant when
attribute is DW_AT_const_value.
(literal_stmt_for_local): Allow both DW_AT_location and DW_AT_const_value.
Mark Wielaard [Mon, 5 Oct 2009 07:05:29 +0000 (09:05 +0200)]
Make sure loc2c declare_noncontig_union for different locs don't overlap.
* loc2c.c (declare_noncontig_union): Name union u_pieces for
loc_noncontiguous or u_const for loc_constant.
(translate_base_store): Use u_pieces for loc_noncontiguous.
(translate_base_fetch): Likewise or u_const for loc_constant.
Kiran Prakesh [Thu, 1 Oct 2009 17:09:32 +0000 (22:39 +0530)]
Scheduler Tapset based on kernel tracepoints
This patch adds kernel tracepoints based probes to the scheduler tapset
along with the testcase, scheduler-test-tracepoints.stp and an example
script, sched_switch.stp.
Signed-off-by: Kiran Prakash <kiran@linux.vnet.ibm.com>
Signed-off-by: Josh Stone <jistone@redhat.com>
Mark Wielaard [Thu, 1 Oct 2009 22:28:46 +0000 (00:28 +0200)]
PR10678 vta-gcc: module debuginfo: relocation refers to undefined symbol
libdwfl tries to resolve all relocations in a module debuginfo file and
if it cannot find a symbol used in a relocation it will fail when
dwfl_module_getdwarf() is called. So we must make sure all possible
dependencies of the module are also in the dwfl. We do this by trying
to find and parse the modules.dep file and insert all dependencies
into the dwfl.
* setupdwfl.cxx (elfutils_kernel_path): Lift from setup_dwfl_kernel and
make static.
(is_comma_dash): New function.
(modname_from_path): Likewise.
(setup_mod_deps): Likewise.
(setup_dwfl_report_kernel_p): Call setup_mod_deps().
* testsuite/buildok/pr10678.stp: New test.
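The dependency collection described above might look roughly like this. It is a Python sketch, not the setupdwfl.cxx code: modname_from_path borrows its name from the entry above, but mapping ',' and '-' to '_' is an assumption inferred from the is_comma_dash helper name, and the sample paths and the other function names are invented.

```python
def modname_from_path(path):
    """Turn a module path into a module name (assumed normalization)."""
    base = path.rsplit("/", 1)[-1]
    if base.endswith(".ko"):
        base = base[:-3]
    return base.replace("-", "_").replace(",", "_")


def parse_modules_dep(text):
    """Map each module name to the names of its direct dependencies."""
    deps = {}
    for line in text.splitlines():
        target, sep, rest = line.partition(":")
        if not sep:
            continue
        deps[modname_from_path(target.strip())] = [
            modname_from_path(p) for p in rest.split()
        ]
    return deps


def all_dependencies(deps, module):
    """Collect direct and indirect dependencies of a module."""
    seen = []
    stack = [module]
    while stack:
        for dep in deps.get(stack.pop(), []):
            if dep not in seen:
                seen.append(dep)
                stack.append(dep)
    return seen


SAMPLE = (
    "kernel/fs/ext3/ext3.ko: kernel/fs/jbd/jbd.ko\n"
    "kernel/fs/jbd/jbd.ko:\n"
    "kernel/drivers/demo/snd-foo,bar.ko: kernel/fs/ext3/ext3.ko\n"
)
deps = parse_modules_dep(SAMPLE)
```

Every module returned by all_dependencies would then be reported to the dwfl alongside the module being probed, so that relocations against their symbols can be resolved.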
Stan Cox [Thu, 1 Oct 2009 13:18:21 +0000 (09:18 -0400)]
Add DEBUG_UPROBES for sdt semaphores.
* tapsets.cxx (uprobe_derived_probe_group::emit_module_decls): Add
DEBUG_UPROBES for sdt semaphores
* dtrace.in (main): Add -k option to keep around the temp files.
Breno Leitao [Thu, 1 Oct 2009 03:09:07 +0000 (00:09 -0300)]
Actually indent_thread() is a very useful function, but
sometimes you're probing something that is not related to
any task, such as an interrupt function, and if the application
changes during the interrupt, the indentation gets confused.
For example: