Frank Ch. Eigler [Mon, 19 Oct 2009 15:33:24 +0000 (11:33 -0400)]
PR10799: warn on possibly unintended local-vs-global namespace collision
* elaborate.cxx (find_var): Take extra token parameter.
Look for cross-file global variable resolution, signal
a warning.
* testsuite/systemtap.examples/io/traceio2.stp: Fix it.
* testsuite/systemtap.syscall/sys.stp: Fix it.
* NEWS: Document it.
uprobe_fork_uproc() runs with parent_uproc->rwsem locked.
However uprobe_mk_process() that gets called within uprobe_fork_uproc()
also locks child_uproc->rwsem after initializing it.
The lockdep report mistakes this for acquiring a lock that has already been
acquired and suggests using sub-classes.
The alternatives we have are:
1. use class levels to distinguish different uproc structures.
2. unlock parent_uproc->rwsem before we call uprobe_fork_uproc().
3. don't try locking child_uproc->rwsem, since we are protected by
uproc_mutex as well as parent_uproc->rwsem.
Mark Wielaard [Thu, 15 Oct 2009 19:55:16 +0000 (21:55 +0200)]
Fix transok/tval-opt.stp testcase. Pick different function and non-empty block.
This testcase succeeded just because the value being set couldn't be
found. So the error message being compared was the same. Set -o pipefail
to catch that case. On vta compiled kernels it failed because the optimizer
turned { statement } into statement. So pick a function and argument whose
location can always be found and add an extra 'next' statement so the
block isn't folded.
* testsuite/transok/tval-opt.stp: Set -o pipefail. Add 'next' to make
sure block isn't empty. Use "do_filp_open" and "mode".
Josh Stone [Wed, 14 Oct 2009 23:12:49 +0000 (16:12 -0700)]
Fix $$targets in dwarf probes
My print_format refactoring in d5e178c1 missed an improperly-named
token, an sprint that should be sprintf. Since the token value is now
significant, that name needs to be correct.
Frank Ch. Eigler [Wed, 14 Oct 2009 21:02:47 +0000 (17:02 -0400)]
PR10331: improve nss error message handling
* stapsslerr.h: New file containing NSS* error number to string mappings.
Originally from mozilla NSS documentation, also seen in other GPLv2
software.
* nsscommon.c (nssError): Print error number, and text from <stapsslerr.h>.
* stap-{client,server}-connect.c (errWarn): Standardize on nssError().
* Makefile.am (nss binaries): Also link in nsscommon.c.
Tim Moore [Wed, 14 Oct 2009 15:46:31 +0000 (17:46 +0200)]
cleanup of graph data parser, using Boost functions where useful
* grapher/StapParser.cxx (commaSplit): Use Boost string split function
(findTaggedValue): Return bool instead of position
(ioCallback): Avoid using hard-coded string lengths
Josh Stone [Tue, 13 Oct 2009 23:57:38 +0000 (16:57 -0700)]
Consolidate print_format creation
We almost had a factory in print_format::parse_print, so let's take that
the rest of the way. This way we don't have so much duplication in
initializing the print flags.
* staptree.cxx (print_format::parse_print): Replaced with...
(print_format::create): New factory to parse and create print_formats.
* elaborate.cxx (add_global_var_display): Use this factory.
* parse.cxx (parser::parse_symbol): Ditto.
* tapset-mark.cxx
(mark_var_expanding_visitor::visit_target_symbol_context): Ditto.
* tapset-utrace.cxx
(utrace_var_expanding_visitor::visit_target_symbol_arg): Ditto.
* tapsets.cxx
(dwarf_var_expanding_visitor::visit_target_symbol_context): Ditto.
(tracepoint_var_expanding_visitor::visit_target_symbol_context): Ditto.
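A factory of this kind can be sketched roughly as follows. This is a
simplified, hypothetical illustration and not the code in staptree.cxx: the
PrintFormat struct, its flag names, and the set of accepted spellings are
assumptions made for the example. It shows why the token text matters once
creation is driven by the name, as the $$targets fix elsewhere in this log
points out (sprint and sprintf select different flags).

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for a print_format node.
struct PrintFormat {
    bool to_stream = true;      // false for the s* variants that return a string
    bool with_format = false;   // printf / sprintf
    bool with_delim = false;    // printd / sprintd ...
    bool with_newline = false;  // println / sprintln ...

    // Returns a null pointer when the name is not a recognized print function.
    static std::unique_ptr<PrintFormat> create(const std::string& name);
};

std::unique_ptr<PrintFormat> PrintFormat::create(const std::string& name) {
    std::string rest = name;
    std::unique_ptr<PrintFormat> pf(new PrintFormat());

    // Optional leading "s": result goes to a string, not the output stream.
    if (rest.compare(0, 1, "s") == 0) {
        pf->to_stream = false;
        rest.erase(0, 1);
    }
    // The mandatory "print" stem.
    if (rest.compare(0, 5, "print") != 0)
        return nullptr;
    rest.erase(0, 5);

    // "f" selects the formatted variant and allows no further suffix.
    if (rest == "f") {
        pf->with_format = true;
        return pf;
    }
    // Optional "d" (delimiter) followed by optional "ln" (newline).
    if (rest.compare(0, 1, "d") == 0) {
        pf->with_delim = true;
        rest.erase(0, 1);
    }
    if (rest == "ln") {
        pf->with_newline = true;
        rest.clear();
    }
    if (!rest.empty())
        return nullptr;  // unknown suffix
    return pf;
}
```

With this sketch, create("sprint") and create("sprintf") both succeed but
yield different flags, so a caller that passes the wrong spelling silently gets
the wrong kind of node.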
* transport/control.c (*_cmd): Return -Ecodes rather than "-1" from
file_operations callbacks.
* staprun/ctl.c (init_ctl_channel): Return distinct error codes.
* staprun/staprun.c (remove_module): Skip connection attempt to .ctl
file; just do delete_module() with O_NONBLOCK.
Josh Stone [Tue, 13 Oct 2009 21:10:08 +0000 (14:10 -0700)]
Refactor some of the histogram printing
* runtime/stat-common.c (reprint_buf): Removed.
(_stp_stat_print_histogram_buf): Use a local HIST_PRINTF macro to
abstract the buffer management. Also convert reprint_buf calls to
either %* formats or simple for-loops.
* parse.cxx (parser::parse_symbol): Add sprint[ln] to @hist_* hack.
* runtime/stat-common.c: Replace reprint with new reprint_buf, add more
generic _stp_stat_print_histogram_buf and call it from the older one.
Also correct some formatting issues.
* translate.cxx (c_unparser::visit_print_format): Add sprint case.
Use proper $vars according to CONFIG_NFSD and CONFIG_COMPAT in
syscall.nfsservctl and mask it out along with return probe if
CONFIG_NFSD != "[ym]" && CONFIG_COMPAT != "y".
* tapset/syscalls2.stp (syscall.nfsservctl): Fix it.
David Smith [Tue, 13 Oct 2009 13:55:57 +0000 (08:55 -0500)]
PR 10575. Improves running target commands.
* runtime/staprun/mainloop.c (signal_usr1): Renamed from signal_dontcare.
Sets a new variable, usr1_interrupt.
(start_cmd): Avoids pause() race condition by switching to blocking
SIGUSR1, then waiting on SIGUSR1 with sigsuspend().
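The pattern described above (block the signal first, then wait atomically with
sigsuspend()) can be sketched as follows. This is a minimal illustration using
the standard POSIX calls, not the code in mainloop.c; the helper names
block_usr1 and wait_for_usr1 are made up for the example, while signal_usr1 and
usr1_interrupt follow the names mentioned in the entry.

```cpp
#include <signal.h>

// Set by the handler; checked by the waiting code.
static volatile sig_atomic_t usr1_interrupt = 0;

static void signal_usr1(int) { usr1_interrupt = 1; }

// Install the handler and block SIGUSR1 so it cannot be delivered (and lost)
// before we are ready to wait for it. The previous mask is saved in *old_mask.
// Returns 0 on success, -1 on failure.
int block_usr1(sigset_t* old_mask) {
    struct sigaction sa = {};
    sa.sa_handler = signal_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, nullptr) != 0)
        return -1;

    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, old_mask);
}

// Wait until the handler has run. sigsuspend() atomically installs a mask
// with SIGUSR1 unblocked and sleeps, so a signal that arrived while it was
// blocked is delivered here instead of slipping in before a plain pause().
void wait_for_usr1(const sigset_t* old_mask) {
    sigset_t wait_mask = *old_mask;
    sigdelset(&wait_mask, SIGUSR1);
    while (!usr1_interrupt)
        sigsuspend(&wait_mask);
}
```

With pause(), a SIGUSR1 delivered between the check and the pause() call would
leave the process sleeping forever; keeping the signal blocked until
sigsuspend() closes that window.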
Inline functions do not have an identifiable return point and require a
kernel built using a VTA-enabled gcc to get tracking of variables. OTOH
syscall functions are very rarely inlined (depending on the compiler
mood), therefore filtering probes to include only non-inlined functions
ensures consistent behavior between different kernels.
This removes the problem of inaccessible variables in inlined syscalls
that is described in comments #6-9 to PR5890 and gives us the status quo
w.r.t. syscall probing, because before the commit solving PR10572
(b7478964) inline instances were masked anyway by non-inline ones.
You can check whether you have inlined syscalls using the following command:
$ stap -l 'kernel.function("sys_*"),kernel.function("compat_sys_*")' \
2>&1 -vvv | awk '/^selected inline/{print $5}'
* tapset/syscalls.stp: Add .call to all entry probes.
* tapset/syscalls2.stp: Ditto.
Mark Wielaard [Fri, 9 Oct 2009 21:23:12 +0000 (23:23 +0200)]
Add task_time tapset, functions to query time resource usage of current task.
* tapset/task_time.stp: New tapset.
* testsuite/buildok/task_test.stp: Add new task_time functions.
* doc/SystemTap_Tapset_Reference/tapsets.tmpl: Add new section on
Task Time Tapset. Include tapset/task_time.stp.
Dave Brolley [Fri, 9 Oct 2009 15:09:12 +0000 (11:09 -0400)]
Generate safety net assertions in probe function not authorized for unprivileged users.
2009-10-08 Dave Brolley <brolley@redhat.com>
* elaborate.h (emit_unprivileged_assertion): New virtual method of
derived_probe.
(emit_process_owner_assertion): New static method of derived_probe.
(check_unprivileged): New virtual method of derived_probe_builder.
(match_node::unprivileged_ok): Removed.
(match_node::allow_unprivileged): Removed.
(match_node::unprivileged_allowed): Removed.
* elaborate.cxx (translate.h): #include it.
(emit_unprivileged_assertion): New virtual method of derived_probe.
(emit_process_owner_assertion): New static method of derived_probe.
(check_unprivileged): New virtual method of derived_probe_builder.
(match_node::unprivileged_ok): Removed.
(match_node::allow_unprivileged): Removed.
(match_node::unprivileged_allowed): Removed.
(find_and_build): Don't check for unprivileged restrictions here. Call the
builder's check_unprivileged method.
(alias_expansion_builder::check_unprivileged): New virtual method.
* tapset-been.cxx (be_derived_probe::emit_unprivileged_assertion): New virtual
method.
(be_builder::check_unprivileged): Likewise.
(never_derived_probe::emit_unprivileged_assertion): Likewise.
(never_builder::check_unprivileged): Likewise.
(register_tapset_been): Don't call allow_unprivileged.
David J. Wilder [Thu, 8 Oct 2009 18:00:20 +0000 (11:00 -0700)]
This script (tcp_trace) can be used to trace TCP connection parameters and state changes. This work was originally inspired by Stephen Hemminger's TCP cwnd snooper (net/ipv4/tcp_probe.c). Tcp_trace is a helpful tool for troubleshooting connection performance issues.
PR10702: preprocessor conditional for kernel CONFIG_foo
* session.h (kernel_config[]): New session field.
* main.cxx (parse_kernel_config): Populate it.
* parse.cxx (eval_comparison): Use it.
* testsuite/buildok/utrace.stp, testsuite/parseok/kconfig.stp: New tests.
* NEWS, stap.1.in, doc/langref.tex: Mention it.
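The evaluation of such a conditional can be sketched as a lookup in the
collected kernel configuration followed by a string comparison. The function
below is a hypothetical illustration, not the eval_comparison code in
parse.cxx. Two things are assumptions: that the right-hand side is treated as
a glob pattern (modeled on the CONFIG_NFSD != "[ym]" form that appears
elsewhere in this log), and that an option missing from the configuration
compares as an empty string.

```cpp
#include <fnmatch.h>
#include <map>
#include <stdexcept>
#include <string>

// Evaluate `NAME op "pattern"` against a map of kernel config options.
// Assumption: an option that is not present behaves like an empty string.
bool eval_config_comparison(const std::map<std::string, std::string>& kernel_config,
                            const std::string& name,
                            const std::string& op,
                            const std::string& pattern) {
    std::map<std::string, std::string>::const_iterator it = kernel_config.find(name);
    const std::string value = (it == kernel_config.end()) ? "" : it->second;

    // fnmatch() returns 0 when the value matches the glob pattern.
    const bool matches = fnmatch(pattern.c_str(), value.c_str(), 0) == 0;

    if (op == "==") return matches;
    if (op == "!=") return !matches;
    throw std::invalid_argument("unsupported operator: " + op);
}
```

Given a configuration with CONFIG_NFSD set to "m", the comparison against
"[ym]" with == holds, while a comparison of an absent CONFIG_COMPAT against
"y" does not.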
Pending advice from Frank and Dave, changed check_permission to return void and
renamed it to assert_permission. assert_permission simply returns if
permissions are okay, and calls exit(-1) if there are any permissions errors.
Mark Wielaard [Tue, 6 Oct 2009 13:22:21 +0000 (15:22 +0200)]
PR10739 testcase. Split const_value test in two. Absolute const addr fails.
* testsuite/systemtap.base/const_value.exp: Handle both const_value blocks
and address separately. XFAIL second test as PR10739.
* testsuite/systemtap.base/const_value.stp: Only query baz const value.
* testsuite/systemtap.base/const_value_func.c: New test for bar address.
* testsuite/systemtap.base/const_value_func.stp: Likewise.
Josh Stone [Tue, 6 Oct 2009 00:41:30 +0000 (17:41 -0700)]
PR10726: Get the correct scope for statement(NUM)
The problem in this bug is that our statement(NUM) lookup was only
searching for the outermost function (not inlined) which contains the PC
in question. When that PC happens to be the beginning of the function
and also the beginning of an inline, the caching was using the wrong
variable scope.
The function/statement(NUM) lookup has been rewritten to bypass all of
the CU and function iteration, and just go straight to a getscopes(pc)
lookup, so it will now always use the innermost containing die for the
variable scope.
* tapsets.cxx (query_addr): New, short-circuit for numeric probes.
(dwarf_query::query_module_dwarf): Route num probes to query_addr.
(query_label): Assume now that we only need to handle _str probes.
(query_dwarf_inline_instance): Ditto.
(query_dwarf_func): Ditto.
(query_cu): Ditto.
Mark Wielaard [Mon, 5 Oct 2009 07:11:59 +0000 (09:11 +0200)]
Handle DW_AT_const_value as alternative to location description.
* dwflpp.cxx (translate_location): Call c_translate_constant when
attribute is DW_AT_const_value.
(literal_stmt_for_local): Allow both DW_AT_location and DW_AT_const_value.
Mark Wielaard [Mon, 5 Oct 2009 07:05:29 +0000 (09:05 +0200)]
Make sure loc2c declare_noncontig_union for different locs don't overlap.
* loc2c.c (declare_noncontig_union): Name union u_pieces for
loc_noncontiguous or u_const for loc_constant.
(translate_base_store): Use u_pieces for loc_noncontiguous.
(translate_base_fetch): Likewise or u_const for loc_constant.
Kiran Prakesh [Thu, 1 Oct 2009 17:09:32 +0000 (22:39 +0530)]
Scheduler Tapset based on kernel tracepoints
This patch adds kernel tracepoints based probes to the scheduler tapset
along with the testcase, scheduler-test-tracepoints.stp and an example
script, sched_switch.stp.
Signed-off-by: Kiran Prakash <kiran@linux.vnet.ibm.com>
Signed-off-by: Josh Stone <jistone@redhat.com>
Mark Wielaard [Thu, 1 Oct 2009 22:28:46 +0000 (00:28 +0200)]
PR10678 vta-gcc: module debuginfo: relocation refers to undefined symbol
libdwfl tries to resolve all relocations in a module debuginfo file and
if it cannot find a symbol used in a relocation it will fail when
dwfl_module_getdwarf() is called. So we must make sure all possible
dependencies of the module are also in the dwfl. We do this by trying
to find and parse the modules.dep file and insert all dependencies
into the dwfl.
* setupdwfl.cxx (elfutils_kernel_path): Lift from setup_dwfl_kernel and
make static.
(is_comma_dash): New function.
(modname_from_path): Likewise.
(setup_mod_deps): Likewise.
(setup_dwfl_report_kernel_p): Call setup_mod_deps().
* testsuite/buildok/pr10678.stp: New test.
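The dependency lookup described above can be sketched as parsing one
modules.dep line into a module name and the names of the modules it depends
on. This is an illustrative sketch and not the code in setupdwfl.cxx: the
function names is_comma_dash and modname_from_path are taken from the entry,
but their bodies here, the ModDeps struct, and parse_dep_line are assumptions.
It assumes the usual "path/to/mod.ko: path/to/dep1.ko path/to/dep2.ko" line
layout and that ',' and '-' in file names map to '_' in module names.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

static bool is_comma_dash(char c) { return c == ',' || c == '-'; }

// "kernel/sound/snd-hda-intel.ko" -> "snd_hda_intel"
std::string modname_from_path(const std::string& path) {
    const std::string::size_type slash = path.rfind('/');
    const std::string::size_type start = (slash == std::string::npos) ? 0 : slash + 1;
    std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos || dot < start)
        dot = path.size();  // no extension: keep the whole basename
    std::string name = path.substr(start, dot - start);
    std::replace_if(name.begin(), name.end(), is_comma_dash, '_');
    return name;
}

// Hypothetical result type: a module and the modules it depends on.
struct ModDeps {
    std::string module;
    std::vector<std::string> deps;
};

// Parse one line of modules.dep. Lines without a ':' yield an empty result.
ModDeps parse_dep_line(const std::string& line) {
    ModDeps result;
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos)
        return result;
    result.module = modname_from_path(line.substr(0, colon));

    std::istringstream rest(line.substr(colon + 1));
    std::string dep_path;
    while (rest >> dep_path)
        result.deps.push_back(modname_from_path(dep_path));
    return result;
}
```

Each name in deps would then be reported to the dwfl alongside the module
itself, so that relocations against symbols in those modules can be resolved.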
Stan Cox [Thu, 1 Oct 2009 13:18:21 +0000 (09:18 -0400)]
Add DEBUG_UPROBES for sdt semaphores.
* tapsets.cxx (uprobe_derived_probe_group::emit_module_decls): Add
DEBUG_UPROBES for sdt semaphores
* dtrace.in (main): Add -k option to keep around the temp files.
Breno Leitao [Thu, 1 Oct 2009 03:09:07 +0000 (00:09 -0300)]
Actually indent_thread() is a very useful function, but
sometimes you're probing something that is not related to
any task, such as an interrupt function, and if the application
changes during the interrupt, the indentation gets confused.
For example:
Tim Moore [Wed, 23 Sep 2009 16:03:00 +0000 (18:03 +0200)]
classes for launching stap and listening for its death
* grapher/grapher.cxx (ChildDeathReader): New class to handle I/O signalling
death of a child.
(GrapherWindow): Inherit from ChildDeathReader.
(StapLauncher): New class for passing arguments to inferior stap process
and checking for its demise.
(main): Move launching logic to StapLauncher.
Tim Moore [Wed, 2 Sep 2009 10:39:05 +0000 (12:39 +0200)]
Add graph data chooser window, based on glade
* configure.ac: Test for libglademm
* grapher/GraphWidget.hxx (DataModelColumns): new class
(onDataDialogCancel, onDataAdd, onDataRemove, onDataDialogOpen):
new methods
* grapher/GraphWidget.cxx: ditto; methods for the graph data dialog.
* grapher/graph-dialog.glade: New file.
* grapher/graph-dialog.gladep: New file.
* grapher/Makefile.am (dist_pkgdata_DATA): add graph-dialog.glade to
installation.
* grapher/GraphWidget.cxx (GraphWidget constructor): Use PKGDATADIR
Tim Moore [Tue, 4 Aug 2009 21:46:02 +0000 (23:46 +0200)]
more multiple graph fixes
* grapher/Graph.cxx (Graph constructor): set _drawX, _drawY
* grapher/GraphWidget.cxx (addGraph): Fix graph layout
(on_button_press_event): Fix test of play button in multiple graphs
Mark Wielaard [Wed, 30 Sep 2009 14:51:29 +0000 (16:51 +0200)]
PR10678 module reloc refers to symbol in dwarf refer to kernel symbols.
First part of a fix for PR10678. Always include the kernel in the dwfl.
This doesn't seem to impact performance noticeably, so for now enable it
always.
* setupdwfl.cxx (setup_dwfl_done): New variable, used to clean up logic
in setup_dwfl_report_kernel_p().
(setup_all_deps): New static bool to indicate we want all deps (just
the kernel for now, other modules coming).
(setup_dwfl_report_kernel_p): Use new variables, shortcut kernel
inclusion.
(setup_dwfl_kernel): Setup setup_dwfl_done (false).
build fix: use boost shared_ptr if libstdc++ too old to have <tr1/memory>
* configure.ac: Look for tr1/memory and boost/shared_ptr.hpp
* setupdwfl.h (shared_ptr): Define conditionally based on above.
* systemtap.spec (with_boost): New parameter, default-off.
Josh Stone [Tue, 29 Sep 2009 18:35:18 +0000 (11:35 -0700)]
Add -Werror to tracequery build
The final module build uses -Werror, so tracequery should as well to
catch problems as early as possible. Some ext4 errors have crept in
again (PR10703).
* buildrun.cxx (make_tracequery): Add -Werror to EXTRA_CFLAGS.
Several problems: some invalid <command> etc. directives
in the tapset embedded docs; some analysis about the
non-generation of the pdf; some cleanup of the generated
man pages.
* configure.ac (BUILD_PDFREFDOCS): Correct condition typo, but still
leave disabled.
* doc/SystemTap_Tapset_Reference/Makefile.am (XMLTOMANPARMS): Add,
to disable noise "AUTHORS" / "COPYRIGHT" sections.
* tapset/*.stp: Removed several pieces of docbook-y markup that are not
valid in kerneldoc.
Mark Wielaard [Tue, 29 Sep 2009 14:45:37 +0000 (16:45 +0200)]
Cache Dwfl's for reuse between pass 2 and pass 3.
* setupdwfl.h: Introduce DwflPtr.
* setupdwfl.cxx: Cache kernel_dwfl and user_dwfl. Keep track of last used
module strings. Return cached versions if same query used.
* dwflpp.h: Use DwflPtr instead of Dwfl*.
* dwflpp.cxx: Use DwflPtr and don't dwfl_end().
* translate.cxx: Likewise. Run through dwfl_getmodules() with returned
ptr offset.
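The caching scheme can be sketched with a reference-counted handle that is
returned again whenever the same query is repeated. This is a hypothetical
illustration and not the code in setupdwfl.cxx: Handle stands in for the
library's Dwfl type, HandlePtr plays the role the entry gives to DwflPtr, and
setup_handle and handles_opened are invented for the example. In the real code
the deleter is where dwfl_end() would be called, which is why callers no
longer call it themselves.

```cpp
#include <memory>
#include <string>

// Stand-in for an expensive-to-build Dwfl session.
struct Handle {
    std::string modules;
};
typedef std::shared_ptr<Handle> HandlePtr;

static HandlePtr cached_handle;    // last handle handed out
static std::string cached_query;   // module string it was built for
static int handles_opened = 0;     // counts real (non-cached) setups

// Return the cached handle when the query is unchanged, otherwise build a
// new one and remember it. Ownership is shared, so a handle stays alive as
// long as either the cache or any caller still holds it.
HandlePtr setup_handle(const std::string& modules) {
    if (cached_handle && cached_query == modules)
        return cached_handle;

    cached_handle = HandlePtr(new Handle{modules},
                              [](Handle* h) {
                                  // Real code would release the session here.
                                  delete h;
                              });
    cached_query = modules;
    ++handles_opened;
    return cached_handle;
}
```

Two consecutive calls with the same module string share one handle, so the
second pass does not pay the setup cost again; a different string replaces the
cache entry.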
Mark Wielaard [Tue, 29 Sep 2009 09:22:30 +0000 (11:22 +0200)]
Handle non-regex full path kernel module dwfl setup earlier.
* setupdwfl.cxx (setup_dwfl_kernel(unsigned*,systemtap_session&)):
Don't switch around offline_search_modname and offline_search_names here.
(setup_dwfl_kernel(string&,unsigned*,systemtap_session&)): But here.
Josh Stone [Tue, 29 Sep 2009 01:49:51 +0000 (18:49 -0700)]
Simplify copy_file calls
Every single copy_file call we had was converting strings to char*,
printing the same error message, and optionally printing the same
verbose string. Let's canonicalize that.
* util.cxx (copy_file): Take string filenames, add a verbose flag, and
consolidate the message printing.
* cache.cxx (add_to_cache): Pass strings and remove message printing.
(get_from_cache): Ditto.
* main.cxx (main): Ditto.
* tapsets.cxx (tracepoint_builder::get_tracequery_module): Ditto.
(dwarf_cast_expanding_visitor::filter_special_modules): Ditto.
Josh Stone [Tue, 29 Sep 2009 00:36:04 +0000 (17:36 -0700)]
Try to build tracequery for all headers at once
To mitigate PR10424, we switched to building a separate tracequery
module for each tracepoint header, so a bad header wouldn't break all of
the others. However, with recent kernels that leads to ~18 make
commands, which adds up quickly in time. It's cached, so that's not too
bad, but as a developer who rebuilds stap frequently, it gets annoying.
If we're going to call 18 makes, it's worth it to start with one bigger
make that covers all the headers at once (like we used to). If that one
fails, we can still fall back to compiling individually.
FWIW, the failing ext4.h header was only created in 2.6.31, and was
fixed before 2.6.32, so the specific failure in PR10424 has a fairly
small window.
* buildrun.cxx (make_tracequery): Just take a single vector of headers.
* hash.cxx (find_tracequery_hash): Deal with multiple headers.
* tapsets.cxx (tracepoint_builder::get_tracequery_module): Ditto.
(tracepoint_builder::init_dw): Attempt all system headers together,
and if that fails, try again individually.
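The strategy described in this entry (one combined build first, individual
builds only on failure) can be sketched as below. This is an illustration
under stated assumptions, not the code in tapsets.cxx: build_with_fallback and
the BuildFn callback type are hypothetical, and the callback stands in for
whatever actually invokes make for a set of headers.

```cpp
#include <functional>
#include <string>
#include <vector>

// Callback that tries to build one module covering the given headers and
// reports whether the build succeeded.
typedef std::function<bool(const std::vector<std::string>&)> BuildFn;

// Try all headers in a single build. If that fails, fall back to building
// each header on its own and return only the ones that built successfully.
std::vector<std::string> build_with_fallback(const std::vector<std::string>& headers,
                                             const BuildFn& build) {
    if (headers.empty())
        return std::vector<std::string>();

    // Fast path: one build for everything.
    if (build(headers))
        return headers;

    // Slow path: isolate the bad header(s) by building one at a time.
    std::vector<std::string> ok;
    for (std::vector<std::string>::const_iterator it = headers.begin();
         it != headers.end(); ++it) {
        if (build(std::vector<std::string>(1, *it)))
            ok.push_back(*it);
    }
    return ok;
}
```

In the common case this costs a single build. When one header is broken, the
cost is one failed combined build plus one build per header, which is only
slightly worse than the previous always-individual approach.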