Mark Wielaard [Thu, 21 Jul 2011 14:51:48 +0000 (16:51 +0200)]
Remove STP_USE_FRAME_POINTER support and merge i386/x86_64 into stack-x86.c
The only real difference between stack-i386.c and stack-x86_64.c was that
the former supported a buggy frame-pointer-based unwind, which we never
used and for which the kernel has a better fallback (dump_trace).
* testsuite/lib/systemtap.exp (start_server): Locate
stap based on $SYSTEMTAP_PATH; plop in $installed_stap.
(setup_server): Use that location rather than which(1).
Josh Stone [Wed, 20 Jul 2011 22:39:57 +0000 (15:39 -0700)]
Normalize the arch in systemtap_session::clone
* session.cxx (systemtap_session::clone): Normalize the incoming arch
name, so it can be consistently compared to both this->architecture
and other cloned subsessions.
David Smith [Wed, 20 Jul 2011 21:32:30 +0000 (16:32 -0500)]
Improved prcwildcard.exp and cmd_parse.exp tests.
* testsuite/systemtap.base/prcwildcard.exp: If we're testing a stripped
stap, don't bother running the function test, which needs debuginfo.
* testsuite/systemtap.base/cmd_parse.exp: Increase timeout.
* testsuite/lib/systemtap.exp (stripped_p): New function to
determine if an executable is stripped.
Dave Brolley [Wed, 20 Jul 2011 17:46:08 +0000 (13:46 -0400)]
Fix "Unable to shutdown NSS/NSS is not initialized" on RHEL5.
Could also occur for any build with HAVE_NSS && ! HAVE_LIBRPMIO.
In this case, the rpm finder must attempt to shut down NSS (sometimes
initialized by librpm) without knowing if it was actually initialized. We
will now tolerate failure to shut down NSS if the error is
SEC_ERROR_NOT_INITIALIZED.
Dave Brolley [Wed, 20 Jul 2011 14:37:54 +0000 (10:37 -0400)]
PR 12888 - stap-serverd should be weaned from -k
- stap-serverd no longer passes -k to stap.
- -k specified on client no longer passed on to stap on the server side.
- -k specified to stap-serverd on startup instructs the server to save
its temp dir (contains client request and server response).
- server version 1.6 no longer packs uprobes.ko twice, unless the client
version is < 1.6.
- client version 1.6 looks for uprobes.ko in <response>/stap000000/uprobes
unless server version is < 1.6.
- Update/modify testsuite.
William Cohen [Wed, 20 Jul 2011 14:52:43 +0000 (10:52 -0400)]
Factor out code to normalize the architecture names and add arm arch
A few tests need to know the generic architecture name rather than
the specific variant. This patch factors out the code into
testsuite/lib/systemtap.exp and adds entries for the arm architecture
variants.
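
A minimal sketch of this kind of normalization, written in Python rather than
the Tcl used by the testsuite. The pattern table and the name normalize_arch
are assumptions for illustration; the real list lives in
testsuite/lib/systemtap.exp and may differ.

```python
import re

# Hypothetical pattern table: specific machine names on the left,
# the generic architecture name on the right.
_ARCH_PATTERNS = [
    (re.compile(r"^i[3-6]86$"), "i386"),
    (re.compile(r"^arm.*$"), "arm"),
]


def normalize_arch(machine):
    """Map a specific machine name (e.g. from uname -m) to a generic arch."""
    for pattern, generic in _ARCH_PATTERNS:
        if pattern.match(machine):
            return generic
    # Already generic, or unknown: pass through unchanged.
    return machine
```

With this table, "i686" and "armv7l" collapse to "i386" and "arm", while a
name such as "x86_64" is returned unchanged.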
Mark Wielaard [Wed, 20 Jul 2011 14:05:31 +0000 (16:05 +0200)]
Always look for .note.stapsdt sections in the main elf file.
In dwflpp::iterate_over_notes we really want the actual elf file,
not the dwarf .debug file. Older binutils had a bug where they
mangled the SHT_NOTE type during --keep-debug.
Mark Wielaard [Tue, 19 Jul 2011 20:56:17 +0000 (22:56 +0200)]
Depend on elfutils 0.142+. Remove various workarounds.
We really need at least 0.142 to support quick dwarf unwinding.
Also, earlier versions had various bugs that we sometimes worked
around, but not always, which could lead to mysterious failures
when a bias was miscalculated.
David Smith [Tue, 19 Jul 2011 20:15:45 +0000 (15:15 -0500)]
Improve buildid.exp error handling.
* testsuite/systemtap.base/buildid.exp: Once 'error_handler' is called
cleanup has occurred, so the following objcopy commands will fail. Just
return instead.
The PR10854 test case uses a tight loop of staprun and a nested loop
of pkills, written in a way that counts on staprun's pre-PR12890
"insert; unload; retry insert" module-handling heuristic. With this
heuristic gone (and error messages properly generated), the PR10854
test case goes woozy and hangs in the while { ... pkill ... } tcl
loop. Now we don't loop in there any more.
Mark Wielaard [Mon, 18 Jul 2011 18:55:38 +0000 (20:55 +0200)]
PR10189 and PR12960 reserve system cmd messages for delivery.
runtime/transport/control.c kept one pool for all cmd messages that
the module had to deliver to staprun/io. This pool could become
empty. This meant essential control messages would not be delivered,
leading to the module not properly starting and/or exiting.
We now set aside buffers for one time messages (STP_START, STP_EXIT,
STP_TRANSPORT, STP_REQUEST_EXIT) and "overflow" messages that get
delivered whenever one of the dynamically allocated messages cannot
get a free slot from the pool (STP_OOB_DATA - warnings and errors,
STP_SYSTEM and STP_REALTIME_DATA).
The type field is used to mark whether or not a special pre-allocated
buffer is currently unused. This needs careful locking using a new
&_stp_ctl_special_msg_lock that is used in the new helper functions
_stp_ctl_get_buffer and _stp_ctl_free_buffer.
Now when we run out of message buffers we just drop the message and
printk. stapio will have received either the one time message or an
overflow message; there is nothing more we can do.
The STP_DEFAULT_BUFFERS for debugfs.c got decreased again to allow
8 pre-allocated and 32 dynamic (pending) cmd messages.
A new testcase testsuite/systemtap.base/warn_overflow.exp was added.
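
The buffer policy described above can be sketched roughly as follows. This is
a Python model, not the kernel code: the class ControlBuffers, its method
names and the string return values are hypothetical, only a subset of the
one time message types is listed, and the locking that the real
_stp_ctl_get_buffer and _stp_ctl_free_buffer need is left out.

```python
# Subset of the message types named above, used as plain strings.
ONE_TIME = {"STP_START", "STP_EXIT", "STP_TRANSPORT"}
OVERFLOW = {"STP_OOB_DATA", "STP_SYSTEM", "STP_REALTIME_DATA"}


class ControlBuffers:
    def __init__(self, pool_size):
        self.pool_free = pool_size  # dynamically allocated slots
        # One pre-allocated buffer per special type; True means in use.
        self.reserved_busy = {t: False for t in ONE_TIME | OVERFLOW}
        self.dropped = 0

    def get_buffer(self, msg_type):
        """Return "reserved", "pool", "overflow", or None if dropped."""
        if msg_type in ONE_TIME:
            return self._take_reserved(msg_type, "reserved")
        if self.pool_free > 0:
            self.pool_free -= 1
            return "pool"
        if msg_type in OVERFLOW:
            # Pool exhausted: fall back to the overflow buffer of this type.
            return self._take_reserved(msg_type, "overflow")
        self.dropped += 1
        return None

    def _take_reserved(self, msg_type, kind):
        if self.reserved_busy[msg_type]:
            self.dropped += 1  # nothing left to do but drop the message
            return None
        self.reserved_busy[msg_type] = True
        return kind

    def free_buffer(self, msg_type, kind):
        if kind == "pool":
            self.pool_free += 1
        else:
            self.reserved_busy[msg_type] = False
```

The point of the design is that an empty pool can delay or drop ordinary
messages, but it can never prevent a start or exit message from getting a
buffer.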
Chris Meek [Mon, 18 Jul 2011 20:39:44 +0000 (16:39 -0400)]
LTTng TMF Custom Text Parser Example
Added proc_snoop_parser to
src/testsuite/systemtap.examples/process/
Follow the instructions in:
src/testsuite/systemtap.examples/process/proc_snoop_parser_instructions.txt
to try out the eclipse plugin tracefile parser.
Mark Wielaard [Fri, 15 Jul 2011 21:54:47 +0000 (23:54 +0200)]
PR12960 Don't msleep in _stp_ctl_send when out of memory.
This is mainly a documentation patch to better explain the transport
layers and the interaction between _stp_ctl_read_cmd, _stp_ctl_send and
_stp_ctl_write.
It also contains the first step to resolve PR12960. The msleep() in
_stp_ctl_send() has been replaced with a loop that checks whether there
are messages on the queue, tries to wake up _stp_ctl_read_cmd so stapio
has a chance to read some of the pending messages, and does a small mdelay
(which is safe, because it doesn't actually sleep or schedule). It
only prevents the crash and makes losing control messages slightly less
likely. A followup patch will introduce special buffers to hold messages
that cannot be lost, so the module will always be able to
properly shut down.
STP_DEFAULT_BUFFERS for debugfs also got increased a little from 50 to 64.
Josh Stone [Fri, 15 Jul 2011 20:47:45 +0000 (13:47 -0700)]
syscall.*execve: Fix argv access on newer kernels
Kernel commits ba2d0162 and 0e028465, merged in 3.0, refactored the
arguments of do_execve and compat_do_execve, such that "__argv"
is now the name of the incoming pointer, and "argv" is a local
struct user_arg_ptr. Our tapset must adapt to the new names.
* tapset/syscalls.stp (syscall.execve, syscall.compat_execve): Use
@defined to set an internal local __argv to either $__argv or $argv,
then use that for the other __get_argv calls.
* testsuite/buildok/twentyseven.stp: Update for $__argv vs. $argv.
* testsuite/systemtap.base/pointer_array.stp: Ditto.
Josh Stone [Thu, 14 Jul 2011 22:32:42 +0000 (15:32 -0700)]
rhbz717136: Fix SDT relocations in prelinked modules
* tapsets.cxx (sdt_query::handle_probe_entry): The debuginfoless SDT
addresses are relative to the ELF file, so get only that bias. The
DWARF bias is not interesting here.
(sdt_query::setup_note_probe_entry): Add the ELF bias to the semaphore
address too, so record_semaphore can completely relocate it.
(sdt_query::record_semaphore): SDT V3 semaphores need relocation too,
now removing both the bias and prelinking effects.
Petr Muller [Wed, 13 Jul 2011 16:41:47 +0000 (18:41 +0200)]
stap-serverd.cxx: fix memory and resource leaks
While playing with the cppcheck tool, I found a few resource leaks in stap-serverd.cxx:
- handleRequest: arg was not freed if opening/reading argfile failed
- handleRequest: argfile was not fclosed when reading from it failed
- spawn_and_wait: dotfd was not closed if chdir fails (macro expanded to accommodate resource release)
- spawn_and_wait: cleaned some whitespace up around the fix itself
Josh Stone [Wed, 13 Jul 2011 20:46:29 +0000 (13:46 -0700)]
PR12890 remote: Heed the capabilities of the other side
If the remote side is < 1.6, then it won't know staprun -R, so we'll
have to live without module renaming on that host.
* buildrun.cxx (make_run_command): Add a version parameter, defaulted to
the current VERSION. Don't add -R unless >= 1.6.
* remote.cxx (stapsh::set_child_fds): Save the handshake version.
(stapsh::start): Pass the remote's version to make_run_command.
(ssh_legacy_remote::start): Pass version 1.3 to make_run_command,
treating all "legacy" hosts as somewhat old.
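
The version check can be illustrated with a short Python sketch. The function
below is a simplified, hypothetical stand-in for make_run_command (the real
one in buildrun.cxx takes a session and builds a much longer command line);
the default of "1.6" assumes that is the current VERSION.

```python
def parse_version(text):
    """Turn "1.6" into (1, 6) so that "1.10" compares above "1.6"."""
    return tuple(int(part) for part in text.split("."))


def make_run_command(module, remote_version="1.6"):
    """Build a staprun argument list, adding -R only if the remote knows it."""
    cmd = ["staprun"]
    if parse_version(remote_version) >= (1, 6):
        cmd.append("-R")  # module renaming, understood from 1.6 onwards
    cmd.append(module)
    return cmd
```

Comparing parsed tuples rather than raw strings matters here: as strings,
"1.10" would sort below "1.6" and a newer host would wrongly lose -R.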
PR12890 cont'd: autoconf elfutils usage in staprun
* configure.ac, Makefile.am: Look for system elfutils.
Check for modern enough version (0.142+), set HAVE_ELF_GETSHDRSTRNDX.
* staprun_funcs.c (rename_module): Conditionally stub out.
* common.c (usage): Conditionally bury -R flag.
* staprun.cxx (init_staprun): Avoid advising people who can't to use -R.
* configure, config.in, aclocal.m4, Makefile.in: Regenerated on F15.
Josh Stone [Tue, 12 Jul 2011 19:08:46 +0000 (12:08 -0700)]
stapsh: Check staprun X_OK and increase error verbosity
* runtime/staprun/stapsh.c (do_run): Explicitly check that we have
execute permissions on staprun before spawning, so we can give a
better error message than just a non-zero status code.
* remote.cxx (stapsh::send_file): Report errors with any verbosity.
(stapsh::start): Report errors with any verbosity, and close handles
on failure so we don't try to wait for further activity.
Lukas Berk [Tue, 12 Jul 2011 19:06:09 +0000 (15:06 -0400)]
PR12729: Improve stap error message
Now report when the user doesn't have permission to run staprun
or if posix_spawnp is unable to launch the process
remote.cxx - finish now reports failure to launch
trycatch.exp - account for the new warning
util.cxx - report if staprun isn't executable or if stap_waitpid failed
Stan Cox [Tue, 12 Jul 2011 02:25:42 +0000 (22:25 -0400)]
PR6954 Add a used variables set for use by automatic global printing.
* staptree.h (varuse_collecting_visitor::used): New.
* staptree.cxx (varuse_collecting_visitor::visit_symbol): Use previous
method for setting read and write sets. Also set used set.
* elaborate.cxx (add_global_var_display): Use the used set.
* global_end.exp (global_end_var): Initialize to non-zero values.
* global_end.stp (global_end_var): Likewise.
- Use systemtap version number as the client/server protocol version
number.
- Client and Server will both be backward compatible with the other
by policy. i.e. both will adapt when connecting to a down-level
version of the other.
Mark Wielaard [Fri, 8 Jul 2011 14:53:10 +0000 (16:53 +0200)]
unwind.c consolidate sanity checking and cie/fde parsing.
Checking and parsing of FDE and CIE data was done in multiple places in
the code, some things were checked/parsed multiple times. Now is_fde()
and cie_for_fde() do the structural/id/version checking of FDEs
and CIEs. parse_fde_cie() parses the content of the CIEs and FDEs and
checks their internal consistency.
unwind.h: Removed now unused fields of struct unwind_state.
And removed unnecessary static function prototype declarations.
Josh Stone [Fri, 8 Jul 2011 01:26:28 +0000 (18:26 -0700)]
serverd: Fix the locale regex to allow '-' and not a range
[.-=] allows all characters from '.' to '=', 0x2E-0x3D
[.=-] allows exactly the characters '.' '=' '-'
The server_locale.exp test is supposed to check that '-' is allowed,
but its failure was incorrectly masked in commit 16560657. Added a
different disallowed character ';' to test as well.
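
The difference between the two bracket expressions can be checked with a
small Python sketch. Python's re module is used here only for illustration
(the server itself is C++ and its full pattern is not shown in this log), but
the range semantics inside a character class are the same.

```python
import re

# '-' between two characters forms a range: '.' (0x2E) through '=' (0x3D).
range_class = re.compile(r"^[.-=]+$")

# '-' placed last is a literal: exactly '.', '=' and '-'.
literal_class = re.compile(r"^[.=-]+$")


def allowed(pattern, text):
    """Return True if every character of text is in the class."""
    return pattern.match(text) is not None
```

With these definitions, allowed(range_class, "-") is False because '-' is
0x2D, just below the range, while allowed(range_class, ";") is True because
';' (0x3B) falls inside it. The literal class behaves the other way round,
which is why ';' makes a useful disallowed character for the test.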
Josh Stone [Tue, 21 Jun 2011 18:12:36 +0000 (11:12 -0700)]
PR5163: Cache uprobes.ko as we do with everything else
We now build uprobes in our writable tmpdir (rather than directly in
SYSTEMTAP_RUNTIME), and cache the result for reuse. This relieves the
pain of having to rebuild uprobes after every kernel change, and also
makes it possible to provide uprobes for multiple unique targets, as
needed for the compiler server and for remoting.
* buildrun.cxx (make_make_cmd): New, consolidate repeated code.
(compile_pass, make_tracequery, make_typequery_kmod): Use it.
(make_uprobes): Rewrite to build uprobes.ko under the tmpdir, just
using a #include to the main uprobes.c in the runtime.
(get_cached_uprobes, set_cached_uprobes): Read/write uprobes cache.
(uprobes_pass): Try to cache, then build if necessary.
(may_build_uprobes, verify_uprobes_uptodate, copy_uprobes_symbols):
Removed.
* hash.cxx (find_uprobes_hash): Prepare a hashed name for uprobes.
* main.cxx (passes_0_4): If the script was cached, make sure we still
find or build uprobes if needed too.
* stap-serverd.cxx (handleRequest): Get uprobes from the tmpdir rather
than from the runtime path, and sign it directly if needed.
* testsuite/lib/systemtap.exp (uprobes_p): Don't build uprobes here.
* testsuite/systemtap.base/buildid.exp: Launch a dummy pass-5 run, so we
don't have to worry about providing a path to staprun -u.
Stan Cox [Fri, 8 Jul 2011 01:41:14 +0000 (21:41 -0400)]
PR6954 Do automatic global printing for RMW operands.
* staptree.h (current_lvalue_read): New.
* staptree.cxx (varuse_collecting_visitor::visit_symbol): Do not
treat an RMW symbol as read if the value is not a real rvalue.
(varuse_collecting_visitor::visit_print_format): Handle current_lvalue_read.
(varuse_collecting_visitor::visit_assignment): Likewise.
(varuse_collecting_visitor::visit_delete_statement): Likewise.
* global_end.exp (global_end_var): Test RMW cases.
* global_end.stp (global_end_var): Likewise.
Mark Wielaard [Wed, 6 Jul 2011 21:07:51 +0000 (23:07 +0200)]
Silence sys/sdt.h comparison of unsigned expression < 0 is always false.
Some arm g++ setups would complain about the wchar_t "signedness detection":
sys/sdt.h:102: error: comparison of unsigned expression < 0 is always false
jistone said: "((T)(-1) < 1)" would still get the right boolean value,
and shouldn't trigger range errors like "unsigned is never < 0".
William Cohen [Wed, 6 Jul 2011 14:06:42 +0000 (10:06 -0400)]
PR12947 Properly track the creation of probes using hardware breakpoints
The logic for creation of probes using hardware breakpoints was incorrect.
The register_wide_hw_breakpoint() can return error codes in place of the
pointer. The value of the pointer needs to be checked to determine
whether it is in the range of values indicating an error. The setup
code also needs to properly track whether the registration was completed
on a probe, so the shutdown code only calls unregister_wide_breakpoint()
for the probes that are actually registered.
Mark Wielaard [Tue, 5 Jul 2011 15:25:23 +0000 (17:25 +0200)]
unwind.c Only do a linear search if there isn't a search header.
There should always be one; we create it in the translator
if it doesn't exist. Only if we are using elfutils < 0.142
should these ever be missing. If the binary search fails,
either because the unwind data is bad, or the address isn't
covered, don't fall back to linear searching. Add extra
checks and warning about bad debug frame header.
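
The lookup policy can be sketched in Python as below. The table layout
(sorted tuples of start, exclusive end and an FDE id) and the name find_fde
are simplifications assumed for illustration; the real search header stores
encoded addresses and the range check is done against the FDE itself.

```python
import bisect


def find_fde(table, pc):
    """Return the FDE id covering pc, or None if the table has no match.

    table is a list of (start, end, fde_id) tuples sorted by start,
    with end exclusive.
    """
    starts = [start for start, _end, _fde in table]
    i = bisect.bisect_right(starts, pc) - 1  # last entry with start <= pc
    if i < 0:
        return None  # pc lies before the first entry
    _start, end, fde_id = table[i]
    if pc >= end:
        return None  # pc falls in a gap: report failure, no linear scan
    return fde_id
```

A failed binary search returns None directly. Scanning the whole table
afterwards would cost time and could only succeed if the sorted table was
inconsistent to begin with.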
Mark Wielaard [Fri, 1 Jul 2011 17:33:02 +0000 (19:33 +0200)]
unwind.c fix CIE augmentation parsing.
fde_pointer_type and unwind_frame didn't handle the "S" augmentation
properly. This augmentation doesn't carry any extra data and so does
not need to be preceded by "z". Also added more sanity checking,
plus _stp_warn explanations when things go wrong or don't parse.
Mark Wielaard [Thu, 30 Jun 2011 12:14:14 +0000 (14:14 +0200)]
Handle CIE and FDE CFI sequentially in unwind.c instead of recursively.
Another simple unwind.c cleanup to make reasoning about the unwind state
easier, and to remove another (useless) recursive call. Also fixes CIE/FDE
comment mixup.
Mark Wielaard [Wed, 29 Jun 2011 15:20:30 +0000 (17:20 +0200)]
Refactor DW_CFA_remember/restore_state handling in unwind.c.
The old way of handling DW_CFA_restore_state involved recursively
calling processCFI() to replay the whole CFI stream up till the last
DW_CFA_remember_state instruction. This made it hard to reason about
the actual unwind state and could lead to processing the CFI stream
multiple times. In exchange for a little extra memory allocated upfront
to keep a small stack of register states we now just process all CFI in
one go. This change also splits out the unwind_reg_state from the
general unwind_state struct.
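
A rough Python model of the single-pass approach: a bounded stack of saved
register states replaces the recursive replay. The instruction tuples, the
function name process_cfi and the max_depth limit are assumptions for
illustration, and only one kind of register rule is modelled.

```python
def process_cfi(instructions, max_depth=4):
    """Run a CFI instruction list once and return the final register rules.

    Each instruction is a tuple: ("offset", reg, value),
    ("remember_state",) or ("restore_state",).
    """
    state = {}  # register number -> offset rule
    saved = []  # small stack of remembered register states
    for op, *args in instructions:
        if op == "offset":
            reg, value = args
            state[reg] = value
        elif op == "remember_state":
            if len(saved) >= max_depth:
                raise ValueError("remember_state stack overflow")
            saved.append(dict(state))  # copy, so later edits don't leak in
        elif op == "restore_state":
            if not saved:
                raise ValueError("restore_state without remember_state")
            state = saved.pop()
        else:
            raise ValueError("unknown CFI op: %r" % (op,))
    return state
```

Each instruction is visited exactly once, so the cost is linear in the
length of the stream, and the state at any point is just the current
dictionary plus the stack.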
William Cohen [Wed, 29 Jun 2011 19:32:41 +0000 (15:32 -0400)]
Properly report the return code (rc) for registered hw breakpoint (PR12947)
In the case of an error, the return code (rc) was not set when
registering a hw breakpoint. This led to the SystemTap instrumentation
module trying to unregister a nonexistent hw breakpoint and causing a
kernel oops. This patch ensures that the return code (rc) is set
and errors in hw breakpoint registration are handled correctly.
Dave Brolley [Tue, 28 Jun 2011 21:33:48 +0000 (17:33 -0400)]
check_groups cleanup in staprun.
- Remove unused extern "unprivileged_user"
- Correct error messages about which group memberships are necessary.
- Messages did not specify that stapusr is required if not root.