When probing a process with corrupted DWARF information, it was
possible to trigger a kernel-side division by zero. This fixes that.
Handle DW_OP_div/mod divide by zero. DW_OP_mod should work unsigned.
* loc2c.c (translate): Use helper functions div_op and mod_op for
DW_OP_div and DW_OP_mod operands. Set used_deref = true.
* translate.cxx (translate_runtime): Emit STAP_MSG_LOC2C_03 define.
* runtime/loc2c-runtime.h: Define dwarf_div_op and dwarf_mod_op macros.
* runtime/unwind.c (compute_expr): Check for zero before executing
DW_OP_mod or DW_OP_div.
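
The zero check described in the entry above can be sketched as follows. This is an illustration only, not the loc2c or unwinder code: the helper names, the 64-bit word size, and the use of an exception to report the fault are assumptions. DW_OP_div is treated as signed division and DW_OP_mod as unsigned, as the entry states.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit DWARF stack entries


class DwarfExprError(Exception):
    """Hypothetical stand-in for whatever fault the runtime reports."""


def to_signed(value):
    """Reinterpret a 64-bit word as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


def dwarf_div(dividend, divisor):
    """Signed division, refusing a zero divisor instead of faulting."""
    if divisor & MASK64 == 0:
        raise DwarfExprError("DW_OP_div: divide by zero")
    a, b = to_signed(dividend), to_signed(divisor)
    quotient = abs(a) // abs(b)      # truncate toward zero, as C does
    if (a < 0) != (b < 0):
        quotient = -quotient
    return quotient & MASK64


def dwarf_mod(dividend, divisor):
    """Unsigned modulo, refusing a zero divisor instead of faulting."""
    if divisor & MASK64 == 0:
        raise DwarfExprError("DW_OP_mod: divide by zero")
    return (dividend & MASK64) % (divisor & MASK64)
```

The point of the sketch is only that the divisor is inspected before the operation runs, so a corrupted expression produces an error rather than a trap.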
Lukas Berk [Wed, 11 May 2011 19:20:48 +0000 (15:20 -0400)]
Gettext a few lines, update /po and Makefiles
Makefile.am - don't include config.h runtime/staprun/config.h or git_version.h
Makefile.in - likewise
nsscommon.cxx - gettexted a few strings
po/* - regenerated files
David Smith [Wed, 11 May 2011 17:52:34 +0000 (12:52 -0500)]
Merge branch 'master' of git://oss.sgi.com/nathans/systemtap
* 'master' of git://oss.sgi.com/nathans/systemtap:
Resolve a couple of issues from missing log file handling code.
Minor cleanups after writing QA tests.
Add code to deal with log file rotation.
Unify separate tables for tracking log files, simpler code.
Several additional metrics for log file PCP agent.
Update the domain number comment now that one is reserved.
Uncomment the seek-to-end-of-log-file code in pmdalogger.
Remove further shared library remnants in pmdalogger build.
Frank Ch. Eigler [Wed, 11 May 2011 17:40:53 +0000 (13:40 -0400)]
dtrace python i18n: make work with autoconf wackiness
autoconf likes to expand some @vars@ in terms of shell-script-like
constructs like @LOCALEDIR@ = "${datarootdir}/locale" and
@datarootdir@ = "${prefix}/share". Since python doesn't interpolate
strings the same way as /bin/sh, this no workie. So we hard-code the
interpolation with a sequence of string.replace calls.
Confirmed working with LANG=fr_FR strace python ./dtrace |& grep /fr
* configure.ac: AC_SUBST a few more values.
* dtrace.in: Specially process ENABLE_NLS and similar values.
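
A minimal sketch of the hard-coded interpolation described in the entry above, assuming the substituted values arrive as plain strings. The function name and the dictionary layout are hypothetical; only the idea of repeated string replacement comes from the entry.

```python
def expand_autoconf(value, substitutions):
    """Expand ${name} references the way /bin/sh would, using str.replace."""
    # Values may refer to each other (datarootdir -> prefix), so repeat
    # until nothing changes, with a bound in case of a reference cycle.
    for _ in range(len(substitutions) + 1):
        previous = value
        for name, replacement in substitutions.items():
            value = value.replace("${" + name + "}", replacement)
        if value == previous:
            break
    return value


# Assumed example values, matching the shapes quoted in the entry.
localedir = expand_autoconf(
    "${datarootdir}/locale",
    {"datarootdir": "${prefix}/share", "prefix": "/usr"},
)
```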
Josh Stone [Wed, 11 May 2011 03:01:01 +0000 (20:01 -0700)]
remote: Add tests for manually-specified hosts
To run a basic test on hosts foo and bar, use:
make installcheck RUNTESTFLAGS=remote.exp TESTREMOTES=foo,bar
* testsuite/systemtap.base/remote.exp: New test of --remote hosts.
* testsuite/systemtap.base/remote.stp: New.
* testsuite/Makefile.am: Add TESTREMOTES control of remote.exp.
* testsuite/Makefile.in: Regenerate.
Josh Stone [Wed, 11 May 2011 00:33:59 +0000 (17:33 -0700)]
Create a signal-safe type for tracking spawned pids
* util.cxx (spawned_pids_t): New type which wraps a set<pid_t>, masking
signals on each access to ensure consistency in and out of the signal
handler. The spawned_pids global is the only instance.
(stap_waitpid): Use !contains(pid) rather than count(pid)==0.
(kill_stap_spawn): Use spawned_pids_t::killall().
Josh Stone [Wed, 11 May 2011 00:25:47 +0000 (17:25 -0700)]
Consolidate signal-masking into a utility class
* util.h (stap_sigmasker): New, masks our usual signals for the life of
the stap_sigmasker object.
* remote.cxx (direct_stapsh::direct_stapsh): Use stap_sigmasker while
spawning the stapsh child process.
(ssh_remote::connect): Ditto for the ssh process.
(remote::run): Use stap_sigmasker around the polling loop.
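
The scoped masking in the entry above can be illustrated with a Python context manager. This is a sketch of the general technique, not of stap_sigmasker itself; the class name and the particular signal set are assumptions, since the entry only says "our usual signals". It relies on signal.pthread_sigmask, which is available on Unix only.

```python
import signal

# Assumption: the entry does not list the signals, so this set is a guess.
USUAL_SIGNALS = {signal.SIGHUP, signal.SIGINT, signal.SIGTERM, signal.SIGPIPE}


class SigMasker:
    """Block a set of signals for the lifetime of a `with` block."""

    def __init__(self, signals=USUAL_SIGNALS):
        self.signals = set(signals)
        self.previous = None

    def __enter__(self):
        # SIG_BLOCK adds to the current mask and returns the old mask.
        self.previous = signal.pthread_sigmask(signal.SIG_BLOCK, self.signals)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore exactly what was in effect before, even on an exception.
        signal.pthread_sigmask(signal.SIG_SETMASK, self.previous)
        return False
```

Tying the mask to an object's lifetime means every exit path, including an early return or an exception, restores the previous mask.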
Josh Stone [Tue, 10 May 2011 23:45:54 +0000 (16:45 -0700)]
remote: Disambiguate the private target names
It's conceivable, however unlikely, that a user may have an actual host
named "direct" or "stapsh", which would conflict with our internal
methods if used as a --remote. Such a user could say "ssh://direct" to
be explicit, but we can also hide ours a little better. Those internal
names are now tested as proper URI schemes, e.g. "direct:...", so they
should never conflict with a user's legitimate target.
* remote.cxx (remote::create): Test for "direct" and "stapsh" only as
the scheme of a decoded URI.
* main.cxx (main): Use "direct:" for non-remote use.
* testsuite/systemtap.base/stapsh.exp: Use "stapsh:" for testing.
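
The disambiguation in the entry above can be sketched like this. The function name and the returned tuples are hypothetical; the only behaviour taken from the entry is that "direct" and "stapsh" are recognised solely as a scheme followed by a colon, so a bare host of the same name still goes to ssh.

```python
INTERNAL_SCHEMES = ("direct", "stapsh")


def classify_remote(target):
    """Return (method, rest) for a --remote argument."""
    scheme, colon, rest = target.partition(":")
    if colon and scheme in INTERNAL_SCHEMES:
        return scheme, rest            # "direct:..." or "stapsh:..."
    if target.startswith("ssh://"):
        return "ssh", target[len("ssh://"):]
    return "ssh", target               # a bare name is an ssh host
```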
Josh Stone [Tue, 10 May 2011 22:03:02 +0000 (15:03 -0700)]
PR12749: Replace popen calls with stap_spawn_piped
The new form has the advantages that child processes are managed by
signals to stap, and that arguments are provided in a vector so they
don't need to be escaped.
* dwflpp.cxx (dwflpp::iterate_over_libraries): Convert popen call to
stap_spawn_piped, followed by fdopen so the same FILE* operations are
still supported. Finish with fclose+stap_waitpid instead of pclose.
* tapsets.cxx (symbol_table::read_from_elf_file): Ditto.
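
The difference between a popen-style command string and an argument vector can be illustrated as below. This is a Python analogue using the standard subprocess module, not stap_spawn_piped; the function names are hypothetical.

```python
import subprocess
import sys


def spawn_piped(argv):
    """Start argv without a shell and return (process, readable stream)."""
    # Each element reaches the child as one argument, so spaces and
    # shell metacharacters need no escaping.
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    return proc, proc.stdout


def read_all_lines(argv):
    """Read the child's output, then close the stream and reap the child."""
    proc, stream = spawn_piped(argv)
    try:
        lines = [line.rstrip("\n") for line in stream]
    finally:
        stream.close()         # plays the role of fclose
        status = proc.wait()   # plays the role of stap_waitpid
    return status, lines


status, lines = read_all_lines(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "a b; $HOME"]
)
```

The awkward argument arrives in the child unchanged, which is the property the entry cites as the reason for dropping popen.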
Josh Stone [Tue, 10 May 2011 22:00:58 +0000 (15:00 -0700)]
Remove the unused git_revision()
The use of this function had been commented out for some time now, and
it contained an unescaped call to popen. Rather than trying to fix dead
code, just remove it altogether.
Dave Brolley [Tue, 10 May 2011 18:37:17 +0000 (14:37 -0400)]
Systemtap Compile Server Integration (rewrite):
- Rewrite stap-serverd in C++
- Rewrite related tools (stap-gen-cert, stap-authorize-cert, stap-sign-module)
in C++. Integrate functionality into stap-serverd.
- Remove stap-server-connect (integrated into stap-serverd).
- Move all common NSS related code into nsscommon.cxx (renamed from nsscommon.c).
- Rename modsign.cxx to stap-sign-module.cxx.
- Update test suite with new expected messages.
- Update man pages.
- Remove obsolete tools (scripts).
- Remove test for certutil from configuration.
Josh Stone [Fri, 6 May 2011 23:31:08 +0000 (16:31 -0700)]
uprobes: impedance match insn tables with test_bit()
The kernel's test_bit expects its bitmap to be const volatile, but we
had ours as simply const. On Fedora 15 with gcc 4.6, compiling uprobes
gave a few warnings like this:
arch/x86/include/asm/bitops.h:319:2: warning: use of memory input
without lvalue in asm operand 1 is deprecated [enabled by default]
That line is the asm statement in variable_test_bit().
The symptom noticed was that handle_riprel_insn was reading need_modrm:0
for opcode 0x89, when our table says it should be 1. Who knows what
other havoc ensued...
When our instruction tables are set const volatile to match test_bit(),
the warning goes away, and need_modrm is now computed correctly.
Stan Cox [Fri, 6 May 2011 20:00:11 +0000 (16:00 -0400)]
Use iterate_over_libraries for --ldd instead of invoking ldd
* dwflpp.cxx (iterate_over_modules): Make data an opaque type.
(iterate_over_libraries): Likewise.
* dwflpp.h: Likewise.
* translate.cxx (add_unwindsym_iol_callback): New.
(query_module): New.
(add_unwindsym_ldd): Use them to iterate_over_libraries instead of ldd.
* library.exp: Add --ldd test.
Josh Stone [Fri, 6 May 2011 07:18:06 +0000 (00:18 -0700)]
dtrace: Push main logic into an actual main()
* dtrace.in (main): New, invoked using Python's __name__ idiom.
(_provider.__typedef_append): Take add_typedefs as a parameter rather
than pulling from the formerly global scope.
(_provider.generate): Adjust __typedef_append calls.
Nathan Scott [Fri, 6 May 2011 03:45:46 +0000 (13:45 +1000)]
Add code to deal with log file rotation.
Depending on how the log file is rotated, it will often end up backed
by a new inode (new log file) as the previous one is renamed (usually
with yesterday's log timestamp). When this happens, we'll stop getting
events ... unfortunate.
So, make use of the stat data queried at start of fetch now to ensure
we detect this scenario, close the old file and switch to the new one
seamlessly. In the process, we need to make sure we handle the case
where the file doesn't exist.
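
A minimal sketch of rotation detection by comparing stat data, as the entry above describes. It is not the pmdalogger code: the class and method names are hypothetical, and it assumes a Unix filesystem where a renamed file keeps its inode while the new log gets a fresh one.

```python
import os


class LogFollower:
    """Follow a log file by path, surviving rename-style rotation."""

    def __init__(self, path):
        self.path = path
        self.file = None
        self.ident = None           # (device, inode) of the open file
        self._open(seek_end=True)   # skip whatever was logged before startup

    def _open(self, seek_end):
        try:
            handle = open(self.path, "r")
        except FileNotFoundError:
            self.file, self.ident = None, None
            return
        info = os.fstat(handle.fileno())
        if seek_end:
            handle.seek(0, os.SEEK_END)
        self.file, self.ident = handle, (info.st_dev, info.st_ino)

    def fetch(self):
        """Return new lines, switching files if the path was rotated."""
        try:
            info = os.stat(self.path)
        except FileNotFoundError:
            info = None             # the file may simply not exist right now
        lines = self.file.readlines() if self.file else []
        if info is not None and (info.st_dev, info.st_ino) != self.ident:
            if self.file:
                self.file.close()
            self._open(seek_end=False)   # read the new file from its start
            if self.file:
                lines += self.file.readlines()
        return lines
```

Draining the old handle before switching avoids losing lines written just before the rename.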
Lukas Berk [Tue, 3 May 2011 19:21:10 +0000 (15:21 -0400)]
PR12508 eventcounting script
An eventcount.stp script has now been added which allows for event
counting in the format of 'stap eventcount.stp syscall.* process.end ...'
with a printout of tid's event and count.
Nathan Scott [Tue, 3 May 2011 11:28:06 +0000 (21:28 +1000)]
Several additional metrics for log file PCP agent.
Add in counts of bytes and events seen per logfile, and the
current log file size for each. This will aid debugging and
day-to-day monitoring of more complex, hierarchical (r)syslog
deployments, for example.
Added a fetch callback so initial code to check status of the
monitored log files could be added - will need to extend this
though, to deal with the common case of logfile rotation.
Minor cleanups - pmid_string field is not a string, rename;
do not explicitly initialise global variables to zero, as the
compiler will happily do that; metric semantics for a couple
of metrics not-quite-right (discrete vs instant).
William Cohen [Mon, 2 May 2011 21:39:07 +0000 (17:39 -0400)]
Generate better file name for Tapset html
As a default xmlto uses reXXXX.html for file names for the generated html files.
This has two drawbacks: it is not meaningful to humans and is likely to change
between builds of the tapset reference manual. The added XMLTOHTMLPARAMS
option generates more mnemonic names for the files and generates files that
are less likely to change between builds.
Nathan Scott [Mon, 2 May 2011 07:27:17 +0000 (17:27 +1000)]
Uncomment the seek-to-end-of-log-file code in pmdalogger.
For real world uses, we need to do this so that even moderately
sized log files are not read from start to end whenever the PMDA
starts up - wastes CPU cycles and (re)generates events that may
have happened long ago.
Josh Stone [Fri, 29 Apr 2011 23:33:35 +0000 (16:33 -0700)]
remote: Normalize the shell used to invoke stapsh
We're using a little bit of shell magic on the remote side to tell
whether a system has stapsh available, but that magic depends on
specific shell functionality. We now explicitly start it with
/bin/bash, so the user's choice of $SHELL doesn't matter.
* remote.cxx (ssh_remote::connect): Wrap the command in /bin/bash -c.
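
The wrapping in the entry above can be sketched as follows. The helper name and the sample command are hypothetical, and the actual remote command used by stap is not reproduced here. The sketch relies on the fact that ssh hands the remote command to the user's login shell as a single string, so the inner command is quoted once with shlex.quote.

```python
import shlex


def wrap_for_bash(host, command):
    """Build an ssh argv that runs `command` under /bin/bash on `host`."""
    # The login shell only has to start bash; bash then interprets
    # `command`, whatever $SHELL is on the remote side.
    return ["ssh", host, "/bin/bash -c " + shlex.quote(command)]


argv = wrap_for_bash("foo", "echo hi; exit")
```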
Josh Stone [Fri, 29 Apr 2011 22:48:13 +0000 (15:48 -0700)]
Loop server requests over every remote's session
Now the list/trust-server options work on every unique session generated
by the requested remote targets. This enables usage such as:
$ stap --remote HOST --list-servers
* main.cxx (passes_0_4): Move the version/session banner to main.
(main): Print the version just once, and the session banner for each
unique session. Perform server actions within the session loop.
Josh Stone [Fri, 29 Apr 2011 22:43:14 +0000 (15:43 -0700)]
Restore signals after removing the tmpdir
This is necessary because passes_0_4_again_with_server() will remove the
tmpdir before trying again on a server. If it is then successful, the
subsequent run should be interruptible as normal.
* main.cxx (remove_temp_dir): Restore signals after removal.
Josh Stone [Thu, 28 Apr 2011 23:35:52 +0000 (16:35 -0700)]
remote: Support ssh on custom ports
* remote.cxx (remote::create): Heuristically disambiguate between URI
scheme:path and SSH host:port.
(ssh_remote::create): Parse an optional :port in the host string.
(ssh_remote::connect): Pass the port to ssh -p.
(ssh_legacy_remote::open_control_master): Pass the port to both
ssh -p and scp -P.
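
One way to picture the host:port handling in the entry above. The rule used here (an all-digit suffix after the last colon is a port) is an assumption, since the entry does not spell out the heuristic; the function names are hypothetical, and IPv6 literals are not handled. The -p and -P options are the standard port flags of ssh and scp.

```python
def parse_ssh_target(target):
    """Split "host:port" into (host, port); port is None when absent."""
    host, colon, port = target.rpartition(":")
    if colon and host and port.isdigit():
        return host, int(port)
    return target, None


def ssh_command(host, port, remote_argv):
    """ssh takes the port as lowercase -p."""
    argv = ["ssh"]
    if port is not None:
        argv += ["-p", str(port)]
    return argv + [host] + list(remote_argv)


def scp_command(host, port, local_path, remote_path):
    """scp takes the port as uppercase -P."""
    argv = ["scp"]
    if port is not None:
        argv += ["-P", str(port)]
    return argv + [local_path, host + ":" + remote_path]
```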
Josh Stone [Wed, 27 Apr 2011 22:23:05 +0000 (15:23 -0700)]
remote: Allow prefixing lines with the host index
When --remote-prefix is given, each line from remote scripts will be
prefixed with "N: ", where N is the index of that host among all the
--remote options on the command line.
* remote.h (class remote): Add an optional prefix string.
* remote.cxx (remote::run): Set the prefix if desired.
(stapsh::handle_poll): When using prefixes, read data line-wise.
* session.cxx (systemtap_session): Add bool use_remote_prefix.
(systemtap_session::parse_cmdline): Set it with --remote-prefix.
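
Line-wise prefixing as described in the entry above requires buffering, because data from a remote may arrive in chunks that end in the middle of a line. A minimal sketch, with a hypothetical class name:

```python
class LinePrefixer:
    """Prefix each complete line with "N: ", buffering partial lines."""

    def __init__(self, index):
        self.prefix = "%d: " % index
        self.pending = ""    # text received after the last newline

    def feed(self, data):
        """Accept a chunk and return the complete, prefixed lines."""
        self.pending += data
        *complete, self.pending = self.pending.split("\n")
        return [self.prefix + line + "\n" for line in complete]
```

Holding back the trailing fragment keeps output from different hosts from interleaving within a line.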
Stan Cox [Wed, 27 Apr 2011 02:29:24 +0000 (22:29 -0400)]
tolerate kernel builds with missing .note.gnu.build-id sections
* translate.cxx (dump_unwindsyms): Add back the check "Don't save
build-id if it is located before _stext."
* runtime/sym.c (_stp_module_check): Don't do buildid check for
-DSTP_NO_BUILDID_CHECK.
(_stp_usermodule_check): Likewise.
* buildid.exp: Test -DSTP_NO_BUILDID_CHECK
David Smith [Tue, 26 Apr 2011 18:56:30 +0000 (13:56 -0500)]
Added 'logger.perfile.{LOGFILE}.path' metric in the logger PMDA.
* pcp/src/pmdas/logger/logger.c (logger_fetchCallBack): Add support for
new 'logger.perfile.{LOGFILE}.path' metric, which returns the logfile
pathname.
David Smith [Tue, 26 Apr 2011 15:00:41 +0000 (10:00 -0500)]
Added help text for the dynamic metrics in the logger PMDA.
* pcp/src/pmdas/logger/logger.c (logger_text): New function that provides
help for the dynamic metrics.
(logger_init): Set up dynamic metrics help text.
* pcp/src/pmdas/logger/help: Removed commented out help for the dynamic
metrics.
David Smith [Mon, 25 Apr 2011 17:46:01 +0000 (12:46 -0500)]
Fix setjmp.exp and server tests for RHEL5 systems.
* testsuite/lib/systemtap.exp (start_server): To properly support older
versions of tcl (as on RHEL5 systems), don't use the newer form of
'catch'. Instead, use the global 'errorCode' variable when getting the
exit code of a child process.
* testsuite/systemtap.base/setjmp.exp: Ditto.
David Smith [Thu, 21 Apr 2011 18:45:46 +0000 (13:45 -0500)]
Added more testsuite error handling.
* testsuite/lib/systemtap.exp (start_server): Catch errors when starting a
server.
* testsuite/systemtap.base/buildid.exp: Be sure to call wait after spawn.
David Smith [Tue, 19 Apr 2011 21:41:47 +0000 (16:41 -0500)]
Improved error handling in uprobe probes.
* tapsets.cxx (uprobe_derived_probe_group::emit_module_decls): Make sure
we go through the probe epilogue even on an error in the emitted
enter_uprobe_probe().
In the generated module_exit() function, there exists an infinite loop
that waits for all probe handlers that might still be struggling along
to shut down, before authorizing removal of the probe module. If
something is stuck though, it will stay stuck without any diagnostics
short of a hung stapio/staprun process that's sitting in the kernel.
This patch causes a message to be printk KERN_ERR'd, in case the
shutdown synchronization takes more than a second.
* translate.cxx (emit_module_exit): Print a message if holdon-spinning
for more than a second.
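
The shutdown wait with a diagnostic can be sketched as below. This is a user-space illustration, not the generated module_exit() code: the function and parameter names are hypothetical, and reporting only once is an assumption the entry does not confirm.

```python
import time


def wait_for_handlers(busy_count, report, warn_after=1.0,
                      clock=time.monotonic, pause=time.sleep):
    """Spin until no handlers are busy; complain once if it takes too long."""
    start = clock()
    warned = False
    while busy_count() > 0:
        if not warned and clock() - start > warn_after:
            report("probe shutdown taking longer than %.0f second(s)" % warn_after)
            warned = True    # assumption: one message is enough
        pause(0.01)          # yield briefly instead of burning the CPU
    return warned
```

The loop still waits indefinitely, as before; the only change is that a stuck handler now leaves a trace.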