loc2stap.cxx: Add partial support for DW_OP_bra in DWARF location lists
Add support for DW_OP_bra when operand is non-negative. Previously
systemtap would quit probe translation if DW_OP_bra was seen in a
DWARF location list.
Tested manually on RHEL 8.9 with kernel 4.18.0-513.24.1.el8_9.x86_64.
Scripts containing a vfs.read probe require DW_OP_bra support when
run with this kernel.
Support for DW_OP_bra negative operands continues to be deferred due
to lack of use as well as being more complex to implement.
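As a rough illustration of the forward-only restriction, here is a toy evaluator for a tiny subset of DWARF expressions. It is an interpreter sketch, not the translation that loc2stap.cxx performs; demo_eval and the DEMO_* names are hypothetical, and the 2-byte operand is assumed to be little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as in the DWARF specification: DW_OP_bra, DW_OP_lit0..31. */
enum { DEMO_OP_BRA = 0x28, DEMO_OP_LIT0 = 0x30, DEMO_OP_LIT31 = 0x4f };
#define DEMO_STACK_MAX 16

/* Evaluate expr; return 0 and store the top of stack in *result,
 * or return -1 for anything this sketch does not support. */
static int demo_eval(const uint8_t *expr, size_t len, long *result)
{
    long stack[DEMO_STACK_MAX];
    size_t sp = 0, pc = 0;

    while (pc < len) {
        uint8_t op = expr[pc++];

        if (op >= DEMO_OP_LIT0 && op <= DEMO_OP_LIT31) {
            if (sp == DEMO_STACK_MAX)
                return -1;
            stack[sp++] = op - DEMO_OP_LIT0;
        } else if (op == DEMO_OP_BRA) {
            unsigned int raw;

            if (sp == 0 || len - pc < 2)
                return -1;
            raw = expr[pc] | ((unsigned int)expr[pc + 1] << 8);
            pc += 2;
            if (raw >= 0x8000u)
                return -1;      /* negative operand: backward branch, unsupported */
            if (stack[--sp] != 0) {
                if (raw > len - pc)
                    return -1;  /* branch target past the end of the expression */
                pc += raw;      /* skip forward over the next raw bytes */
            }
        } else {
            return -1;          /* opcode outside this sketch's subset */
        }
    }
    if (sp == 0)
        return -1;
    *result = stack[sp - 1];
    return 0;
}
```

Rejecting a negative operand up front means the evaluator never has to reason about loops, which is one reason forward-only support is the simpler case.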
William Cohen [Tue, 23 Apr 2024 16:11:54 +0000 (12:11 -0400)]
Update the include files with exit reasons in kvm_service_time.stp
The Linux git commit af170c5061dd moved the location of the include
files with the SVM_EXIT_* and EXIT_REASON_* defines from
linux/arch/x86/include/asm to linux/arch/x86/include/uapi/asm. The
kvm_service_time.stp has been updated to print out the current include
file names so a user of the kvm_service_time.stp example script has
an easier time mapping the exit reason numbers reported to the
defines.
William Cohen [Wed, 17 Apr 2024 14:08:52 +0000 (10:08 -0400)]
Make probing NFSD V2 probe points optional in buildok/nfsd-detailed.stp test
Newer kernels have removed NFSD V2 support (CONFIG_NFSD_V2 is not set).
The nfsd.proc2.* probes need to be made optional as those probe
points are not available.
William Cohen [Wed, 17 Apr 2024 13:48:40 +0000 (09:48 -0400)]
Update the nfs.stp tapset for NFS folio support
The addition of folio support to NFS in the Linux kernel has changed
some of the functions that are available for NFS operations. Probes
for those new functions (nfs_read_folio and nfs_readahead) were added.
The nfs tapset has to be a bit more flexible in which functions are
available and the probes are optional to allow wildcards to continue
to work. The way counts are obtained for the nfs.fop.read_iter and
nfs.fop.write operations was also updated.
William Cohen [Fri, 19 Apr 2024 13:58:10 +0000 (09:58 -0400)]
Use different kernel code to exercise functioncallcount.stp example
Switch the functioncallcount.stp example from counting functions in
the memory management subsystem ("*@mm/*.c") to counting functions in
the file system ("*@fs/*.c"). On some machines such as ppc64 and
arm64, wholesale probing of all the functions in the memory management
subsystem has been problematic.
William Cohen [Tue, 16 Apr 2024 16:46:39 +0000 (12:46 -0400)]
Avoid redefinition of S390 PSW_ADDR_AMODE and PSW_ADDR_INSN in newer kernels
Only define PSW_ADDR_AMODE and PSW_ADDR_INSN if they are undefined.
The following Linux kernel git commit (b8af5999779d1) moved the definition of
PSW_ADDR_AMODE and PSW_ADDR_INSN from arch/s390/include/uapi/asm/ptrace.h
to arch/s390/include/asm/ptrace.h causing an error as both runtime/regs.h and
arch/s390/include/asm/ptrace.h headers were defining them:
Author: Heiko Carstens <hca@linux.ibm.com> 2023-06-21 07:35:43
Committer: Alexander Gordeev <agordeev@linux.ibm.com> 2023-07-03 05:19:39
Parent: 6376402841e1fa6f1c5b7604abc9c746a84c715a (s390/ptrace: remove PSW_DEFAULT_KEY from uapi)
Child: b378a982614360686f45c3e6b63fd5d1acd02d08 (s390: include linux/io.h instead of asm/io.h)
Branches: master, remotes/origin/master
Follows: v6.4
Precedes: v6.5-rc1
s390/ptrace: make all psw related defines also available for asm
Use the _AC() macro to make all psw related defines also available for
assembler files.
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
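The guard described above can be sketched as follows. The numeric values are illustrative placeholders rather than values copied from the kernel or from runtime/regs.h, and psw_insn_address is a hypothetical helper added only to show the macros in use.

```c
/* Define the macros only when no earlier header has already done so. */
#ifndef PSW_ADDR_AMODE
#define PSW_ADDR_AMODE 0x0000000000000000UL   /* placeholder value */
#endif
#ifndef PSW_ADDR_INSN
#define PSW_ADDR_INSN 0xFFFFFFFFFFFFFFFFUL    /* placeholder value */
#endif

/* Hypothetical helper: keep only the instruction-address bits. */
static unsigned long psw_insn_address(unsigned long psw_addr)
{
    return psw_addr & PSW_ADDR_INSN;
}
```

With the guards in place the same source builds whether or not the kernel header supplies the definitions.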
William Cohen [Wed, 3 Apr 2024 17:51:27 +0000 (13:51 -0400)]
Update nfsd.stp tapset to work with Linux 6.1 and newer kernels
The kernel git commit 0cfb0c4228a5c8e2 uses the
DEFINE_PROC_SHOW_ATTRIBUTE macro which creates another function
nfsd_open() in fs/nfsd/stats.c in addition to nfsd_open() in
fs/nfsd/vfs.c. This new function has a completely different purpose
and different set of arguments than the original. When the nfsd.stp
tapset tries to instrument this new nfsd_open() it cannot find the
expected arguments. Systemtap scripts probing the nfsd_open()
function such as the nfsd-recent.stp example failed to build as a
result. The tapset now restricts the probing to
nfsd_open@fs/nfsd/vfs.c, the original nfsd_open() function.
William Cohen [Sun, 31 Mar 2024 00:56:36 +0000 (20:56 -0400)]
Update memory.stp tapset to allow the vm.tracepoints.stp example to work
Due to kernel git commits 71baba4b92dc1 (renaming __GFP_WAIT to
__GFP_RECLAIM) and 2c1d697fb8ba6 (changing the kmem_cache_alloc
tracepoint arguments) the memory.stp tapset needed some adjustments to
enable the SystemTap tracepoints.stp example to continue to work with
newer kernels.
William Cohen [Fri, 29 Mar 2024 18:28:21 +0000 (14:28 -0400)]
Adjust hugepage_cow_delays.stp to work with newer kernels
Kernel git commit c0e8150e144b6 changed the function handling the
copy-on-write operations for hugepages from copy_user_huge_page to
copy_user_large_folio. Made hugepage_cow_delays.stp use the new
function name when it is available.
William Cohen [Fri, 22 Mar 2024 14:29:49 +0000 (10:29 -0400)]
PR31500: Never allow probing of kernel __init or __kprobes functions
When guru mode was used it was possible to get systemtap to instrument
kernel functions marked with __init or __kprobes. By the time that
systemtap instrumentation is being loaded a kernel __init marked
function has already run and may be in a section of memory that has
been freed up. At best this probe will never trigger. At worst the
registration of the probe will cause a memory fault causing the
process to be killed. Also probes shouldn't be allowed on __kprobes
functions as a rule.
William Cohen [Wed, 20 Mar 2024 14:24:53 +0000 (10:24 -0400)]
Remove unneeded guru mode option from poll_map.exp
Guru mode should only be used when it is really needed to allow the
systemtap script to change program state or to disable some safety
checks or blacklist exclusions. With guru mode enabled on a particular
machine this test would attempt to probe
kernel.function("vfs_caches_init").call, an initialization function on
a page that would later be freed. The script would get a page fault
when attempting to install the kprobe for this function.
William Cohen [Tue, 19 Mar 2024 20:09:52 +0000 (16:09 -0400)]
Allow systemtap --target-namespace=PID option to work with Intel IBT
On Intel systems with IBT the systemtap runtime code to implement
--target-namespace=PID would cause a trap to occur. The runtime
indirect calls are now properly wrapped and will execute without issue
on machines supporting Intel IBT.
There are other things that objtool is doing in addition to checking
user accesses and disabling objtool with newer RHEL9
5.14.0-428.el9.x86_64 causes the system to reboot when setting up some
tracepoint probes (PR30472).
William Cohen [Thu, 7 Mar 2024 18:44:06 +0000 (13:44 -0500)]
PR30716: Turn off objtool warnings on systemtap instrumentation modules
The previous approaches to turning off the objtool warnings did not
work for x86_64 RHEL9. The systemtap generated code is not on the
whitelist to use certain kernel functions. The additional objtool
warning output mentioning the systemtap code using those functions
with UACCESS enabled caused a number of the tests in the testsuite to
fail. The generated Makefile now includes a line to turn off running
objtool on the systemtap generated module and eliminates those
warnings.
William Cohen [Mon, 4 Mar 2024 21:27:18 +0000 (16:27 -0500)]
PR31117: Correct handling of transport layer allocated memory
The _stp_print_flush() code was not correct. There are four possible
ranges of values compared to the header size (hlen)
_stp_data_write_reserve() could return when beginning to write out
log:
<0 unable to allocate any space
<hlen pad out the allocated space and try another allocation
==hlen just enough space for the initial header
>hlen write out the header and some portion of log
The case where the space allocated was equal size of the header
(==hlen) was not handled correctly. In the cases where there was only
enough room to write the header the _stp_transport_failures variable
was incremented and none of the log data was written out. The correct
course of action in these cases would be to write the header out in
the allocated space and start looping to write the rest of the log
data.
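The four ranges above can be written out as a small classifier. This is only a sketch: the enum and classify_reserve are hypothetical names, and the real _stp_print_flush() acts on these cases rather than returning a label.

```c
#include <stddef.h>

enum reserve_action {
    RESERVE_FAILED,      /* <0: unable to allocate any space */
    RESERVE_PAD_RETRY,   /* <hlen: pad out the space, try another allocation */
    RESERVE_HEADER_ONLY, /* ==hlen: write the header, loop for the log data */
    RESERVE_HEADER_DATA  /* >hlen: write the header and some of the log */
};

/* Map the size returned by the reserve call onto one of the four cases. */
static enum reserve_action classify_reserve(long reserved, size_t hlen)
{
    if (reserved < 0)
        return RESERVE_FAILED;
    if ((size_t)reserved < hlen)
        return RESERVE_PAD_RETRY;
    if ((size_t)reserved == hlen)
        return RESERVE_HEADER_ONLY;
    return RESERVE_HEADER_DATA;
}
```

Keeping ==hlen as its own case makes it harder to fold it into the failure path by accident.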
William Cohen [Wed, 28 Feb 2024 15:43:51 +0000 (10:43 -0500)]
PR31404: Make tracepoint queries work with gcc14
The Fedora rawhide Linux 6.8 kernels are built with gcc14 and include
-Wmissing-prototypes in the CFLAGS options. When building the
kernel modules to query the available tracepoints errors occur
resulting in no kernel tracepoints being found. The fix is to
include a function declaration before the function definition
in the DECLARE_TRACE macro.
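A minimal sketch of the idea, using a hypothetical DEMO_DECLARE_QUERY macro rather than the real DECLARE_TRACE: the macro emits a prototype immediately before the definition, so -Wmissing-prototypes has nothing to report.

```c
/* Hypothetical macro: declare first, then define the same function. */
#define DEMO_DECLARE_QUERY(name)        \
    int stapquery_##name(void);         \
    int stapquery_##name(void) { return 1; }

/* Expands to a prototype followed by the definition. */
DEMO_DECLARE_QUERY(sched_switch)
```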
William Cohen [Tue, 20 Feb 2024 14:22:34 +0000 (09:22 -0500)]
Get SHM_* flag defines from the appropriate include file for Linux 6.8 kernels
Linux git commit bc46ef3cea3d6f6 removed the include/uapi/linux/shm.h
from include/linux/shm.h. For the newer Linux 6.8 kernels systemtap needs to get
SHM_* defines directly from include/uapi/linux/shm.h.
William Cohen [Thu, 15 Feb 2024 20:01:53 +0000 (15:01 -0500)]
PR19360: Correct lwtools fslatency-nd.stp and fsslower-nd.stp
Reviewed examples to ensure that the entry value for a function
argument is used for function return probes. Found that
__vfs_write.return probes aliases were missing ".return" and needed an
@entry() for the argument fetch in fslatency-nd.stp and
fsslower-nd.stp.
William Cohen [Wed, 14 Feb 2024 14:33:30 +0000 (09:33 -0500)]
PR31373: Deal with the removal of strlcpy() from linux 6.8
The Linux 6.8 kernels removed strlcpy() with git commit d26270061a in
January 2024. All the kernel's strlcpy() uses were converted to
strscpy(). Systemtap needed to do the same. This is implemented in
systemtap with a strlcpy macro in the runtime that translates the
strscpy() return value into the equivalent strlcpy() value.
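The return-value translation can be sketched in user space as below. demo_strscpy is a stand-in that mimics the kernel's strscpy() contract (bytes copied, or -E2BIG on truncation), and demo_strlcpy is a hypothetical function rather than the macro the runtime actually uses.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Stand-in for strscpy(): always NUL-terminates when size > 0; returns
 * the number of bytes copied, or -E2BIG if src did not fit. */
static ssize_t demo_strscpy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -E2BIG;
    len = strlen(src);
    if (len >= size) {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -E2BIG;
    }
    memcpy(dst, src, len + 1);
    return (ssize_t)len;
}

/* strlcpy()-style result: the length of src, truncated or not. */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
    ssize_t rc = demo_strscpy(dst, src, size);

    return rc < 0 ? strlen(src) : (size_t)rc;
}
```

Callers that compare the result against the buffer size to detect truncation keep working unchanged.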
William Cohen [Mon, 5 Feb 2024 19:37:18 +0000 (14:37 -0500)]
Update the aux_syscall.stp tapset to directly include <uapi/linux/wait.h>
The linux kernel git commit 6dfeff09d5ad33190 removes the include for
<uapi/linux/wait.h> from <linux/wait.h>. The kernel has had
<uapi/linux/wait.h> header for over a dozen years (kernel git commit
607ca46e97a1b65) and systemtap should just use that directly. The
downside of this change is that systemtap will require a Linux 3.7 or
newer kernel.
William Cohen [Thu, 1 Feb 2024 18:31:43 +0000 (13:31 -0500)]
Fix task_start_time for newer kernels
Kernel git commit cf25e24db61cc9d renames real_start_time member of
the task_struct to start_boottime. The task_start_time function needs
to be adjusted to handle this new name.
a) fopen@@GLIBC_2.2.5 exists in the updated symtab
b) fopen does not exist in the updated symtab
This PR is to add a version info padding when symbol cannot be found in
the updated symtab, so systemtap can support searching symbol aliases like
this:
$ stap -L 'process("/lib64/libc.so.6").function("fopen")'
# And with wildcard, like this
$ stap -L 'process("/lib64/libc.so.6").function("fo*en")'
Frank Ch. Eigler [Thu, 25 Jan 2024 19:28:38 +0000 (14:28 -0500)]
PR31288: build with gcc14
GCC14 makes -Wmissing-prototypes defaultish on, which triggers on such
gentle-spirited code as:
void foo(void) { }
when you should darn well know to have an exact duplicate declaration
prototype first. Because of course.
void foo(void);
void foo(void) { }
So anyway, with our fondness for -Werror, this broke the stap runtime
autoconf* business, bits of the runtime, bits of the translator.
Probably more stuff as yet unidentified. If your testsuite logs show:
[...]: error: no previous prototype for ‘[...]’ [-Werror=missing-prototypes]
this is probably to blame.
Since this is coming to clang as well, we now get buildrun.cxx to
force -Wmissing-prototypes on all the time, so as to try to notice
occurrences of this problem earlier.
Martin Cermak [Thu, 25 Jan 2024 10:46:50 +0000 (11:46 +0100)]
PR26843: print_ubacktrace_fileline() fails with PIE binaries
Ubuntu has its GCC configured with --enable-default-pie. The
binaries it produces by default are DYN (Position-Independent
Executable file). This isn't reflected in the producer record.
For processing PIE binaries, additional relocation is needed in
the stap runtime.
Martin Cermak [Thu, 25 Jan 2024 10:37:07 +0000 (11:37 +0100)]
Fix width of _stp_filename_lookup_5's offset to .debug_line_str
The read_pointer( ... DW_EH_PE_data4 ... ) gives a 4 byte value.
Elfutils' readelf.c does this with read_4ubyte_unaligned_inc().
Adjust the storage width for such offset to prevent overflow.
Problem demonstrated with context.exp / symfileline.tcl.
William Cohen [Tue, 23 Jan 2024 18:09:45 +0000 (13:09 -0500)]
PR31117: Eliminate some transport failures
The headers for messages cannot span subbuffers. Depending on the
previous messages the remaining space left in a subbuffer may be too
small for a header. The code would give up and drop that particular
message when the code found that there was not enough space to write
the entire header. The revised code now zeros out that small block
allocated and tries again to allocate buffer space for the message
before giving up. In the common case the next subbuffer has plenty of
space for the header and the rest of the message.
Martin Cermak [Mon, 22 Jan 2024 14:26:59 +0000 (15:26 +0100)]
Fix gates emitting .debug_line_str into stap_symbols.c
The symfileline subtest of context.exp pointed to a situation
where debug_line_str data weren't emitted to stap_symbols.c while they
were needed. Align the conditions for emission of .debug_line and
.debug_line_str with conditions for their use.
Martin Cermak [Mon, 22 Jan 2024 13:57:13 +0000 (14:57 +0100)]
testsuite: Update context.exp
This improves test results of context.exp. First it makes the
two .ko's buildable with modern kernels. The module code seems
to be based on LTP's crasher testcase, and the source should be
usable with 2.6.32 onwards.
I've been facing a problem where systemtap_test_module1 was unable
to see symbols exported by systemtap_test_module2 via EXPORT_SYMBOL.
Building both modules using one single makefile seems to solve the
problem. It also simplifies the testcase. The makefile2 isn't needed.
Adjust timing in the .tcl files so that expect gates work as needed.
This timing update was already in symfileline.tcl (kludge warning) and
this timing update now is in all the tcl files.
Tim Haines [Thu, 18 Jan 2024 19:33:44 +0000 (14:33 -0500)]
PR31242: Support namespaced include directories in Dyninst
Dyninst now has some namespaced include directories (e.g.,
dyninst/registers/MachRegister.h). This requires adding the top-level
include to the compile flags to find the headers in the flat namespace
included from a namespaced directory.
Martin Cermak [Mon, 8 Jan 2024 22:12:42 +0000 (23:12 +0100)]
testsuite: Fix syscall test runtime problems
This update is mostly addressing SEGVs in syscall tests
compiled as a 32 bit binary on x86_64. The 32-bit syscall
wrappers typically convert one structure to another by
dereferencing individual struct members. This is a problem
when pointer to the said struct is invalid (-1).
bpf-translate.cxx: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed-args`)
`gcc-14` added a new `-Wcalloc-transposed-args` warning recently. It
detected minor infelicity in `calloc()` API usage in `systemtap`:
bpf-translate.cxx: In function 'bpf::BPF_Section* bpf::output_probe(BPF_Output&, program&, const std::string&, unsigned int)':
bpf-translate.cxx:5044:39: error: 'void* calloc(size_t, size_t)' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
5044 | bpf_insn *buf = (bpf_insn*) calloc (sizeof(bpf_insn), ninsns);
| ^~~~~~~~~~~~~~~~
bpf-translate.cxx:5044:39: note: earlier argument should specify number of elements, later size of each element
Victor Kamensky [Mon, 18 Dec 2023 05:01:35 +0000 (21:01 -0800)]
Makefile.am: fix build with --with-debuginfod=/path configure option
While I was testing my previous fix with libdebuginfod auto detection
failure I've noticed that configure option --with-debuginfod=/path does
not work in case if system does not have elfutils-debuginfod-client-devel.
I had external elfutils branch with debuginfod metadata change installed
at /path and when I've tried to build SystemTap with it, it was failing
in multiple places. My system is FC38.
It boils down to a couple of issues that apply to several Makefile.am files.
1. util.cxx is a C++ file so debuginfod_CFLAGS should be added to _CXXFLAGS
flags, rather than just _CFLAGS
2. debuginfod_LDFLAGS should be added to _LDFLAGS, otherwise link command
does not get proper -L flag
Signed-off-by: Victor Kamensky <victor.kamensky7@gmail.com>
Victor Kamensky [Mon, 18 Dec 2023 05:01:34 +0000 (21:01 -0800)]
configure.ac: fix broken libdebuginfod library auto detection
After 2e67b053e3796ee7cf29a39f9698729b52078406 "configury: rework debuginfod searches"
commit, libdebuginfod.so library auto detection is broken. It was reported by Martin Jansa
on openembedded-core mailing list [1].
Currently configure.ac does "AC_DEFINE([HAVE_LIBDEBUGINFOD], [1] ..." as long as
no --without-debuginfod option is passed, regardless of the PKG_CHECK_MODULES check result.
It seems to be bad copy/paste. Address the issue by moving the AC_DEFINE back to
PKG_CHECK_MODULES action-if-found block.
To reproduce the issue on FC system, one can do the following
"sudo dnf remove elfutils-debuginfod-client-devel" and then try to build SystemTap
util.cxx will fail to compile because of missing elfutils/debuginfod.h because
config.h will have "#define HAVE_LIBDEBUGINFOD 1", while config.log and configure
output indicates that check for libdebuginfod library failed.
staprun: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed-args`)
`gcc-14` added a new `-Wcalloc-transposed-args` warning recently. It
detected minor infelicity in `calloc()` API usage in `systemtap`:
staprun.c: In function 'main':
staprun.c:550:50: error: 'calloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
550 | char ** new_argv = calloc(sizeof(char *),argc+2);
| ^~~~
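For reference, the argument order the warning asks for is element count first, element size second. The helper below is a hypothetical sketch, not the staprun code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy argv into a zero-filled vector with two
 * spare slots, so the result is always NULL-terminated. */
static char **copy_argv_with_room(int argc, char **argv)
{
    /* calloc(number of elements, size of each element) */
    char **new_argv = calloc((size_t)argc + 2, sizeof(char *));

    if (new_argv == NULL)
        return NULL;
    memcpy(new_argv, argv, (size_t)argc * sizeof(char *));
    return new_argv;
}
```

Both orders allocate the same number of bytes, which is why the transposed form worked at run time and only the new warning exposed it.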
PR31060: Dynamically switch mount namespaces for the target.
When --target-namespace=PID is specified, the early uprobe registration
and target program file path resolving in task finder should switch the
current mount namespace to the target program's.
Otherwise we'd see errors like this:
ERROR: Couldn't resolve target program file path '/newtarget/a.out': -2
We switch mount namespaces in the kernel space around kern_path() calls
because the kernel module also fiddles with files like relayfs in the
original mount namespace. The granularity of calling setns() on userland is
just too coarse.
For older kernels like those on CentOS/RHEL 7, Ubuntu 16.04 and Debian 9,
we use a hack to call the setns() syscall directly from within the
kernel space. Newer kernels have a good enough kernel C API to emulate
setns() in the kernel space.
Refactored some of the code for kernel.data(*).* probepoints to the stap runtime.
This makes development easier by avoiding updating a lot of code in
the translator all the time.
This also makes the upcoming process.data(*) probepoints easier to
implement.
Additionally, we made it abort when the first hw breakpoint fails
to register instead of going on registering the remaining hw
breakpoints. Added tests to cover this change.
PR31176: fix Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
The kernel's lockdep finds deadlocks in the stap memory pool allocator's
spinlocks when mixing NMI and non-NMI contexts.
Now we use trylock in the NMI context to avoid waiting forever on a lock
held by a non-NMI context (which is interrupted by NMI).
This bug can be reproduced by running TEST 4 in
testsuite/systemtap.base/kernel-hw-breakpoint-addr.exp on a lockdep
kernel that is recent enough (like 5.11.22).
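The trylock approach can be sketched in user space with a toy lock built on C11 atomics. All names here are hypothetical and the "pool" is just a counter; the point is only that the NMI path gives up instead of spinning on a lock whose holder it has interrupted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock standing in for a kernel spinlock. */
typedef struct { atomic_flag held; } demo_lock;

static bool demo_trylock(demo_lock *l)
{
    /* test_and_set returns the previous state: false means we took it. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void demo_unlock(demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Take one slot from the pool; return 0 on success, -1 on failure. */
static int demo_pool_take(demo_lock *l, bool in_nmi, int *free_slots)
{
    int rc = -1;

    if (in_nmi) {
        if (!demo_trylock(l))
            return -1;      /* the holder cannot run until we return */
    } else {
        while (!demo_trylock(l))
            ;               /* ordinary contexts may spin */
    }
    if (*free_slots > 0) {
        (*free_slots)--;
        rc = 0;
    }
    demo_unlock(l);
    return rc;
}
```

The cost is that an allocation in NMI context can fail even when the pool has free slots, so callers must already tolerate allocation failure.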
William Cohen [Mon, 18 Dec 2023 01:40:56 +0000 (20:40 -0500)]
PR30831: Improve error handling of systemtap workqueue
During the startup of systemtap instrumentation a workqueue is
created. It is possible that the kernel is unable to allocate space
for the workqueue. The initialization for this case should return
-ENOMEM to end the loading of the instrumentation module. There are
other ways that the systemtap instrumentation may fail during
initialization after the creation of the workqueue. In those cases
the workqueue needs to be destroyed to avoid leaking resources.
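The two failure paths can be sketched as follows, with hypothetical stand-ins for the workqueue and for whatever initialization step runs after it. A counter tracks live workqueues so that a leak would be visible.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel workqueue. */
struct demo_wq { int unused; };
static int demo_live_workqueues;

static struct demo_wq *demo_create_wq(int fail)
{
    struct demo_wq *wq = fail ? NULL : malloc(sizeof(*wq));

    if (wq != NULL)
        demo_live_workqueues++;
    return wq;
}

static void demo_destroy_wq(struct demo_wq *wq)
{
    free(wq);
    demo_live_workqueues--;
}

/* later_rc models the result of any init step after workqueue creation. */
static int demo_module_init(int wq_fails, int later_rc, struct demo_wq **out)
{
    struct demo_wq *wq = demo_create_wq(wq_fails);

    if (wq == NULL)
        return -ENOMEM;         /* could not allocate the workqueue */
    if (later_rc != 0) {
        demo_destroy_wq(wq);    /* undo the creation before bailing out */
        return later_rc;
    }
    *out = wq;
    return 0;
}
```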
William Cohen [Fri, 15 Dec 2023 02:23:34 +0000 (21:23 -0500)]
Adjust DEBUG_TRANS enabled code to work with current kernels and compilers
The last element in _stp_command_names is in
_stp_command_name[STP_MAX_CMD]. Thus, the array has STP_MAX_CMD+1
elements rather than STP_MAX_CMD elements. The compiler would flag
the access beyond the end of the array due to the incorrectly sized
array.
The min operation provided by the kernel's minmax.h include does type
checking. Needed to type cast STP_MAX_CMD to match the other argument
and avoid the compiler flagging the type mismatch as an error.
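Both points can be illustrated with a small sketch. DEMO_MAX_CMD and the name table are hypothetical, and the clamp is written out by hand instead of using the kernel's min() macro.

```c
enum { DEMO_MAX_CMD = 3 };   /* hypothetical highest command number */

/* Index DEMO_MAX_CMD must be valid, so the array needs MAX+1 slots. */
static const char *demo_command_name[DEMO_MAX_CMD + 1] = {
    "cmd0", "cmd1", "cmd2", "cmd3"
};

static const char *demo_name_for(unsigned int cmd)
{
    /* Cast so that both operands of the comparison share one type. */
    unsigned int max = (unsigned int)DEMO_MAX_CMD;
    unsigned int idx = cmd < max ? cmd : max;

    return demo_command_name[idx];
}
```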
William Cohen [Mon, 11 Dec 2023 22:03:18 +0000 (17:03 -0500)]
PR31039: Make autoconf-utrace-via-tracepoints.c work with struct folio
Newer linux kernels are using struct folio in place of struct page.
Kernel git commit 8c9ae56dc73b5ae48 replaces the struct page argument
in should_numa_migrate_memory() with a struct folio. To make the
autoconf-utrace-via-tracepoints.c test compile properly needed to add
an include to pull in the struct folio definition. This allows the
auto_path.exp and many other tests using userspace probes to function
properly on the newer Linux 6.7 kernels.
William Cohen [Wed, 6 Dec 2023 21:36:42 +0000 (16:36 -0500)]
PR27803: Do not remove doc/SystemTap_Tapset_Reference/tapsets.pdf
The "make clean" would remove tapsets.pdf and the "make" does
not rebuild it. tapsets.pdf is built using some other scripts
and is checked into the SystemTap git repository. Changing
the Makefile.am and Makefile.in to avoid removing tapsets.pdf.
Martin Cermak [Wed, 6 Dec 2023 08:49:43 +0000 (09:49 +0100)]
testsuite: Use --skip-badvars with syscall tests
Without it, the "big" test module, having syscall.* in it, might
not compile at all. In such case none of the syscall tests can run.
This way we can at least run the tests and possibly KFAIL the test
results where needed.
William Cohen [Tue, 5 Dec 2023 14:55:30 +0000 (09:55 -0500)]
Support kretprobe ABI change in Linux 5.11 kernels
Linux git commit d741bf41d7 changed how to access information
associated with a kretprobe instance. Code should use the
get_kretprobe function to get that information. For older kernels
just define an equivalent get_kretprobe define.
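A compatibility define of this kind might look like the sketch below. The struct definitions are mocks that only mimic the shape of the kernel types, DEMO_HAVE_GET_KRETPROBE is a hypothetical feature macro, and the assumption is that older kernels expose the owning kretprobe through an rp member.

```c
/* Mock types; only the pointer from instance to probe matters here. */
struct kretprobe { const char *symbol; };
struct kretprobe_instance { struct kretprobe *rp; };

/* On kernels without get_kretprobe(), fall back to the direct member. */
#ifndef DEMO_HAVE_GET_KRETPROBE
#define get_kretprobe(ri) ((ri)->rp)
#endif

/* Callers use get_kretprobe() everywhere, regardless of kernel version. */
static const char *demo_probe_symbol(struct kretprobe_instance *ri)
{
    return get_kretprobe(ri)->symbol;
}
```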
William Cohen [Mon, 4 Dec 2023 16:28:10 +0000 (11:28 -0500)]
PR31074: Ensure that the set_kernel_string* functions limit their writes
Both the set_kernel_string and set_kernel_string_n function use the
underlying _stp_store_deref_string_ function to write strings. There
were two issues with this function:
1) wrote MAXSTRINGLEN bytes even if string was shorter
2) null write at end could spill past end of buffer
The first issue was addressed by stopping to write once a null
character is encountered. The second issue is a side effect of C
implicit promotion of character constants to ints and was addressed by
explicitly casting the character constants as a char.
The pr31074.exp test was added to verify that the write lengths are
limited to string length and the null write does not go beyond the end
of the buffer.
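The two fixes can be sketched as a bounded copy in plain C. demo_store_string is a hypothetical helper that writes to ordinary memory; the real _stp_store_deref_string_ writes through kernel store primitives, which is where the width of the terminating store matters.

```c
#include <stddef.h>

/* Copy src into dst, writing at most maxlen bytes including the
 * terminator and stopping at the first NUL in src.
 * Returns the number of bytes written. */
static size_t demo_store_string(char *dst, const char *src, size_t maxlen)
{
    size_t i;

    if (maxlen == 0)
        return 0;
    for (i = 0; i < maxlen - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    /* A character constant has type int in C; the cast makes the
     * intended one-byte store explicit. */
    dst[i] = (char)'\0';
    return i + 1;
}
```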
Victor Kamensky [Mon, 4 Dec 2023 03:38:39 +0000 (19:38 -0800)]
Makefile.am: remove runtime/linux/uprobes and runtime/linux/uprobes2 install
"PR30434 continuation: Removed old uprobes, uprobes2 implementation,
uprobes-inc.h & any mentions of CONFIG_UTRACE." commit removed uprobes,
and uprobes2 sources and directories, but Makefile.am still tries to
install them. In fact after failing to 'cd' into runtime/linux/uprobes
directory it copies top level *.[ch] files into
${prefix}/share/systemtap/runtime/linux/uprobes directory.
The issue was caught by OpenEmbedded project do_package_qa checks.
Signed-off-by: Victor Kamensky <victor.kamensky7@gmail.com>
Frank Ch. Eigler [Sun, 12 Nov 2023 12:46:51 +0000 (07:46 -0500)]
systemtap.spec: always build systemtap-jupyter
The build/packaging part of that subsystem does not actually depend on
python or anything. The "make install" configury is entirely unconditional.
So the RPM should also unconditionally package the %files up. (They won't
-run- without python, but that's a runtime dependency issue, not a build-time
one.)
Martin Cermak [Wed, 29 Nov 2023 15:41:51 +0000 (16:41 +0100)]
testsuite: Allow for caching the syscall test module
The systemtap module needed for the syscall test is big, and building
it takes time. In case one tries to fix a syscall test, keeping the
tapset intact, it's handy to cache the module to save time.
However, using the cached module may lead to confusing results in case
it's not used right. To lower the risk, the testcase prints a bold warn
message if the feature is in use.
Here's an example usage:
make installcheck RUNTESTFLAGS=syscall.exp CHECK_ONLY=futex REUSE_MODULE=1
William Cohen [Thu, 9 Nov 2023 14:44:40 +0000 (09:44 -0500)]
Fix map_hash.exp and map_wrap.exp tests
At one time systemtap output hexadecimal numbers for dumping
the associative arrays. It is currently using decimal numbers.
The map_hash.exp and map_wrap.exp tests needed to be adjusted
to recognize the current systemtap output.
PR31054: Non-system-wide perf event probes should not use work queues for registration
The non-system-wide task-finder-based perf event probes do not have the
limitation of always requiring CAP_SYS_ADMIN like the system-wide ones.
So avoiding the work queue thing for this case is a nice-to-have optimization.
Getting rid of work queue in this case also makes it easier if we want to
support resolving task executable file paths in another mount namespace
in the future (otherwise the current task context would be kworker
instead of stapio/staprun).
PR31053: memory allocator might sleep in atomic contexts
Previously, STP_ALLOC_FLAGS used in the stap kernel runtime's
memory allocator actually included __GFP_IO and __GFP_FS flags which might
sleep. All modern kernels would trigger this bug.
STP_ALLOC_FLAGS is supposed to be used "anywhere", including atomic
contexts.
debug: added new MTAG macro so that we can know where the leaked allocation comes from.
It is still required to add MTAG invocations to the relevant code. I
prefer not adding them by default to the code base since it gets messy
very quickly.
One sample error line is like this:
ERROR: Memory ffff88811e63b320 len=72 tag=runtime/linux/uprobes-inode.c:556
allocation type: kmalloc. Not freed
In this example, we know the leak comes from the code after the line 556
of the source file uprobes-inode.c and also before the next MTAG
invocation.
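One way such tagging could work is sketched below. This is not the systemtap MTAG implementation; DEMO_MTAG and the demo_* names are hypothetical. Each allocation records the file and line of the most recent tag, which is why a reported tag only bounds the leak to the code between two invocations.

```c
#include <stdlib.h>

/* Most recent tag seen; allocations made afterwards inherit it. */
static const char *demo_tag_file = "untagged";
static int demo_tag_line;

#define DEMO_MTAG() (demo_tag_file = __FILE__, demo_tag_line = __LINE__)

/* Bookkeeping record stored in front of every allocation. */
struct demo_alloc { const char *file; int line; size_t len; };

static void *demo_tagged_alloc(size_t len)
{
    struct demo_alloc *rec = malloc(sizeof(*rec) + len);

    if (rec == NULL)
        return NULL;
    rec->file = demo_tag_file;
    rec->line = demo_tag_line;
    rec->len = len;
    return rec + 1;     /* caller sees only the payload */
}

/* Recover the record for a payload pointer, e.g. when reporting a leak. */
static const struct demo_alloc *demo_alloc_info(const void *p)
{
    return (const struct demo_alloc *)p - 1;
}
```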
The build process (including the testsuite) should not attempt to
modify $srcdir - it might be read-only. Instead, those little files
are simply put into the git repo. Also, switched to a dejagnu library
routine for the target file compilation, with more consistent
diagnostics.
Rework invocation mechanism for the workload test binaries some more.
These are now run from within the systemtap script directly, instead
of funky nested expect { } blocks inside tcl.