sourceware.org Git - systemtap.git/log
systemtap.git
7 months agoPR31235: configury
Tim Haines [Fri, 12 Jan 2024 17:07:38 +0000 (12:07 -0500)]
PR31235: configury

Correct nesting of HAVE_DYNINST vs. HAVE_NSS in Makefile.am.

7 months agoPR31215: @__compat_task misbehaves
Martin Cermak [Tue, 9 Jan 2024 08:57:43 +0000 (09:57 +0100)]
PR31215: @__compat_task misbehaves

Fix detection of probing a 32-bit userspace binary on x86_64.

7 months agotestsuite: Fix syscall test runtime problems
Martin Cermak [Mon, 8 Jan 2024 22:12:42 +0000 (23:12 +0100)]
testsuite:  Fix syscall test runtime problems

This update mostly addresses SEGVs in syscall tests
compiled as 32-bit binaries on x86_64.  The 32-bit syscall
wrappers typically convert one structure to another by
dereferencing individual struct members.  This is a problem
when the pointer to said struct is invalid (-1).

7 months agotestsuite: Drop ancient rhel6-era nfsservctl.c syscall test
Martin Cermak [Fri, 5 Jan 2024 16:39:00 +0000 (17:39 +0100)]
testsuite:  Drop ancient rhel6-era nfsservctl.c syscall test

7 months agoRefresh the syscall number tables
Martin Cermak [Fri, 5 Jan 2024 15:28:57 +0000 (16:28 +0100)]
Refresh the syscall number tables

Refresh the syscall number tables using scripts/dump-syscalls.sh
with strace commit 9c5b28d0dc17361425f8d63290f31722507435b0.

7 months agoPR29076: syscall test fixes for .rodata on x86_64 for syncfs.c
William Cohen [Thu, 4 Jan 2024 21:10:02 +0000 (16:10 -0500)]
PR29076: syscall test fixes for .rodata on x86_64 for  syncfs.c

7 months agotapset: Fix _struct_timeval_u()
Martin Cermak [Thu, 4 Jan 2024 18:17:28 +0000 (19:17 +0100)]
tapset: Fix _struct_timeval_u()

The utimes, select and settimeofday syscalls switched from using
struct timeval over to struct __kernel_old_timeval.  This changed
in kernel-5.5-rc1.

7 months agotestsuite: Avoid SEGv in systemtap.syscall/select.c
Martin Cermak [Thu, 4 Jan 2024 15:37:07 +0000 (16:37 +0100)]
testsuite: Avoid SEGv in systemtap.syscall/select.c

7 months agotestsuite: Remove parallel testsuite runs relicts
Martin Cermak [Thu, 4 Jan 2024 12:27:31 +0000 (13:27 +0100)]
testsuite:  Remove parallel testsuite runs relicts

Continue the cleanup started in commit 12c28db32718.
The _init and _finish functions are called
from dejagnu/runtest.exp if they are present.

7 months agobpf-translate.cxx: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed...
Sergei Trofimovich [Fri, 22 Dec 2023 19:42:38 +0000 (19:42 +0000)]
bpf-translate.cxx: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed-args`)

`gcc-14` recently added a new `-Wcalloc-transposed-args` warning. It
detected a minor infelicity in `calloc()` API usage in `systemtap`:

    bpf-translate.cxx: In function 'bpf::BPF_Section* bpf::output_probe(BPF_Output&, program&, const std::string&, unsigned int)':
    bpf-translate.cxx:5044:39: error: 'void* calloc(size_t, size_t)' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
     5044 |   bpf_insn *buf = (bpf_insn*) calloc (sizeof(bpf_insn), ninsns);
          |                                       ^~~~~~~~~~~~~~~~
    bpf-translate.cxx:5044:39: note: earlier argument should specify number of elements, later size of each element
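
The argument order the warning asks for can be sketched as follows. This
is a minimal illustration in plain C, not the project's actual code;
`struct insn` and `alloc_insns` are hypothetical stand-ins.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an instruction record. */
struct insn {
    unsigned char code;
    int imm;
};

/* calloc(nmemb, size): element count first, element size second.
 * Returns a zero-initialized array, or NULL on allocation failure. */
struct insn *alloc_insns(size_t ninsns)
{
    return calloc(ninsns, sizeof(struct insn));
}
```

Both orders allocate the same number of bytes; the conventional order
only makes the intent clear to readers and to the compiler.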

7 months agotestsuite listing_mode.exp: canonicalize subtest names
Frank Ch. Eigler [Sat, 23 Dec 2023 20:12:14 +0000 (15:12 -0500)]
testsuite listing_mode.exp: canonicalize subtest names

Replace hexadecimal literals with HEXADDR.

7 months agoMakefile.am: fix build with --with-debuginfod=/path configure option
Victor Kamensky [Mon, 18 Dec 2023 05:01:35 +0000 (21:01 -0800)]
Makefile.am: fix build with --with-debuginfod=/path configure option

While testing my previous fix for the libdebuginfod auto-detection
failure, I noticed that the configure option --with-debuginfod=/path does
not work if the system does not have elfutils-debuginfod-client-devel.

I had an external elfutils branch with the debuginfod metadata change
installed at /path, and when I tried to build SystemTap with it, the build
failed in multiple places. My system is FC38.

It boils down to a couple of issues that apply to several Makefile.am files.

1. util.cxx is a C++ file, so debuginfod_CFLAGS should be added to _CXXFLAGS
flags, rather than just _CFLAGS

2. debuginfod_LDFLAGS should be added to _LDFLAGS, otherwise link command
does not get proper -L flag

Signed-off-by: Victor Kamensky <victor.kamensky7@gmail.com>
7 months agoconfigure.ac: fix broken libdebuginfod library auto detection
Victor Kamensky [Mon, 18 Dec 2023 05:01:34 +0000 (21:01 -0800)]
configure.ac: fix broken libdebuginfod library auto detection

After commit 2e67b053e3796ee7cf29a39f9698729b52078406 "configury: rework debuginfod searches",
libdebuginfod.so library auto detection is broken. It was reported by Martin Jansa
on the openembedded-core mailing list [1].

Currently configure.ac does "AC_DEFINE([HAVE_LIBDEBUGINFOD], [1] ..." as long as
no --without-debuginfod option is passed, regardless of the PKG_CHECK_MODULES check result.
It seems to be a bad copy/paste. Address the issue by moving the AC_DEFINE back to
the PKG_CHECK_MODULES action-if-found block.

To reproduce the issue on an FC system, run
"sudo dnf remove elfutils-debuginfod-client-devel" and then try to build SystemTap.
util.cxx will fail to compile because elfutils/debuginfod.h is missing: config.h
will have "#define HAVE_LIBDEBUGINFOD 1", while config.log and the configure
output indicate that the check for the libdebuginfod library failed.

[1] https://lists.openembedded.org/g/openembedded-core/message/192109?p=%2C%2C%2C20%2C0%2C0%2C0%3A%3Acreated%2C0%2Csystemtap%2C20%2C2%2C0%2C102987514

Signed-off-by: Victor Kamensky <victor.kamensky7@gmail.com>
7 months agostaprun: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed-args`)
Sergei Trofimovich [Thu, 21 Dec 2023 10:00:06 +0000 (10:00 +0000)]
staprun: fix build against upcoming `gcc-14` (`-Werror=calloc-transposed-args`)

`gcc-14` recently added a new `-Wcalloc-transposed-args` warning. It
detected a minor infelicity in `calloc()` API usage in `systemtap`:

    staprun.c: In function 'main':
    staprun.c:550:50: error: 'calloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
      550 |                 char ** new_argv = calloc(sizeof(char *),argc+2);
          |                                                  ^~~~

7 months agoPR31060: Dynamically switch mount namespaces for the target.
Yichun Zhang (agentzh) [Mon, 18 Dec 2023 01:45:02 +0000 (17:45 -0800)]
PR31060: Dynamically switch mount namespaces for the target.

When --target-namespace=PID is specified, the early uprobe registration
and target program file path resolving in task finder should switch the
current mount namespace to the target program's.

Otherwise we'd see errors like this:

    ERROR: Couldn't resolve target program file path '/newtarget/a.out': -2

We switch mount namespaces in kernel space around kern_path() calls
because the kernel module also fiddles with files like relayfs in the
original mount namespace. The granularity of calling setns() from userland is
just too coarse.

For older kernels like those on CentOS/RHEL 7, Ubuntu 16.04 and Debian 9,
we use a hack to call the setns() syscall directly from within the
kernel space. Newer kernels have a good enough kernel C API to emulate
setns() in the kernel space.

7 months agoPR31180: feature: added new probepoint process.data(ADDR).* for userland hardware...
Yichun Zhang (agentzh) [Sun, 17 Dec 2023 06:21:27 +0000 (22:21 -0800)]
PR31180: feature: added new probepoint process.data(ADDR).* for userland hardware watchpoints.

7 months agoRefactored some of the code for kernel.data(*).* probepoints to the stap runtime.
Yichun Zhang (agentzh) [Sun, 17 Dec 2023 04:49:46 +0000 (20:49 -0800)]
Refactored some of the code for kernel.data(*).* probepoints to the stap runtime.

This makes development easier by avoiding updating a lot of code in
the translator all the time.

This also makes the upcoming process.data(*) probepoints easier to
implement.

Additionally, we made it abort when the first hw breakpoint fails
to register instead of going on registering the remaining hw
breakpoints. Added tests to cover this change.

7 months agochange: use bounded loops for spin_trylock() in the NMI context.
Yichun Zhang (agentzh) [Mon, 18 Dec 2023 07:06:25 +0000 (23:06 -0800)]
change: use bounded loops for spin_trylock() in the NMI context.

7 months agoPR31176: fix Spin lock deadlocks in memory pool allocations for mixed NMI and non...
Yichun Zhang (agentzh) [Sun, 17 Dec 2023 01:35:03 +0000 (17:35 -0800)]
PR31176: fix Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts

The kernel's lockdep finds deadlocks in the stap memory pool allocator's
spinlocks when mixing NMI and non-NMI contexts.

Now we use trylock in the NMI context to avoid waiting forever on a lock
held by a non-NMI context (which is interrupted by NMI).

This bug can be reproduced by running TEST 4 in
testsuite/systemtap.base/kernel-hw-breakpoint-addr.exp on a lockdep
kernel that is recent enough (like 5.11.22).
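
The idea of a bounded trylock loop can be sketched in userspace with C11
atomics. This is only an analogy under stated assumptions: `toy_lock`,
`toy_lock_bounded` and the retry count are hypothetical, and the runtime
itself uses the kernel's spinlock API rather than anything shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a spinlock. */
typedef struct {
    atomic_flag held;
} toy_lock;

#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One non-blocking attempt; true means the caller now owns the lock. */
bool toy_trylock(toy_lock *l)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void toy_unlock(toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Try at most max_tries times, then give up instead of spinning forever.
 * A caller that interrupted the lock holder can never see the lock
 * released, so it must be able to fail and back out. */
bool toy_lock_bounded(toy_lock *l, unsigned max_tries)
{
    for (unsigned i = 0; i < max_tries; i++) {
        if (toy_trylock(l))
            return true;
    }
    return false;
}
```

The caller has to handle the `false` case, for example by failing the
allocation, which is the price of never waiting on a lock whose holder
cannot make progress.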

7 months agotests: updated the test cases in kernel-hw-breakpoint-addr for ubuntu and slower...
Yichun Zhang (agentzh) [Mon, 18 Dec 2023 00:47:13 +0000 (16:47 -0800)]
tests: updated the test cases in kernel-hw-breakpoint-addr for ubuntu and slower machines.

7 months agoPR30831: Improve error handling of systemtap workqueue
William Cohen [Mon, 18 Dec 2023 01:40:56 +0000 (20:40 -0500)]
PR30831: Improve error handling of systemtap workqueue

During the startup of systemtap instrumentation a workqueue is
created.  It is possible that the kernel is unable to allocate space
for the workqueue. The initialization for this case should return
-ENOMEM to end the loading of the instrumentation module.  There are
other ways that the systemtap instrumentation may fail during
initialization after the creation of the workqueue.  In those cases
the workqueue needs to be destroyed to avoid leaking resources.
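
The two failure paths described above can be sketched as a userspace
analogy. All names here (`toy_wq`, `toy_module_init`, `later_step_rc`)
are hypothetical; the real code uses the kernel workqueue API.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the workqueue object. */
struct toy_wq { int unused; };

static struct toy_wq *toy_wq;

/* later_step_rc simulates the result of a later initialization step. */
int toy_module_init(int later_step_rc)
{
    int rc;

    toy_wq = malloc(sizeof(*toy_wq));
    if (toy_wq == NULL)
        return -ENOMEM;        /* creation failed: nothing to undo */

    rc = later_step_rc;
    if (rc != 0)
        goto err_destroy_wq;   /* later failure: undo the creation */

    return 0;

err_destroy_wq:
    free(toy_wq);
    toy_wq = NULL;
    return rc;
}

/* Normal teardown releases the same resource. */
void toy_module_exit(void)
{
    free(toy_wq);
    toy_wq = NULL;
}

int toy_wq_exists(void)
{
    return toy_wq != NULL;
}
```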

7 months agoAdd more tests to cover the kernel.data(ADDR) probepoints much better.
Yichun Zhang (agentzh) [Sat, 16 Dec 2023 19:17:47 +0000 (11:17 -0800)]
Add more tests to cover the kernel.data(ADDR) probepoints much better.

Also fixed the symbol extraction regex (against nm outputs) in the
test scaffold.

8 months agoAdjust DEBUG_TRANS enabled code to work with current kernels and compilers
William Cohen [Fri, 15 Dec 2023 02:23:34 +0000 (21:23 -0500)]
Adjust DEBUG_TRANS enabled code to work with current kernels and compilers

The last element in _stp_command_names is in
_stp_command_name[STP_MAX_CMD].  Thus, the array has STP_MAX_CMD+1
elements rather than STP_MAX_CMD elements.  The compiler would flag
the access beyond the end of the array due to the incorrectly sized
array.

The min operation provided by the kernel's minmax.h include does type
checking.  STP_MAX_CMD needed a type cast to match the other argument
and avoid the compiler flagging the type mismatch as an error.
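
Both points can be illustrated with a small sketch. The names
(`TOY_MAX_CMD`, `toy_command_name`, `toy_name`) are hypothetical and the
clamp is written out by hand instead of using the kernel's min().

```c
/* Hypothetical command numbers; TOY_MAX_CMD is the last valid index. */
enum { TOY_START = 0, TOY_EXIT = 1, TOY_MAX_CMD = 2 };

/* Indices 0..TOY_MAX_CMD inclusive need TOY_MAX_CMD + 1 slots. */
static const char *const toy_command_name[TOY_MAX_CMD + 1] = {
    [TOY_START]   = "START",
    [TOY_EXIT]    = "EXIT",
    [TOY_MAX_CMD] = "MAX_CMD",
};

/* Clamp out-of-range numbers to the last entry. The cast gives both
 * sides of the comparison the same type, which is the kind of mismatch
 * a type-checking min() rejects. */
const char *toy_name(unsigned cmd)
{
    unsigned last = (unsigned)TOY_MAX_CMD;
    return toy_command_name[cmd < last ? cmd : last];
}
```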

8 months agoPR31039: Make autoconf-utrace-via-tracepoints.c work with struct folio
William Cohen [Mon, 11 Dec 2023 22:03:18 +0000 (17:03 -0500)]
PR31039: Make autoconf-utrace-via-tracepoints.c work with struct folio

Newer Linux kernels use struct folio in place of struct page.
Kernel git commit 8c9ae56dc73b5ae48 replaces the struct page argument
in should_numa_migrate_memory() with a struct folio.  To make the
autoconf-utrace-via-tracepoints.c test compile properly, an include
was added to pull in the struct folio definition.  This allows
auto_path.exp and many other tests using userspace probes to function
properly on the newer Linux 6.7 kernels.

8 months agoPR31119: translator: sign-extension might yield wrong results in string escaping...
Yichun Zhang (agentzh) [Fri, 8 Dec 2023 21:35:43 +0000 (13:35 -0800)]
PR31119: translator: sign-extension might yield wrong results in string escaping code.

C++ sign extension might lead to wrong results in the current escaping
code:

`(unsigned)str[i]` is equivalent to `(unsigned)(int)str[i]`.

Thanks Junlong Li for the original patch.
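
A minimal sketch of the pitfall, written in C (the same conversion
rules apply in C++). It uses `signed char` explicitly so the behaviour
does not depend on whether plain `char` is signed on the platform; the
function names are hypothetical.

```c
/* Widening a negative signed char sign-extends before the conversion
 * to unsigned, so the high bits end up set. */
unsigned widen_sign_extended(signed char c)
{
    return (unsigned)c;   /* same as (unsigned)(int)c */
}

/* Going through unsigned char first keeps only the byte value. */
unsigned widen_byte(signed char c)
{
    return (unsigned)(unsigned char)c;
}
```

For a byte such as 0xE9 stored in a signed char, the first function
yields a large value with the upper bits set, while the second yields
0xE9, which is what code that escapes individual bytes wants.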

8 months agoPR27803: Do not remove doc/SystemTap_Tapset_Reference/tapsets.pdf
William Cohen [Wed, 6 Dec 2023 21:36:42 +0000 (16:36 -0500)]
PR27803: Do not remove doc/SystemTap_Tapset_Reference/tapsets.pdf

The "make clean" would remove tapsets.pdf and the "make" does
not rebuild it.  tapsets.pdf is built using some other scripts
and is checked into the SystemTap git repository.  Change
Makefile.am and Makefile.in to avoid removing tapsets.pdf.

8 months agotestsuite: Use --skip-badvars with syscall tests (cont'd)
Martin Cermak [Wed, 6 Dec 2023 11:45:30 +0000 (12:45 +0100)]
testsuite:  Use --skip-badvars with syscall tests (cont'd)

8 months agotestsuite: Use --skip-badvars with syscall tests
Martin Cermak [Wed, 6 Dec 2023 08:49:43 +0000 (09:49 +0100)]
testsuite:  Use --skip-badvars with syscall tests

Without it, the "big" test module, having syscall.* in it, might
not compile at all. In such case none of the syscall tests can run.
This way we can at least run the tests and possibly KFAIL the test
results where needed.

8 months agoSupport kretprobe ABI change in Linux 5.11 kernels
William Cohen [Tue, 5 Dec 2023 14:55:30 +0000 (09:55 -0500)]
Support kretprobe ABI change in Linux 5.11 kernels

Linux git commit d741bf41d7 changed how information
associated with a kretprobe instance is accessed.  Code should use the
get_kretprobe function to get that information.  For older kernels,
just define an equivalent get_kretprobe macro.

8 months agoPR31074: Ensure that the set_kernel_string* functions limit their writes
William Cohen [Mon, 4 Dec 2023 16:28:10 +0000 (11:28 -0500)]
PR31074: Ensure that the set_kernel_string* functions limit their writes

Both the set_kernel_string and set_kernel_string_n functions use the
underlying _stp_store_deref_string_ function to write strings.  There
were two issues with this function:

 1) wrote MAXSTRINGLEN bytes even if string was shorter
 2) null write at end could spill past end of buffer

The first issue was addressed by stopping the write once a null
character is encountered.  The second issue is a side effect of C's
implicit promotion of character constants to ints and was addressed by
explicitly casting the character constants to char.

The pr31074.exp test was added to verify that the write length is
limited to the string length and that the null write does not go beyond
the end of the buffer.
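
The intended behaviour can be sketched in ordinary userspace C. This is
not the runtime's `_stp_store_deref_string_` implementation;
`toy_store_string` and its return convention are assumptions made for
illustration.

```c
#include <stddef.h>

/* Copy src into dst, whose capacity is dstlen bytes including the
 * terminator. Stops at the first NUL in src and never writes past
 * dst[dstlen - 1]. Returns the number of bytes written, terminator
 * included. */
size_t toy_store_string(char *dst, size_t dstlen, const char *src)
{
    size_t i;

    if (dstlen == 0)
        return 0;

    for (i = 0; i + 1 < dstlen && src[i] != '\0'; i++)
        dst[i] = src[i];

    dst[i] = (char)'\0';   /* a one-byte store, not an int-sized one */
    return i + 1;
}
```

In C a character constant such as '\0' has type int, so a store macro
that sizes its write from the value's type can write more than one byte
unless the constant is cast to char first.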

8 months agoMakefile.am: remove runtime/linux/uprobes and runtime/linux/uprobes2 install
Victor Kamensky [Mon, 4 Dec 2023 03:38:39 +0000 (19:38 -0800)]
Makefile.am: remove runtime/linux/uprobes and runtime/linux/uprobes2 install

"PR30434 continuation:  Removed old uprobes, uprobes2 implementation,
uprobes-inc.h & any mentions of CONFIG_UTRACE." commit removed uprobes,
and uprobes2 sources and directories, but Makefile.am still tries to
install them. In fact after failing to 'cd' into runtime/linux/uprobes
directory it copies top level *.[ch] files into
${prefix}/share/systemtap/runtime/linux/uprobes directory.

The issue was caught by OpenEmbedded project do_package_qa checks.

Signed-off-by: Victor Kamensky <victor.kamensky7@gmail.com>
8 months agosystemtap.spec: always build systemtap-jupyter
Frank Ch. Eigler [Sun, 12 Nov 2023 12:46:51 +0000 (07:46 -0500)]
systemtap.spec: always build systemtap-jupyter

The build/packaging part of that subsystem does not actually depend on
python or anything.  The "make install" configury is entirely unconditional.
So the RPM should also unconditionally package the %files up.  (They won't
-run- without python, but that's a runtime dependency issue, not a build-time
one.)

8 months agotestsuite: Allow for caching the syscall test module
Martin Cermak [Wed, 29 Nov 2023 15:41:51 +0000 (16:41 +0100)]
testsuite: Allow for caching the syscall test module

The systemtap module needed for the syscall test is big, and building
it takes time.  In case one tries to fix a syscall test, keeping the
tapset intact, it's handy to cache the module to save time.

However, using the cached module may lead to confusing results if
it's not used correctly.  To lower the risk, the testcase prints a bold
warning message if the feature is in use.

Here's an example usage:
make installcheck RUNTESTFLAGS=syscall.exp CHECK_ONLY=futex REUSE_MODULE=1

8 months agoTestsuite: syscall/clock.c: RHEL8 improvements
Martin Cermak [Mon, 27 Nov 2023 16:22:09 +0000 (17:22 +0100)]
Testsuite: syscall/clock.c: RHEL8 improvements

8 months agopost-release 5.1 version bump
Frank Ch. Eigler [Tue, 21 Nov 2023 23:57:44 +0000 (18:57 -0500)]
post-release 5.1 version bump

8 months agoTweak testsuite/semok/target_addr.stp to work with linux 5.14 and newer.
William Cohen [Tue, 21 Nov 2023 23:56:42 +0000 (18:56 -0500)]
Tweak testsuite/semok/target_addr.stp to work with linux 5.14 and newer.

9 months agoPR29076: syscall test fixes for .rodata on x86_64 for pwritev.c and sysfs.c
William Cohen [Wed, 15 Nov 2023 22:01:14 +0000 (17:01 -0500)]
PR29076: syscall test fixes for .rodata on x86_64 for pwritev.c and sysfs.c

9 months agoPR29076: Additional syscall test fixes for .rodata on x86_64
William Cohen [Wed, 15 Nov 2023 20:40:07 +0000 (15:40 -0500)]
PR29076: Additional syscall test fixes for .rodata on x86_64

There were a number of additional syscall tests that needed fixes
similar to git commit 6577f2237.

9 months agoFix map_hash.exp and map_wrap.exp tests
William Cohen [Thu, 9 Nov 2023 14:44:40 +0000 (09:44 -0500)]
Fix map_hash.exp and map_wrap.exp tests

At one time systemtap output hexadecimal numbers when dumping
associative arrays.  It currently uses decimal numbers.
The map_hash.exp and map_wrap.exp tests needed to be adjusted
to recognize the current systemtap output.

9 months agobugfix: older gcc versions do not have the -Wno-infinite-recursion option.
Yichun Zhang (agentzh) [Mon, 13 Nov 2023 22:17:47 +0000 (14:17 -0800)]
bugfix: older gcc versions do not have the -Wno-infinite-recursion option.

This affects older gcc versions used to compile the kernel.

9 months agoPR31054: Non-system-wide perf event probes should not use work queues for registration
Yichun Zhang (agentzh) [Sat, 11 Nov 2023 01:46:41 +0000 (17:46 -0800)]
PR31054: Non-system-wide perf event probes should not use work queues for registration

The non-system-wide task-finder-based perf event probes do not have the
limitation of always requiring CAP_SYS_ADMIN like the system-wide ones.
So avoiding the work queue for this case is a nice-to-have optimization.

Getting rid of the work queue in this case also makes it easier if we want to
support resolving task executable file paths in another mount namespace
in the future (otherwise the current task context would be kworker
instead of stapio/staprun).

9 months agofeature: stap -t now outputs module init timing reports.
Yichun Zhang (agentzh) [Sun, 12 Nov 2023 05:32:29 +0000 (21:32 -0800)]
feature: stap -t now outputs module init timing reports.

9 months agotranslator: fixed output code indentation regarding stap -t.
Yichun Zhang (agentzh) [Sun, 12 Nov 2023 04:16:12 +0000 (20:16 -0800)]
translator: fixed output code indentation regarding stap -t.

9 months agoPR31052: -DDEBUG_MEMALLOC_MIGHT_SLEEP resulted in lots of kernel errors
Yichun Zhang (agentzh) [Sat, 11 Nov 2023 07:56:13 +0000 (23:56 -0800)]
PR31052: -DDEBUG_MEMALLOC_MIGHT_SLEEP resulted in lots of kernel errors

We should really enable `might_sleep()` only when the gfp_mask might sleep.

9 months agoPR31053: memory allocator might sleep in atomic contexts
Yichun Zhang (agentzh) [Sat, 11 Nov 2023 07:25:54 +0000 (23:25 -0800)]
PR31053: memory allocator might sleep in atomic contexts

Previously, STP_ALLOC_FLAGS used in the stap kernel runtime's
memory allocator actually included __GFP_IO and __GFP_FS flags which might
sleep. All modern kernels would trigger this bug.

STP_ALLOC_FLAGS is supposed to be used "anywhere", including atomic
contexts.

9 months agoPR31051: memory and uprobe leaks in early uprobe registration code when errors happen
Yichun Zhang (agentzh) [Sat, 11 Nov 2023 05:51:56 +0000 (21:51 -0800)]
PR31051: memory and uprobe leaks in early uprobe registration code when errors happen

9 months agodebug: added new MTAG macro so that we can know where the leaked allocation comes...
Yichun Zhang (agentzh) [Sat, 11 Nov 2023 05:35:39 +0000 (21:35 -0800)]
debug: added new MTAG macro so that we can know where the leaked allocation comes from.

It is still required to add MTAG invocations to the relevant code. I
prefer not to add them to the code base by default, since it gets messy
very quickly.

One sample error line is like this:

    ERROR: Memory ffff88811e63b320 len=72 tag=runtime/linux/uprobes-inode.c:556
        allocation type: kmalloc. Not freed

In this example, we know the leak comes from the code after the line 556
of the source file uprobes-inode.c and also before the next MTAG
invocation.
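
One way such a tag macro could work is sketched below. This is a guess
at the general technique (stringifying `__FILE__` and `__LINE__` into a
"current tag" that later allocations record), not the runtime's actual
MTAG definition; every `toy_` and `TOY_` name is hypothetical.

```c
#include <stddef.h>

/* Two-level expansion so that __LINE__ is expanded before stringifying. */
#define TOY_STR2(x) #x
#define TOY_STR(x)  TOY_STR2(x)

/* Hypothetical: the most recent tag, picked up by later allocations. */
static const char *toy_current_tag = "untagged";

/* Record "file:line" of the invocation site as the current tag. */
#define TOY_MTAG() (toy_current_tag = __FILE__ ":" TOY_STR(__LINE__))

struct toy_alloc_record {
    size_t len;
    const char *tag;   /* tag in effect when the allocation was made */
};

/* Simulates the bookkeeping done for one allocation of len bytes. */
struct toy_alloc_record toy_record(size_t len)
{
    struct toy_alloc_record r = { len, toy_current_tag };
    return r;
}
```

With this design a leaked record carries the tag of the last invocation
executed before it, which matches the "after line 556 and before the
next invocation" reading of the sample error line.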

9 months agotestsuite: PR30407 cleanup
Frank Ch. Eigler [Wed, 8 Nov 2023 00:55:26 +0000 (19:55 -0500)]
testsuite: PR30407 cleanup

The build process (including the testsuite) should not attempt to
modify $srcdir - it might be read-only.  Instead, those little files
are simply put into the git repo.  Also, switched to a dejagnu library
routine for the target file compilation, with more consistent
diagnostics.

9 months agosystemtap.spec: Correct 5.0 release timestamp
Frank Ch. Eigler [Wed, 8 Nov 2023 00:11:49 +0000 (19:11 -0500)]
systemtap.spec: Correct 5.0 release timestamp

9 months agotestsuite: auto_path.exp
Frank Ch. Eigler [Wed, 8 Nov 2023 00:10:48 +0000 (19:10 -0500)]
testsuite: auto_path.exp

Rework invocation mechanism for the workload test binaries some more.
These are now run from within the systemtap script directly, instead
of funky nested expect { } blocks inside tcl.

9 months agoprerelease datestamp fixes
Frank Ch. Eigler [Sat, 4 Nov 2023 16:19:59 +0000 (12:19 -0400)]
prerelease datestamp fixes

9 months agoFix timing issue causing failures of testcase auto_path.exp
Martin Cermak [Tue, 7 Nov 2023 11:04:16 +0000 (12:04 +0100)]
Fix timing issue causing failures of testcase auto_path.exp

9 months agoPR31028: Handle .callee probes for DWARF5 information [release-5.0a]
William Cohen [Fri, 3 Nov 2023 16:09:19 +0000 (12:09 -0400)]
PR31028: Handle .callee probes for DWARF5 information

DWARF5 stores call site information differently than DWARF4. The
DWARF5 call sites are tagged with DW_TAG_call_site in place of
DWARF4's DW_TAG_GNU_call_site and need to reference the
DW_AT_call_origin rather than DW_AT_abstract_origin.

Without this patch any of the tests in the
systemtap.base/listing_mode.exp testsuite using .callee() would fail
to find the associated probe point.
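
The tag and attribute pairing can be sketched as follows. The numeric
values are those assigned by the DWARF 5 standard and the GNU extension
range, redefined locally so the sketch builds without a DWARF header;
the `toy_` functions are hypothetical and not the translator's code.

```c
#include <stdbool.h>

/* Values as assigned by DWARF 5 and the GNU extensions. */
enum {
    TOY_DW_TAG_call_site      = 0x48,    /* DWARF5 */
    TOY_DW_TAG_GNU_call_site  = 0x4109,  /* GNU extension used with DWARF4 */
    TOY_DW_AT_abstract_origin = 0x31,
    TOY_DW_AT_call_origin     = 0x7f     /* DWARF5 */
};

/* Accept either flavour of call-site entry. */
bool toy_is_call_site(int tag)
{
    return tag == TOY_DW_TAG_call_site || tag == TOY_DW_TAG_GNU_call_site;
}

/* Which attribute names the callee for a given call-site tag. */
int toy_callee_attr(int tag)
{
    return tag == TOY_DW_TAG_call_site ? TOY_DW_AT_call_origin
                                       : TOY_DW_AT_abstract_origin;
}
```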

9 months agoDrop the tapset_functions.exp
Martin Cermak [Fri, 3 Nov 2023 14:44:42 +0000 (15:44 +0100)]
Drop the tapset_functions.exp

Drop the tapset_functions.exp.  It needs constant maintenance of the
blocklist as the systemtap codebase grows.  It's also using a
--compatible hack to access private functions.  After a discussion on
the #systemtap IRC channel the decision was to drop the testcase.

9 months agopre-release AUTHORS bump
Frank Ch. Eigler [Thu, 2 Nov 2023 20:38:28 +0000 (16:38 -0400)]
pre-release AUTHORS bump

9 months agopre-release NEWS refresh
Frank Ch. Eigler [Thu, 2 Nov 2023 20:37:09 +0000 (16:37 -0400)]
pre-release NEWS refresh

9 months agosystemtap.spec: reenable crash
Frank Ch. Eigler [Thu, 2 Nov 2023 20:36:48 +0000 (16:36 -0400)]
systemtap.spec: reenable crash

https://bugzilla.redhat.com/show_bug.cgi?id=2219728 has been solved

9 months agoAvoid conflicting with statx declaration provided by glibc 2.28 and newer.
William Cohen [Wed, 1 Nov 2023 23:58:00 +0000 (19:58 -0400)]
Avoid conflicting with statx declaration provided by glibc 2.28 and newer.

9 months agoPR31014: Uprobes registered in task finder would block the target processes for long...
Yichun Zhang (agentzh) [Mon, 30 Oct 2023 07:46:36 +0000 (00:46 -0700)]
PR31014: Uprobes registered in task finder would block the target processes for long time.

Currently, the stap runtime tries to call uprobe_register() to register new
uprobes inside the context of the target processes via the task finder.
This makes little sense for inode-based uprobes implemented in modern kernels
and also introduces significant (8ms+) delay in all the target processes via
the task work mechanism. Such delays may manifest themselves in syscalls like
epoll_wait inside the target processes.

All the target processes are affected even though only one task finder
callback gets to call uprobe_register(). Other concurrent target processes
would just wait on the lock `c->consumer_lock`.

After talking with fche, I'd propose that we should register uprobes early
before starting the task finder callbacks. This still may block a CPU core
for 8ms+ or so, but it no longer blocks all the target processes in their
own process contexts.

9 months agoPR31024: always compare the real inodes for files in overlay or union fs.
Yichun Zhang (agentzh) [Wed, 1 Nov 2023 03:16:43 +0000 (20:16 -0700)]
PR31024: always compare the real inodes for files in overlay or union fs.

Overlay fs is popular in containerized applications, and currently the stap
runtime does not always use the real inode for the target processes/programs.
This may cause uprobes to be registered on the fake inode instead of the real
one, so the probe handlers never fire.

9 months agoPR31020: runtime: task_finder2: we didn't resolve symlink PATH in "probe process...
Yichun Zhang (agentzh) [Wed, 1 Nov 2023 06:15:30 +0000 (23:15 -0700)]
PR31020: runtime: task_finder2: we didn't resolve symlink PATH in "probe process(PATH).*"

Use of symlinks in PATH might confuse the task finder when matching
procname.

9 months agoTurn off gcc stringop-overflow warnings for syscall.exp tests
William Cohen [Wed, 1 Nov 2023 19:29:58 +0000 (15:29 -0400)]
Turn off gcc stringop-overflow warnings for syscall.exp tests

On RHEL9 and Fedora, gcc warns about string manipulations
that could overflow the destination buffer.  When the compiler emits
these warnings for the C code being compiled, the actual systemtap test does not
run.  Pass -Wno-stringop-overflow to the compiler to turn those
warnings off and allow the tests to run.  This eliminates about 110 of
the untested tests on Fedora Rawhide.

9 months agopre-release sample index regen
Frank Ch. Eigler [Wed, 1 Nov 2023 16:59:32 +0000 (12:59 -0400)]
pre-release sample index regen

9 months agopre-release update-docs
Frank Ch. Eigler [Wed, 1 Nov 2023 16:57:50 +0000 (12:57 -0400)]
pre-release update-docs

9 months agopre-release PRERELEASE marker updates
Frank Ch. Eigler [Wed, 1 Nov 2023 16:26:44 +0000 (12:26 -0400)]
pre-release PRERELEASE marker updates

9 months agopre-release update-po
Frank Ch. Eigler [Wed, 1 Nov 2023 16:24:01 +0000 (12:24 -0400)]
pre-release update-po

9 months agoPR31018: Map operations might get no lock protections due to "pushdown lock" bugs.
Yichun Zhang (agentzh) [Tue, 31 Oct 2023 20:22:48 +0000 (13:22 -0700)]
PR31018: Map operations might get no lock protections due to "pushdown lock" bugs.

The stap translator's "pushdown lock" optimization is buggy in that it may not
emit locking code for all the code branches in the first statement needing a
lock, making subsequent statements actually needing a lock go without a lock.

The correct way is to always emit locking code for statements that might
"push down" the locks, until seeing a simple statement that doesn't
(like an assignment statement using "<<<").

I forgot to add that it might not necessarily be triggered by a leading "if"
statement needing a lock. Other statements that "push down" locks may
also trigger it, like nested code blocks with "if" statements.

9 months agoPR30407: Address context.exp findings
Martin Cermak [Tue, 31 Oct 2023 22:02:28 +0000 (23:02 +0100)]
PR30407: Address context.exp findings

9 months agoFix UNTESTED: retblocklist
William Cohen [Tue, 31 Oct 2023 02:00:12 +0000 (22:00 -0400)]
Fix UNTESTED: retblocklist

The file renaming in git commit d1804e051dd missed a couple files.
The result was that the retblock.exp test could not compile the
associated .c and .stp files for the test.  Moved the needed files to
retblocklist.c and retblocklist.stp and the test now runs.

9 months agoPR31012: kernel module loading might block a CPU for 1.5+ms.
Yichun Zhang (agentzh) [Mon, 30 Oct 2023 19:24:50 +0000 (12:24 -0700)]
PR31012: kernel module loading might block a CPU for 1.5+ms.

Currently, the module init code calls the expensive kernel function
kallsyms_lookup_name() several times, which would block a CPU core for
more than 1.5ms on high-speed machines (Core i9-13900K).

Each kallsyms_lookup_name() might take more than 800us here.

We should add cond_resched() calls to mitigate this blocking effect.

9 months agoPR31013: Use of sleeping _stp_stat_del() operations in atomic contexts for -t.
Yichun Zhang (agentzh) [Mon, 30 Oct 2023 21:18:08 +0000 (14:18 -0700)]
PR31013: Use of sleeping _stp_stat_del() operations in atomic contexts for -t.

When stap's -t option is specified, the stat data needed by the timing stats
are freed in atomic contexts (with preemption disabled), which may cause
kernel deadlocks.

This is because the `_stp_cleanup_and_exit` function in the runtime explicitly
disables preemption for `_stp_printf`. But we should really temporarily
re-enable preemption for sleeping operations like `_stp_stat_del()`.

9 months agofeature: Add new runtime macro STP_FORCE_STDOUT_TTY to override STP_STDOUT_NOT_ATTY.
Yichun Zhang (agentzh) [Mon, 30 Oct 2023 06:13:15 +0000 (23:13 -0700)]
feature: Add new runtime macro STP_FORCE_STDOUT_TTY to override STP_STDOUT_NOT_ATTY.

9 months agoPR31011: fixed memory leaks when the -t option is specified and the stdout stream...
Yichun Zhang (agentzh) [Mon, 30 Oct 2023 05:22:27 +0000 (22:22 -0700)]
PR31011: fixed memory leaks when the -t option is specified and the stdout stream is not a tty.

Also add a new runtime macro STP_TIMING_NSECS to report probe timing stats in
nsecs instead of cycles.

9 months agoNEWS, testsuite/README: simplify, clarify
Frank Ch. Eigler [Mon, 30 Oct 2023 18:13:36 +0000 (14:13 -0400)]
NEWS, testsuite/README: simplify, clarify

Mention the testsuite Makefile changes, and simplify other NEWS.

9 months agoAlign the tapset invocation of _stp_snprint_addr() with commit 367392917
Martin Cermak [Fri, 27 Oct 2023 13:53:49 +0000 (15:53 +0200)]
Align the tapset invocation of _stp_snprint_addr() with commit 367392917

Commit 367392917 adds a new 'context' parameter to _stp_snprint_addr().
Tapset using this function wasn't updated accordingly.  Fix this now.

9 months agoNEWS: abbreviate debuginfod blurbage
Frank Ch. Eigler [Wed, 25 Oct 2023 19:28:53 +0000 (15:28 -0400)]
NEWS: abbreviate debuginfod blurbage

9 months agotranslator: fix derived-chain elaboration procedure for debuginfod.* probes etc.
Frank Ch. Eigler [Wed, 25 Oct 2023 19:12:52 +0000 (15:12 -0400)]
translator: fix derived-chain elaboration procedure for debuginfod.* probes etc.

Previous to this patch, stap -L / -p2 mode outputs for debuginfod.*
probes show relatively uninformative process("buildid").* probe
points, losing all track of what archive and especially what program
name those probes were derived from.  Now we more carefully construct
the derivation chain for debuginfod.* probes.  For probes that contain
wildcards in the .archive() or .process() names, an extra intermediate
probe is inserted into the chain, with the wildcards expanded.

stap -vv -L   now lists the entire probe /* derivation chain */.

The heuristic for which canonical name to use rides on a probe-level
flag called "well_formed".  This is intended to bypass those probes
that have wildcards in them, so that the end-user would see the
expanded forms.  However, this flag cannot be computed locally & correctly
because wildcards can pop up at different levels of probe resolution.

Consider: debuginfod.archive("*").process("foo").function("*") When
the debuginfod builder resolves the first wildcard, the probe point is
"well formed" with respect to its wildcards, but another one still
remains down there in .function("*").  So it's not overall "well
formed", just locally.  What we need is something like a
probe-point-component level "well-formed"ness, which aggregates via
"and" overall.  Anyway, that's for later.

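A minimal sketch of the component-level idea described above, not the
translator's actual code: each probe-point component reports whether it
is locally well formed, and the probe point as a whole is well formed
only if all of them are.  The struct, the function names and the choice
of glob characters are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one probe-point component, e.g. process("foo"). */
struct pp_component {
    const char *functor;   /* e.g. "process" */
    const char *argument;  /* e.g. "foo" or "*"; NULL if there is none */
};

/* Locally well formed: the argument contains no glob characters.
 * The character set "*?[" is an assumption for this sketch. */
bool component_well_formed(const struct pp_component *c)
{
    return c->argument == NULL || strpbrk(c->argument, "*?[") == NULL;
}

/* Overall well formed: the logical "and" over all components. */
bool probe_point_well_formed(const struct pp_component *comps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!component_well_formed(&comps[i]))
            return false;
    }
    return true;
}
```

With this shape, a probe point whose archive wildcard has already been
expanded but which still ends in function("*") is reported as not well
formed, even though its first component is.
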
9 months agoAdded test case and documentation to NEWS
Housam Alamour [Mon, 23 Oct 2023 00:09:59 +0000 (20:09 -0400)]
Added test case and documentation to NEWS

9 months agoChange from "package" to "archive" naming convention.
Housam Alamour [Wed, 18 Oct 2023 18:46:16 +0000 (14:46 -0400)]
Change from "package" to "archive" naming convention.

9 months agoconfigury: rework debuginfod searches
Frank Ch. Eigler [Tue, 17 Oct 2023 18:24:59 +0000 (14:24 -0400)]
configury: rework debuginfod searches

Support --with-debuginfod=/PATH mode invocation to point at an
elfutils $prefix (install tree base).  This is necessary to build
metadata query support, which is not yet in system elfutils libraries.

9 months agoPR30803: strip .package() probe-point component from subsidiary probes
Frank Ch. Eigler [Tue, 17 Oct 2023 16:20:22 +0000 (12:20 -0400)]
PR30803: strip .package() probe-point component from subsidiary probes

9 months agoFix for case where empty string is input for .process($archive) argument
Housam Alamour [Thu, 12 Oct 2023 20:40:43 +0000 (16:40 -0400)]
Fix for case where empty string is input for .process($archive) argument

9 months agoFix to search only the archive base filename instead of the archive full path for...
Housam Alamour [Tue, 10 Oct 2023 20:37:17 +0000 (16:37 -0400)]
Fix to search only the archive base filename instead of the archive full path for the package string

9 months agoPR30803: *tapset-debuginfod.cxx
Housam Alamour [Fri, 6 Oct 2023 21:12:19 +0000 (17:12 -0400)]
PR30803: *tapset-debuginfod.cxx
Added package to the parameters.

9 months agoPR30987: Exclude strlcpy and strlcat for glibc 2.38 and newer
William Cohen [Wed, 25 Oct 2023 15:51:17 +0000 (11:51 -0400)]
PR30987: Exclude strlcpy and strlcat for glibc 2.38 and newer

The glibc library added strlcpy and strlcat.  The inlined functions in
runtime/dyninst/linux_defs.h conflicted with the glibc declarations of
those functions in /usr/include/string.h.  Now linux_defs.h only
defines those functions for older versions of glibc that do not include
them.
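
A rough sketch of such a guard, assuming the check is done with glibc's
__GLIBC_PREREQ macro; the exact condition used in linux_defs.h may
differ.  NEED_LOCAL_STRLCPY is a made-up helper macro, and only strlcpy
is shown; strlcat would be guarded the same way.

```c
#include <stddef.h>
#include <string.h>   /* on glibc this also provides __GLIBC_PREREQ */

/* Decide whether the C library lacks strlcpy.  The nested #if avoids
 * expanding __GLIBC_PREREQ on C libraries that do not define it. */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if !__GLIBC_PREREQ(2, 38)
#  define NEED_LOCAL_STRLCPY 1   /* hypothetical helper macro */
# endif
#endif

#ifdef NEED_LOCAL_STRLCPY
/* Fallback for glibc older than 2.38: copy at most size-1 bytes,
 * always NUL-terminate when size > 0, and return strlen(src). */
static inline size_t strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        size_t n = (len >= size) ? size - 1 : len;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
#endif
```

Either way, callers see a single strlcpy: the library's own on glibc
2.38 and newer, the local fallback on older versions.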

9 months agoPR30407: Add DWARF5 support to print_ubacktrace_fileline()
Martin Cermak [Wed, 25 Oct 2023 09:59:12 +0000 (11:59 +0200)]
PR30407: Add DWARF5 support to print_ubacktrace_fileline()

Add DWARF5 support to the tapset function print_ubacktrace_fileline().

DWARF5 comes with a different line number program header compared to
DWARF4, see section 6.2.4 of the DWARF5 standard.  The DWARF5 specific
part of the header is now processed in a new _stp_filename_lookup_5().

This _stp_filename_lookup_5() needs to parse a relatively big chunk of
information from DWARF (paths and file names).  Storing these in
local arrays may cause problems with the stack size, so the storage
was moved out to the context struct (dw_data).  As a consequence,
various callers need to pass the pointer to the context struct
to their callees.  Finally, _stp_filename_lookup_5() needs to see the
context struct at the KO compile time.  Since the context struct is
emitted to the stap_XXXXXX_src.i (within s.up->emit_common_header ())
after the main header file for Linux runtime.h (which includes sym.c),
the _stp_filename_lookup_5() was separated out into sym2.c for later
emission.
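
The storage change can be pictured roughly as follows.  This is a
simplified sketch, not the runtime's code: the struct layout, the size
limits and the function names are all invented for illustration; only
the idea of keeping the large tables in the context (under dw_data) and
threading a context pointer through the callers comes from the
description above.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_DIRS 64      /* hypothetical limits */
#define SKETCH_MAX_PATH 256

/* Large scratch tables live here instead of in local arrays. */
struct sketch_dw_data {
    char dirs[SKETCH_MAX_DIRS][SKETCH_MAX_PATH];
    size_t ndirs;
};

/* Hypothetical stand-in for the per-probe context struct. */
struct sketch_context {
    struct sketch_dw_data dw_data;
};

/* Callee: records a directory name in the context's scratch table. */
int sketch_remember_dir(struct sketch_context *c, const char *dir)
{
    struct sketch_dw_data *d = &c->dw_data;

    if (d->ndirs >= SKETCH_MAX_DIRS || strlen(dir) >= SKETCH_MAX_PATH)
        return -1;              /* table full or name too long */
    strcpy(d->dirs[d->ndirs++], dir);
    return 0;
}

/* Caller: needs no big locals, it only passes the context along. */
int sketch_lookup(struct sketch_context *c, const char *dir)
{
    c->dw_data.ndirs = 0;       /* reset the scratch state per lookup */
    return sketch_remember_dir(c, dir);
}
```

The stack frames of both functions stay small; the 16 KiB table is paid
for once, wherever the context itself is allocated.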

This update comes with a testcase systemtap.base/pr30407.exp.

9 months agoAdd wget requires to systemtap-testsuite for systemtap.http_exporter test.
William Cohen [Mon, 23 Oct 2023 18:42:20 +0000 (14:42 -0400)]
Add wget requires to systemtap-testsuite for systemtap.http_exporter test.

9 months agoAllow sdt_buildid.exp to run with and without DEBUGINFOD_URLS environment var
William Cohen [Fri, 20 Oct 2023 15:29:03 +0000 (11:29 -0400)]
Allow sdt_buildid.exp to run with and without DEBUGINFOD_URLS environment var

The sdt_buildid.exp test would have an error and fail to run if the
environment did not have DEBUGINFOD_URLS.  It now checks that the
DEBUGINFOD_URLS environment variable is available before trying to use
it.

10 months agobuildbot test, ignore
Frank Ch. Eigler [Fri, 13 Oct 2023 02:42:52 +0000 (22:42 -0400)]
buildbot test, ignore

10 months agoPR30401: Address newer s390 kernels that move struct stack_frame
William Cohen [Thu, 12 Oct 2023 03:10:21 +0000 (23:10 -0400)]
PR30401: Address newer s390 kernels that move struct stack_frame

Linux git commit 78c98f907413 moved struct stack_frame from
<asm/processor.h> to a newly created <asm/stacktrace.h>.  As a result,
the struct definition does not get pulled in by the existing
<asm/ptrace.h> include on newer kernels.  Add an autoconf test to
determine whether <asm/stacktrace.h> exists, and use it if it is
available.

10 months agotestsuite: add cve-2023-4911 band-aid
Frank Ch. Eigler [Thu, 12 Oct 2023 17:10:53 +0000 (13:10 -0400)]
testsuite: add cve-2023-4911 band-aid

As seen on TV ^W https://access.redhat.com/security/cve/CVE-2023-4911

10 months agotestsuite: drop busybox test case
Frank Ch. Eigler [Thu, 12 Oct 2023 17:02:39 +0000 (13:02 -0400)]
testsuite: drop busybox test case

This test case (with an old fixed version of busybox) has been a
problem with respect to compatibility with newer libc/kernels,
with readonly source trees, and doesn't seem to contribute much
to test value.  So nuke it all.

10 months agotestsuite: simplify Makefile drivers
Frank Ch. Eigler [Thu, 12 Oct 2023 16:45:09 +0000 (12:45 -0400)]
testsuite: simplify Makefile drivers

It was reported that "make installcheck" broke with commit
218c26a523816.  Investigation pointed at quoting mishaps somewhere in
the Makefile machinery related to parallel / partial testsuite runs.
While this logic was clever and appeared useful for a time, it seems
fragile in practice and not in active use after all.  So let's nuke
all of it.

Moved the environment_sanity.exp test into the systemtap/ subdirectory
to approximately guarantee that it's run first, no Makefile magic
needed.  It still exits dejagnu entirely if it fails.

10 months agoTeach stap-prep to use new versions of dnf (if available)
Martin Cermak [Thu, 12 Oct 2023 14:55:47 +0000 (16:55 +0200)]
Teach stap-prep to use new versions of dnf (if available)

10 months agodocs: mark another timestamp with PRERELEASE
Frank Ch. Eigler [Thu, 12 Oct 2023 01:02:52 +0000 (21:02 -0400)]
docs: mark another timestamp with PRERELEASE

10 months agotestsuite: finish removing dejazilla artifacts, restoring dejagnu *check rc
Frank Ch. Eigler [Wed, 11 Oct 2023 23:44:05 +0000 (19:44 -0400)]
testsuite: finish removing dejazilla artifacts, restoring dejagnu *check rc

Way back during early dejazilla days, the "check-local" target was
needed in order to liberate test results, regardless of pass/failure
of the dejagnu suite.  To make that work, an "execrc" wrapper was
interposed between make and dejagnu/runtest, to turn everything into
a pass rc=0.

Dejazilla support was removed in 2022, so this execrc hack is not needed
any more.  Tests that fail, especially the early
systemtap.base/environment_sanity.exp one, should stand out better
in buildbot reports.

10 months agoRHEL-12499: tweak stap-prep sans-debuginfod notice
Frank Ch. Eigler [Wed, 11 Oct 2023 14:56:14 +0000 (10:56 -0400)]
RHEL-12499: tweak stap-prep sans-debuginfod notice

Try harder not to alarm people.

10 months agoPR27410 cont'd: Tolerate (skip) foreign-architecture binaries via debuginfod
Frank Ch. Eigler [Fri, 6 Oct 2023 14:44:04 +0000 (10:44 -0400)]
PR27410 cont'd: Tolerate (skip) foreign-architecture binaries via debuginfod

With debuginfod path probes, it is easy to refer to a whole slew of
binaries.  That's a good thing, but debuginfod may be so well informed
that it sends stap buildids of foreign-architecture binaries too.
systemtap should skip these guys instead of having a cow.

New common tapsets.cxx code makes architecture mismatch generally a
warning rather than a direct semantic error.  This has effects beyond
the debuginfod based probes, but that should be fine.
Bad-architecture target binaries will just be skipped in other
contexts too.  Tweaked debuginfod.process() builder code makes the
subsidiary buildid-based process probes all optional, as though they
were identified by glob ... which in a manner of speaking they were.

10 months agoEliminate use of kernel's flush_scheduled_work() in systemtap modules
William Cohen [Wed, 27 Sep 2023 14:09:11 +0000 (10:09 -0400)]
Eliminate use of kernel's flush_scheduled_work() in systemtap modules

Kernel git commit 20bdedafd2f63e0ba70991127f9b5c0826ebdb32 turns use
of flush_scheduled_work() into a warning which causes builds of
anything using it to fail because warnings are treated as errors.
Previous users of flush_scheduled_work() in the kernel have been
converted over to use individual workqueues.  Systemtap runtime now
does the same.  It creates and uses its own workqueue to eliminate the
use of flush_scheduled_work().

10 months agoUse TWA_RESUME in the runtime calls to task_work_add
William Cohen [Fri, 15 Sep 2023 19:12:11 +0000 (15:12 -0400)]
Use TWA_RESUME in the runtime calls to task_work_add

Kernel git commit c40e60f00caf18bc382215c79651777eb40f5f9d in the
linux 6.6 kernels causes the implicit conversion of boolean true to
an enum task_work_notify_mode to be flagged, which makes the
systemtap instrumentation compile fail.  Add a config check to use
TWA_RESUME if it is available, or pass in the equivalent true
argument for older kernels.
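
The selection can be sketched as below.  This is a userspace mock-up
under stated assumptions, not the runtime's code: the kernel types and
task_work_add() are replaced by small stand-ins so the fragment
compiles outside a kernel tree, and HYPOTHETICAL_HAVE_TWA_RESUME,
STP_TWA_MODE and stp_task_work_add() are invented names standing in
for whatever the real config check and wrapper are called.

```c
#include <stdbool.h>

/* Stand-ins for kernel declarations; only their shapes matter here. */
enum task_work_notify_mode { TWA_NONE, TWA_RESUME };
struct callback_head { int unused; };
struct task_struct { enum task_work_notify_mode last_mode; };

/* Stub with the newer kernel signature: it just records the mode. */
int task_work_add(struct task_struct *task, struct callback_head *work,
                  enum task_work_notify_mode mode)
{
    (void)work;
    task->last_mode = mode;
    return 0;
}

/* Hypothetical: a real build would get this from the config check. */
#define HYPOTHETICAL_HAVE_TWA_RESUME 1

#ifdef HYPOTHETICAL_HAVE_TWA_RESUME
#define STP_TWA_MODE TWA_RESUME   /* newer kernels: pass the enum value */
#else
#define STP_TWA_MODE true         /* older kernels: boolean notify flag */
#endif

/* Single call site; the mode argument is chosen at compile time. */
int stp_task_work_add(struct task_struct *task, struct callback_head *work)
{
    return task_work_add(task, work, STP_TWA_MODE);
}
```

Because TWA_RESUME has the value 1, both branches request the same
behaviour; the difference is only whether the compiler sees an enum
constant or a boolean at the call site.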
