William Cohen [Tue, 30 Oct 2018 18:20:46 +0000 (14:20 -0400)]
Adjust the BPF translate error report formatting to work on 32-bit architectures
On 32-bit architectures such as arm and i686, some arguments in the
error reporting did not match the %lu or %ld format specifiers.
Cast the arguments and use %llu and %lld so the formatting no longer
varies between 32-bit and 64-bit architectures.
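A minimal sketch of the cast-plus-%llu/%lld idiom described above. The function name format_report and the message text are hypothetical and are not taken from bpf-translate.cxx; only the casting pattern is the point.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: size_t and long have different widths on
   32-bit and 64-bit targets, so cast both to the long long family
   and use %llu / %lld, which are the same everywhere. */
int format_report(char *buf, size_t buflen, size_t count, long offset)
{
    return snprintf(buf, buflen, "%llu insns, offset %lld",
                    (unsigned long long) count, (long long) offset);
}
```

The casts cost nothing at run time and keep -Wformat quiet on both word sizes.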
Serhei Makarov [Wed, 24 Oct 2018 20:04:30 +0000 (16:04 -0400)]
Merge branch 'serhei/bpf_asm' -- kernel_string() tapset and experimental bpf assembler
Note the big comment in bpf-translate.cxx explaining the new assembler.
Major changes:
- Embedded-code assembler
- TODO Embedded-code tapset function call support is incomplete, only enabled for exit()
- TODO Token adjustment for assembler diagnostics needs work.
- Refactor bpf_unparser
- Improved metadata about helpers
- String handling changes
- Misc cleanup/notes
Tapset for kernel_string:
* tapset/bpf/conversions.stp: New file.
(kernel_string): New function.
(kernel_string): New function (err_msg version), in assembly.
(kernel_string_n): New function, in assembly.
* testsuite/systemtap.bpf/bpf_tests/context_vars3.stp: New testcase.
Embedded-code assembler:
* bpf-translate.cxx (bpf_unparser::parse_imm): New function.
(bpf_unparser::parse_asm_stmt): New function.
(bpf_unparser::emit_asm_arg): New function.
(bpf_unparser::parse_reg): Removed.
(bpf_unparser::emit_asm_reg): New function.
(bpf_unparser::get_asm_reg): New function.
(bpf_unparser::emit_asm_opcode): New function.
(bpf_unparser::visit_embeddedcode): Process new assembly format.
(BPF_ASM_DEBUG): New (disabled) macro for diagnostics.
(struct asm_stmt): New structure.
(operator <<): New function -- print logic for asm_stmt.
(is_numeric): New function.
* testsuite/systemtap.bpf/asm_tests/*: New testcases for embedded-code assembler.
* testsuite/systemtap.bpf/bpf-asm.exp: TODO Initial test driver for embedded-code assembler.
TODO Embedded-code tapset function call support is incomplete, only enabled for exit():
* bpf-translate.cxx (translate_bpf_pass): Pass systemtap_session to assembler globals.
* bpf-internal.h (globals::session): New field to pass systemtap_session to assembler.
TODO Token adjustment for assembler diagnostics needs work:
* parse.h (token::adjust_location): New function.
Refactor bpf_unparser:
* bpf-translate.cxx (bpf_unparser::emit_functioncall): New function.
(bpf_unparser::visit_functioncall): Use emit_functioncall.
(print_format_add_tag): New function on std::string.
(bpf_unparser::emit_print_format): New function.
(bpf_unparser::visit_print_format): Use print_format_add_tag, emit_print_format.
Improved metadata about helpers:
* bpf-base.cxx (bpf_func_name_map): New structure -- id->name map.
(bpf_func_id_map): New structure -- name->id map.
(init_bpf_helper_tables): New function -- populate name->id and id->name map.
(bpf_function_name): Change to use the maps.
(bpf_function_id): New function -- map from name to helper id.
(bpf_function_nargs): TODO Still need to expand the list of helpers.
* bpf-translate.cxx (translate_bpf_pass): Call init_bpf_helper_tables to populate info.
* bpf-internal.h (init_bpf_helper_tables): New function.
(bpf_function_id): New function.
(__STAPBPF_FUNC_MAPPER): New macro -- like __BPF_FUNC_MAPPER for userspace-only helpers.
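As an illustration of the mapper-macro approach to helper metadata, the sketch below builds an id->name table and a name->id lookup from a single list. All DEMO_/demo_ names are hypothetical, the list is shortened to four entries, and a linear search stands in for whatever map structure bpf-base.cxx actually uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper list, written in the style of the kernel's
   __BPF_FUNC_MAPPER: one FN(...) entry per helper. */
#define DEMO_FUNC_MAPPER(FN) \
    FN(unspec),              \
    FN(map_lookup_elem),     \
    FN(map_update_elem),     \
    FN(map_delete_elem)

/* Expand the list once into enum constants ... */
#define DEMO_ENUM_FN(x) DEMO_FUNC_##x
enum demo_func_id { DEMO_FUNC_MAPPER(DEMO_ENUM_FN), DEMO_FUNC_MAX };
#undef DEMO_ENUM_FN

/* ... and once more into the matching name strings. */
#define DEMO_NAME_FN(x) #x
static const char *const demo_func_names[] = { DEMO_FUNC_MAPPER(DEMO_NAME_FN) };
#undef DEMO_NAME_FN

/* id -> name; returns NULL for an unknown id. */
const char *demo_function_name(int id)
{
    if (id < 0 || id >= DEMO_FUNC_MAX)
        return NULL;
    return demo_func_names[id];
}

/* name -> id; returns -1 for an unknown name. */
int demo_function_id(const char *name)
{
    for (int id = 0; id < DEMO_FUNC_MAX; id++)
        if (strcmp(demo_func_names[id], name) == 0)
            return id;
    return -1;
}
```

Because both tables come from the same list, adding a helper in one place keeps the two directions consistent.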
String handling changes:
* bpf-translate.cxx (bpf_unparser::emit_literal_str): New function.
(bpf_unparser::visit_literal_str): Use emit_literal_str.
(emit_simple_literal_str): Renamed from emit_literal_str.
(bpf_unparser::emit_string_copy): Renamed from emit_copied_str;
rename emit_literal_str to emit_simple_literal_str.
(bpf_unparser::emit_str_arg): Rename emit_copied_str to emit_string_copy.
(translate_escapes): Takes a const string now.
* bpf-opt.cxx (alloc_literal_str): Rename emit_literal_str to emit_simple_literal_str.
* bpf-internal.h (emit_simple_literal_str): Renamed from emit_literal_str.
Misc cleanup/notes:
* bpf-internal.h (BPF_MAXSTRINGLEN): TODO Someday this will be increased.
(program::use_tmp_space): Assert to catch miscalculations.
* tapset/logging.stp (abort): TODO Could abort immediately with assembly in future.
Serhei Makarov [Tue, 23 Oct 2018 17:35:08 +0000 (13:35 -0400)]
stapbpf assembler WIP #6 :: other call functions ({s}printf and tapset)
Only very limited support for tapset functions (restricted to exit()
for now) due to the difficulty of resolving symbols after the semantic
pass is already completed. Could address this in the future.
* bpf-internal.h (program::use_tmp_space): check for overflow.
(globals::session): new field for systemtap_session (used by function lookup).
* bpf-translate.cxx (asm_stmt::has_jmp_target): new field.
(operator <<): printing rules for alloc, call.
(bpf_unparser::parse_asm_stmt): remove printf/error, BUGFIX alloc, call, string literal.
Also calculate has_jmp_target in the resulting stmt.
(bpf_unparser::visit_embeddedcode): handle printf, sprintf and exit().
Also fix the way fallthrough fields are populated to avoid spurious extra jump.
(bpf_unparser::emit_functioncall): new function. Factors out non-staptree code.
(bpf_unparser::visit_functioncall): use new emit_functioncall().
(print_format_add_tag): new function on std::string. Factors out string operations.
(bpf_unparser::emit_print_format): new function. Factors out non-staptree code.
(bpf_unparser::visit_print_format): use new emit_print_format().
(translate_bpf_pass): store session in globals.
Jafeer Uddin [Tue, 23 Oct 2018 19:19:29 +0000 (15:19 -0400)]
PR21080: support added for new pkey_* syscalls
* sysc_pkey_*.stp: new syscall probes
* aux_syscalls.stp: add new function to convert init_val to PKEY_DISABLE_[ACCESS|WRITE]
* compat_unistd.h: add new syscall numbers
* pkey.c: tests for new syscall
Stan Cox [Tue, 23 Oct 2018 02:29:32 +0000 (22:29 -0400)]
Use NSS_InitContext instead of NSS_Init.
* nsscommon.cxx (nssInitContext): New function which allows
multiple nss invocations. Change all callers except where
write access is required.
(nssCleanup): Add context parameter. Change all callers.
* client-http.cxx (download): Add cleanup parameter to choose
curl_easy_cleanup. Change all callers.
(download_pem_cert): If CURLINFO_CERTINFO fails then retrieve cert from server.
(find_and_connect_to_server): If the database cert fails then try again
with retrieved server cert.
(fill_in_server_info): Likewise.
* testsuite/systemtap.http_server/server_trust.exp: New test.
For demos like also_ran.stp, the incoming strings are usually already
quoted. Re-quoting them for prometheus labeling is counterproductive,
so we now offer an option to bypass that string_quoted() wrapping.
also_ran.stp updated.
William Cohen [Fri, 19 Oct 2018 18:59:27 +0000 (14:59 -0400)]
Use cast to make c->cycles_sum always match the %lld format.
On aarch64 and ppc64le cycles_t is a slightly different type than on
x86_64 and does not match the %lld format. Cast c->cycles_sum
to (long long) to avoid the compile failing on aarch64 and
ppc64le with the following message:
**** failed systemtap kernel-devel smoke test:
/tmp/stapzubPmR/stap_e418199b88a6f8adf13a14e064ae79da_1403_src.c: In function '_stp_hrtimer_notify_function':
/tmp/stapzubPmR/stap_e418199b88a6f8adf13a14e064ae79da_1403_src.c:477:45: error: format '%lld' expects argument of type 'long long int', but argument 2 has type 'cycles_t' {aka 'long unsigned int'} [-Werror=format=]
_stp_error ("probe overhead (%lld cycles) exceeded threshold (%lld cycles) in last %lld cycles", c->cycles_sum, STP_OVERLOAD_THRESHOLD, STP_OVERLOAD_INTERVAL);
~~~^ ~~~~~~~~~~~~~
%ld
cc1: all warnings being treated as errors
Hou Tao [Mon, 10 Sep 2018 11:46:27 +0000 (19:46 +0800)]
Fix searching of kernel_source_tree for kernel built with O option
When generating a kernel module for a systemtap script that uses a
tracepoint probe, if the vanilla kernel was built with the O=build_path
option and the r=build_path option is passed to stap, stap is not able
to find kernel_source_tree and fails in pass 2.
For out-of-source builds, the Linux kernel has created a symlink named
"source" pointing to the source tree since commit 399b835be30e ("kbuild:
add a symlink to the source for separate objdirs"), so fix the problem by
checking whether the symlink exists and using it as the kernel_source_tree.
Also use a new helper dir_exists() instead of file_exists() to
ensure that the source tree directory exists.
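A minimal sketch of what a dir_exists() check can look like, assuming a POSIX stat() call. This is not the helper from the systemtap sources, only an illustration of why a directory test differs from a plain existence test.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Assumed implementation: true only if the path resolves to a
   directory. stat() follows symlinks, so a "source" symlink that
   points at the kernel source tree also passes the check. */
bool dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A regular file or a dangling symlink at the same path would satisfy a bare existence check but fails this one.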
William Cohen [Fri, 12 Oct 2018 19:02:03 +0000 (15:02 -0400)]
Add fallback __NR_fork define
The aarch64 architecture doesn't have a fork syscall, so the code
needs to have a fallback __NR_fork define for the code to be
successfully compiled on aarch64.
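The fallback can be sketched as a guarded define. The placeholder value -1 and the helper have_fork_syscall() are assumptions made for illustration; the value used by the actual commit is not stated here.

```c
#include <stdbool.h>
#include <sys/syscall.h>

/* Assumption: use -1 as a "no such syscall" placeholder when the
   architecture headers (e.g. on aarch64) do not define __NR_fork. */
#ifndef __NR_fork
#define __NR_fork (-1)
#endif

/* Hypothetical helper: lets callers skip fork-specific handling
   on architectures where only the placeholder is present. */
bool have_fork_syscall(void)
{
    return __NR_fork >= 0;
}
```

Code that compares a syscall number against __NR_fork then compiles everywhere and simply never matches on architectures without fork.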
Frank Ch. Eigler [Fri, 12 Oct 2018 18:33:04 +0000 (14:33 -0400)]
RHBZ1638874 workaround: let $SYSTEMTAP_SIGN override missing secureboot hint
On some Fedora kernels, EFI SecureBoot involves activating a lockdown
mode that prevents unsigned .ko's from loading. It doesn't appear to
give any userlevel sign that this is happening, so stap couldn't
activate its secureboot / stap-server machinery.
Until the kernel does inform us properly, we let users set an
environment variable to force UEFI MOK module signing.
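The override logic can be sketched as follows. The function names are hypothetical, and treating any set value of SYSTEMTAP_SIGN as "force signing" is an assumption; the real check in stap may interpret the value differently.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical decision function: sign modules if SecureBoot was
   detected, or if the override variable is set at all (assumption). */
bool want_module_signing(bool secureboot_detected, const char *sign_env)
{
    return secureboot_detected || sign_env != NULL;
}

/* Convenience wrapper that reads the override from the environment. */
bool want_module_signing_from_env(bool secureboot_detected)
{
    return want_module_signing(secureboot_detected,
                               getenv("SYSTEMTAP_SIGN"));
}
```

Passing the environment value in as a parameter keeps the decision itself easy to test without modifying the process environment.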
William Cohen [Thu, 11 Oct 2018 18:05:46 +0000 (14:05 -0400)]
Adjust eatmydata.stp to work with the newer Linux kernels and tapsets
The revised tapsets generally do not use the dwarf-based kprobes and
kretprobes for syscalls on Linux 4.17 and newer kernels. As a result,
target variables are not available and guru mode will not work with
them. The eatmydata.stp example has been modified to instrument the
common do_fsync function that both syscall.fsync and syscall.fdatasync
call. Target variables and guru mode work on this function.
William Cohen [Thu, 11 Oct 2018 17:41:20 +0000 (13:41 -0400)]
Adjust iotime.stp to work with newer Linux kernels and systemtap tapset
Newer systems may use the openat syscall instead of the open syscall,
so the openat syscall needs to be monitored. The probe points used by
systemtap for the syscalls do not have the target variables available
via @entry(), so iotime.stp now explicitly stores the values
available on syscall entry in associative arrays to make them
available to the syscall return probes.
William Cohen [Wed, 10 Oct 2018 18:08:19 +0000 (14:08 -0400)]
Avoid using target variable and @entry() in syscall.*.return for futex examples
The update to the syscall tapsets causes SystemTap to use non-dwarf
based probe points for syscalls on linux-4.17 and newer. The
non-dwarf based syscall instrumentation does not have the target
variables available, so the @entry() operation is not going to work.
The futex examples are revised to store the needed data in
associative arrays at syscall entry and access that data in the
syscall return.
rhbz1629623: adapt to CONFIG_HAVE_ARCH_PREL32_RELOCATIONS tracepoint enumeration
Kernel commit 46e0c9be206fa7b (4.19-rc1) makes the mod->tracepoints_ptrs[]
contain pointer-relative-offsets instead of normal pointers.
We must follow suit to avoid a crash.
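The idea of a pointer-relative offset can be shown in userspace. The names rel32_encode and rel32_deref are hypothetical; this is a sketch of the encoding, not the kernel's or the systemtap runtime's code.

```c
#include <stdint.h>

/* Store the distance from the slot itself to the target in a
   32-bit signed value. Assumes both live within +/-2 GiB of each
   other, as is the case inside one module image. */
int32_t rel32_encode(const int32_t *slot, const void *target)
{
    return (int32_t)((intptr_t)target - (intptr_t)slot);
}

/* Recover the real pointer: the address of the slot plus the
   offset stored in it. Reading the slot as a plain pointer
   instead would yield a bogus address. */
void *rel32_deref(const int32_t *slot)
{
    return (void *)((intptr_t)slot + *slot);
}
```

Code that walks such an array has to step in units of the 32-bit slot and convert each entry before use, which is why treating the entries as normal pointers crashes.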
William Cohen [Thu, 27 Sep 2018 14:48:14 +0000 (10:48 -0400)]
Change the tapset file name to get matching tapset::syscall_any manpage
The automated generation of man pages uses the file name to generate
the manpage names. Adjusted the name to get a matching tapset man
page for syscall_any{.return} probe points.
William Cohen [Fri, 21 Sep 2018 02:50:16 +0000 (22:50 -0400)]
Convert the various systemtap examples to use the syscall any tapset
To make the examples cleaner use the new syscall any tapset. This avoids
exposing the systemtap internal function _stp_syscall_nr() and makes the
instrumentation look a bit more like the traditional syscall.* probe points.
William Cohen [Fri, 21 Sep 2018 00:33:17 +0000 (20:33 -0400)]
Add the syscall_any and syscall_any.return probe points
The syscall.*{.return} and np_syscall.*.{.return} end up expanding to
a large amount of code that takes a significant amount of time to
compile. The resulting kernel module also takes a fair amount of time
to install and remove the instrumentation when it starts and shuts
down. For instrumentation that doesn't really care about the details of
the syscall arguments it would be preferable to use the sys_enter and
sys_exit tracepoints to more efficiently probe the one or two places.
Using tp_syscall.*{.return} ends up generating a lot of code to
determine which of the hundreds of syscalls is being used and then runs
the same handler. The syscall_any and syscall_any.return probe points
eliminate that undesired overhead by just looking up the syscall name
in a table.
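The table lookup can be sketched as below. The table is a hypothetical four-entry excerpt using x86_64 syscall numbers, and syscall_name() is an illustrative name rather than the tapset's actual function.

```c
#include <stddef.h>

/* Hypothetical excerpt of a syscall-number -> name table
   (x86_64 numbering); a real table has hundreds of entries. */
static const char *const syscall_names[] = {
    [0] = "read",
    [1] = "write",
    [2] = "open",
    [3] = "close",
};

/* One bounds-checked array lookup replaces per-syscall probe code. */
const char *syscall_name(long nr)
{
    size_t n = sizeof(syscall_names) / sizeof(syscall_names[0]);

    if (nr < 0 || (size_t)nr >= n || syscall_names[nr] == NULL)
        return "unknown";
    return syscall_names[nr];
}
```

A single handler attached to sys_enter or sys_exit can call such a lookup with the syscall number it receives, instead of relying on one generated probe per syscall.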
William Cohen [Mon, 17 Sep 2018 20:58:09 +0000 (16:58 -0400)]
Use sys_enter and sys_exit tracepoints in place of syscall.*{.return}
The common probe point idiom of syscall.* and syscall.*.return can be
replaced with equivalent sys_enter and sys_exit tracepoints for a
number of the example scripts. The advantages are:
-Quicker compilation of the script into instrumentation
-Smaller kernel modules for the instrumentation
-Lower overhead for probe points
These changes are not applicable to all uses of syscall.* and
syscall.*.return. Predefined variables such as argstr are not
available for the sys_enter and sys_exit tracepoints.
Some of the revised examples use the internal _stp_syscall_nr()
function. A user-visible version of this function should be made
available.
PR23666 Fix a bug in semantic analysis of aggregate operators in foreach sorting
When aggregate operators such as @count and @sum were used in the
foreach loop sorting criteria but not in the foreach loop body, the
translator did not respect these sorting criteria in the generated
code.
This bug affected both the kernel and dyninst runtime modes.
William Cohen [Fri, 14 Sep 2018 18:13:08 +0000 (14:13 -0400)]
Ignore the error value returned by the find command for the slowvfs.stp
Some files in /proc are unreadable by normal users. When the find
command encounters these files, it returns a non-zero value on exit.
The test only uses find to create some load and needs to discard the
error result; otherwise the install test for slowvfs.stp will fail.
PR23160,PR14690: convert syscall.*.return aliases to @SYSC_SETRETVAL retval/retstr
Adjust all syscall.*.return aliases to a new macro for provision of new
retval, old retstr, and the temporary returnval() compatibility hack.
All hail /bin/sed, mother of /bin/ed, which made this operation bearable.
PR23160,PR14690: prep returnval() for an extra side-channel of data
To permit tracepoint-based syscall probe-aliases to provide return values
to scripts, returnval() needs an extension. This patch adds a pair of
new values to the context, conditional on version <= 4.0. (The retval
value will be the next better approach, coming in a followup patch.)
testsuite: prepare for working tp_syscall.exp suite
There were some typos in the driver .stp script that precluded
operation, and exposed a latent bug in how another .exp file
transcribed outputs into the log file.
Revert "tapset/errno.stp: learn about CONTEXT->sregs"
This reverts commit 8038562cafe852681eda1a45c02b5f76070d3dec.
jafeer and wcohen rightly note that this papered over the real
problem, which is that a function like returnval() has no way
of accessing tracepoint parameters like sys_exit's $ret,
even if given CONTEXT->sregs. Need to rethink.