Bug 25265

Summary: tapscripts using ustack, ubacktrace etc fail to compile on kernel 5.3
Product: systemtap
Reporter: Craig Ringer <craig.ringer>
Component: runtime
Assignee: Unassigned <systemtap>
Status: RESOLVED FIXED
Severity: normal
CC: fche
Priority: P2
Version: unspecified
Target Milestone: ---
See Also: https://sourceware.org/bugzilla/show_bug.cgi?id=25266
https://sourceware.org/bugzilla/show_bug.cgi?id=25267
Host:
Target:
Build:
Last reconfirmed:
Attachments: Fix invalid prototype in autoconf-stack-trace-save-regs.c test

Description Craig Ringer 2019-12-10 05:42:23 UTC
Latest systemtap doesn't appear to produce correct code for tapscripts that use the ustack(), ubacktrace() or print_ubacktrace() functions when running on kernel 5.3.

Observed on Fedora 31 with

$ lsb_release  -a
LSB Version:	:core-4.1-amd64:core-4.1-noarch
Distributor ID:	Fedora
Description:	Fedora release 31 (Thirty One)
Release:	31
Codename:	ThirtyOne

$ uname -r
5.3.7-301.fc31.x86_64

$ git describe --tags
release-4.2-6-g0c5c0f434

Looks related to issue https://sourceware.org/bugzilla/show_bug.cgi?id=24923

Also seen with the current systemtap bundled in Fedora 31, systemtap-4.2-1.fc31.x86_64 .

Reported against Fedora as https://bugzilla.redhat.com/show_bug.cgi?id=1781471

Reproduce with:

sudo stap -v -e 'probe process("/lib64/libc.so.6").function("fsync") { print_ubacktrace(); }'

```
Pass 1: parsed user script and 476 library scripts using 305652virt/88592res/6272shr/82584data kb, in 210usr/30sys/244real ms.
Pass 2: analyzed script: 1 probe, 1 function, 0 embeds, 0 globals using 309084virt/92976res/7216shr/86016data kb, in 10usr/0sys/13real ms.
Pass 3: translated to C into "/tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c" using 309084virt/93232res/7472shr/86016data kb, in 50usr/120sys/181real ms.
In file included from /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c:85:
/usr/local/share/systemtap/runtime/stack.c:66:14: error: ‘struct stack_trace’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
   66 |       struct stack_trace *trace);
      |              ^~~~~~~~~~~
/usr/local/share/systemtap/runtime/stack.c: In function ‘_stp_stack_print_fallback’:
/usr/local/share/systemtap/runtime/stack.c:202:21: error: storage size of ‘trace’ isn’t known
  202 |  struct stack_trace trace;
      |                     ^~~~~
In file included from /usr/local/share/systemtap/runtime/unwind.c:16,
                 from /usr/local/share/systemtap/runtime/linux/runtime.h:255,
                 from /usr/local/share/systemtap/runtime/runtime.h:26,
                 from /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c:27:
/usr/local/share/systemtap/runtime/unwind/unwind.h: In function ‘read_ptr_sect’:
/usr/local/share/systemtap/runtime/unwind/unwind.h:146:20: error: this statement may fall through [-Werror=implicit-fallthrough=]
  146 |   if (!compat_task || (compat_task && (tableSize == 4 || tableSize == 0)))
      |       ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/share/systemtap/runtime/unwind/unwind.h:157:2: note: here
  157 |  case DW_EH_PE_data8:
      |  ^~~~
In file included from ./include/asm-generic/bug.h:5,
                 from ./arch/x86/include/asm/bug.h:83,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/mmdebug.h:5,
                 from ./include/linux/gfp.h:5,
                 from /usr/local/share/systemtap/runtime/linux/runtime_defines.h:20,
                 from /usr/local/share/systemtap/runtime/runtime_defines.h:8,
                 from /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c:11:
./include/linux/compiler.h:328:5: error: this statement may fall through [-Werror=implicit-fallthrough=]
  328 |  do {        \
      |     ^
./include/linux/compiler.h:338:2: note: in expansion of macro ‘__compiletime_assert’
  338 |  __compiletime_assert(condition, msg, prefix, suffix)
      |  ^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’
  350 |  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
      |  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
   39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
      |                                     ^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
   50 |  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
      |  ^~~~~~~~~~~~~~~~
/usr/local/share/systemtap/runtime/unwind/unwind.h:158:3: note: in expansion of macro ‘BUILD_BUG_ON’
  158 |   BUILD_BUG_ON(sizeof(u64) != sizeof(value));
      |   ^~~~~~~~~~~~
In file included from /usr/local/share/systemtap/runtime/unwind.c:16,
                 from /usr/local/share/systemtap/runtime/linux/runtime.h:255,
                 from /usr/local/share/systemtap/runtime/runtime.h:26,
                 from /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c:27:
/usr/local/share/systemtap/runtime/unwind/unwind.h:163:2: note: here
  163 |  case DW_EH_PE_absptr:
      |  ^~~~
In file included from /usr/local/share/systemtap/runtime/linux/runtime.h:255,
                 from /usr/local/share/systemtap/runtime/runtime.h:26,
                 from /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.c:27:
/usr/local/share/systemtap/runtime/unwind.c: In function ‘processCFI’:
/usr/local/share/systemtap/runtime/unwind.c:519:8: error: this statement may fall through [-Werror=implicit-fallthrough=]
  519 |     if (compat_task) {
      |        ^
/usr/local/share/systemtap/runtime/unwind.c:531:4: note: here
  531 |    case DW_CFA_def_cfa_offset:
      |    ^~~~
/usr/local/share/systemtap/runtime/unwind.c:543:8: error: this statement may fall through [-Werror=implicit-fallthrough=]
  543 |     if (compat_task) {
      |        ^
/usr/local/share/systemtap/runtime/unwind.c:553:4: note: here
  553 |    case DW_CFA_def_cfa_offset_sf:
      |    ^~~~
cc1: all warnings being treated as errors
make[1]: *** [scripts/Makefile.build:280: /tmp/stapbc7qLa/stap_e7d835b94419e48f6773a1ccc34b4fb6_1420_src.o] Error 1
make: *** [Makefile:1630: _module_/tmp/stapbc7qLa] Error 2
WARNING: kbuild exited with status: 2
Pass 4: compiled C into "stap_e7d835b94419e48f6773a1ccc34b4fb6_1420.ko" in 10710usr/1850sys/12803real ms.
Pass 4: compilation failed.  [man error::pass4]
```
Comment 1 Craig Ringer 2019-12-10 06:05:48 UTC
This probably relates to this kernel patch: https://patchwork.kernel.org/patch/10916651/ or the series it's part of like https://patchwork.kernel.org/patch/10916613/ 

I wonder if this is a redhat-ism (some local patch).

The runtime looks like it already understands that "struct stack_trace" went away in Linux 5.2, given runtime/linux/autoconf-stack-trace-save-regs.c and the ifdef for STAPCONF_STACK_TRACE_SAVE_REGS in runtime/stack.c .

I checked the generated module with stap -k. The generated header stapconf_458f21c1e2c146ca5cc99e95113a4f8b_799.h does not contain STAPCONF_STACK_TRACE_SAVE_REGS .
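
To illustrate why a missing define produces the `struct stack_trace` errors above, here is a minimal userspace sketch of the feature-macro pattern. It is not the actual contents of runtime/stack.c: the function name `selected_stack_api` and the returned labels are hypothetical, and in a real build the macro would come from the generated stapconf header rather than being set by hand.

```c
/* Sketch only: a userspace stand-in for the #ifdef gating in the runtime.
 * In a real module build, STAPCONF_STACK_TRACE_SAVE_REGS would be defined
 * (or not) by the generated stapconf_*.h header. Here it is left undefined
 * to mimic the failed configure test. */
/* #define STAPCONF_STACK_TRACE_SAVE_REGS 1 */

/* Hypothetical helper: reports which code path the preprocessor selected. */
const char *selected_stack_api(void);

const char *selected_stack_api(void)
{
#ifdef STAPCONF_STACK_TRACE_SAVE_REGS
    /* Newer-kernel path: no struct stack_trace needed. */
    return "stack_trace_save_regs";
#else
    /* Legacy path: relies on struct stack_trace, which newer kernels no
     * longer expose, hence the "storage size isn't known" error. */
    return "struct stack_trace";
#endif
}
```

With the define absent, the legacy branch is compiled, which matches the failure seen in pass 4.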

Tweaking the Makefile so it doesn't swallow output of the configure tests (surely those should go to a log?) shows the following error:

```
make -f ./scripts/Makefile.build obj=/tmp/stapsiNon3  /tmp/stapsiNon3/stap_767845_src.i
/usr/local/share/systemtap/runtime/linux/autoconf-stack-trace-save-regs.c:3:14: error: function declaration isn’t a prototype [-Werror=strict-prototypes]
    3 | unsigned int foo ()
      |              ^~~
cc1: all warnings being treated as errors
```

When I fix that by adding a prototype to /usr/local/share/systemtap/runtime/linux/autoconf-stack-trace-save-regs.c 

```
unsigned int foo(void);
```

and remove the generated header then re-make, the generated header now includes STAPCONF_STACK_TRACE_SAVE_REGS:

```
/tmp/stapsiNon3# grep -r STAPCONF_STACK_TRACE_SAVE_REGS
stapconf_458f21c1e2c146ca5cc99e95113a4f8b_799.h:#define STAPCONF_STACK_TRACE_SAVE_REGS 1
```

... and the build fails at a later step due to `-Werror`.

So in short, the configure test fails due to `-Werror` and a missing prototype, causing the runtime to fail to detect the new stack API in the kernel.
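
For reference, a minimal sketch of the shape of the fix. The body of `foo` here is a placeholder, not the real test file, which calls into kernel headers.

```c
/* Sketch only: mirrors the shape of a configure-test source file.
 * The function body is a placeholder; the real test exercises a kernel API. */

/* An explicit (void) parameter list makes this a true prototype, so
 * -Wstrict-prototypes has nothing to complain about. */
unsigned int foo(void);

unsigned int foo(void)
{
    return 0; /* placeholder result */
}
```

Building this with `gcc -c -Wstrict-prototypes -Werror` should pass, whereas a bare `unsigned int foo ()` trips the error shown above on compilers that use pre-C23 semantics for empty parameter lists.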
Comment 2 Craig Ringer 2019-12-10 06:34:31 UTC
A workaround if you just want to use stap is to patch `buildrun.cxx` as follows

```
diff --git a/buildrun.cxx b/buildrun.cxx
index 505902bc5..b29eeb797 100644
--- a/buildrun.cxx
+++ b/buildrun.cxx
@@ -235,6 +235,7 @@ compile_dyninst (systemtap_session& s)
       "gcc", "--std=gnu99", s.translated_source, "-o", module,
       "-fvisibility=hidden", "-O2", "-I" + s.runtime_path, "-D__DYNINST__",
       "-Wall", WERROR, "-Wno-unused", "-Wno-strict-aliasing",
+      "-Wno-error=implicit-fallthrough", "-Wno-error=strict-prototypes",
       "-pthread", "-lrt", "-fPIC", "-shared",
     };
 
```

then recompile and reinstall.
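
Note that `-Wno-error=implicit-fallthrough` only downgrades the diagnostic. An alternative for intentional fall-through is to annotate it in the source. The following is a hypothetical standalone example, not systemtap code; it assumes a compiler that supports GCC's `fallthrough` statement attribute (GCC 7 or later).

```c
/* Hypothetical example: a switch where one case deliberately falls
 * through when a condition is not met, similar in shape to the code
 * flagged in the build log. */
int value_bits(int width_bytes, int is_compat);

int value_bits(int width_bytes, int is_compat)
{
    int bits;

    switch (width_bytes) {
    case 4:
        if (is_compat) {
            bits = 32;
            break;
        }
        /* Intentional: non-compat 4-byte values share the 8-byte path. */
        __attribute__((fallthrough));
    case 8:
        bits = 64;
        break;
    default:
        bits = -1; /* unsupported width */
        break;
    }
    return bits;
}
```

The attribute tells the compiler the missing `break` is deliberate, so `-Werror=implicit-fallthrough=` no longer fires for that case.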
Comment 3 Craig Ringer 2019-12-10 06:51:17 UTC
Created attachment 12116: Fix invalid prototype in autoconf-stack-trace-save-regs.c test
Comment 4 Craig Ringer 2019-12-10 06:51:55 UTC
Fix title. I have no idea why I wrote 3.4 instead of 5.3.
Comment 5 Craig Ringer 2019-12-29 05:33:33 UTC
Any thoughts on applying this bugfix patch?
Comment 6 Frank Ch. Eigler 2019-12-29 19:52:57 UTC
thanks, merged!