Bug 11352

Summary: sdt.exp kprobe tests FAIL with SEGV
Product: systemtap Reporter: Mark Wielaard <mark>
Component: runtimeAssignee: Josh Stone <jistone>
Status: RESOLVED FIXED    
Severity: normal CC: jistone, scox
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Mark Wielaard 2010-03-06 15:46:44 UTC
New x86_64 fedora 12 kernel based on 2.6.23.9 makes sdt.exp kprobes fail with a
SEGV in the child process:

FAIL: sdt "" kprobe
FAIL: sdt c++ "" kprobe
FAIL: sdt additional_flags=-std=gnu89 kprobe
FAIL: sdt c++ additional_flags=-std=gnu89 kprobe
FAIL: sdt additional_flags=-ansi kprobe
FAIL: sdt c++ additional_flags=-ansi kprobe
FAIL: sdt additional_flags=-pedantic kprobe
FAIL: sdt c++ additional_flags=-pedantic kprobe
FAIL: sdt additional_flags=-ansi additional_flags=-pedantic kprobe
FAIL: sdt c++ additional_flags=-ansi additional_flags=-pedantic kprobe
FAIL: sdt additional_flags=-O2 kprobe
FAIL: sdt c++ additional_flags=-O2 kprobe
FAIL: sdt additional_flags="-O3" kprobe
FAIL: sdt c++ additional_flags="-O3" kprobe
FAIL: sdt_misc (0) kprobe

All non-kprobe based variants PASS. All kprobe based variants fail similarly:

Executing on host: gcc /home/mark/src/systemtap/testsuite/systemtap.base/sdt.c  
-g -I/home/mark/src/systemtap/testsuite/../includes -Wall -Wextra -Werror -DEXPE
RIMENTAL_KPROBE_SDT  -lm   -o sdt.c.exe.0    (timeout = 300)
PASS: compiling sdt.c "" kprobe
executing: stap /home/mark/src/systemtap/testsuite/systemtap.base/sdt.stp
sdt.c.exe.0 -c ./sdt.c.exe.0
FAIL: sdt "" kprobe
line 1: expected "1"
Got "child killed: segmentation violation"
Executing on host: gcc /home/mark/src/systemtap/testsuite/systemtap.base/sdt.c 
-g -I/home/mark/src/systemtap/testsuite/../includes -Wall -Werror -x c++
-DEXPERIMENTAL_KPROBE_SDT  -lm   -o sdt.cxx.exe.0    (timeout = 300)
PASS: compiling sdt.c c++ "" kprobe
executing: stap /home/mark/src/systemtap/testsuite/systemtap.base/sdt.stp
sdt.cxx.exe.0 -c ./sdt.cxx.exe.0
FAIL: sdt c++ "" kprobe
line 1: expected "1"
Got "child killed: segmentation violation"
Comment 1 Mark Wielaard 2010-03-06 15:56:27 UTC
The crash is in stap itself, has nothing to do with kernel (upgrade).

Program received signal SIGSEGV, Segmentation fault.
operator<< (o=..., t=...) at /home/mark/src/systemtap/parse.cxx:104
104	  o << tt2str(t.type);
(gdb) bt
#0  operator<< (o=..., t=...) at /home/mark/src/systemtap/parse.cxx:104
#1  0x000000000047b2d2 in std::basic_string<char, std::char_traits<char>,
std::allocator<char> > lex_cast_qstring<token>(token const&) ()
#2  0x00000000004617f3 in c_unparser::record_actions (this=0x7fffffffbd20, 
    actions=<value optimized out>, tok=0x0, update=<value optimized out>)
    at /home/mark/src/systemtap/translate.cxx:2332
#3  0x0000000000461c23 in c_unparser::visit_next_statement (
    this=0x7fffffffbd20, s=0x23415e0)
    at /home/mark/src/systemtap/translate.cxx:2874
#4  0x0000000000462513 in c_unparser::visit_if_statement (this=0x7fffffffbd20, 
    s=0x2341530) at /home/mark/src/systemtap/translate.cxx:2444
#5  0x0000000000457235 in c_unparser::visit_block (this=0x7fffffffbd20, 
    s=0x2341500) at /home/mark/src/systemtap/translate.cxx:2350
#6  0x0000000000464df4 in c_unparser::emit_probe (this=0x7fffffffbd20, v=
    0x2335e80) at /home/mark/src/systemtap/translate.cxx:1702
#7  0x0000000000460732 in translate_pass (s=...)
    at /home/mark/src/systemtap/translate.cxx:5418
#8  0x0000000000410012 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at /home/mark/src/systemtap/main.cxx:1354
Comment 2 Mark Wielaard 2010-03-06 16:02:41 UTC
go up some frames and find a lex_cast_qstring(NULL) that triggers this (because
tok == NULL):

#2  0x00000000004617f3 in c_unparser::record_actions (this=0x7fffffffbd20, 
    actions=<value optimized out>, tok=0x0, update=<value optimized out>)
    at /home/mark/src/systemtap/translate.cxx:2332
2332	      o->newline() << "c->last_stmt = " << lex_cast_qstring(*tok) << ";";
(gdb) list
2327	  if ((update && action_counter > 0) || action_counter >= 10/*<-arbitrary*/)
2328	    {
2329	      o->newline() << "c->actionremaining -= " << action_counter << ";";
2330	      o->newline() << "if (unlikely (c->actionremaining <= 0)) {";
2331	      o->newline(1) << "c->last_error = \"MAXACTION exceeded\";";
2332	      o->newline() << "c->last_stmt = " << lex_cast_qstring(*tok) << ";";
2333	      o->newline() << "goto out;";
2334	      o->newline(-1) << "}";
2335	      action_counter = 0;
2336	    }

Introduced by:

commit 54975cd8f5844cca2cc0a32ab26fd2b546a81984
Author: Josh Stone <jistone@redhat.com>
Date:   Fri Mar 5 17:24:41 2010 -0800

    PR11342: Ensure that last_stmt is always set
    
    * translate.cxx (c_unparser::record_actions): Take a token to fill in
      last_stmt for MAXACTION errors.
      (c_unparser::*): Pass a token to record_actions.
      (c_unparser::visit_try_block): The try/catch blocks are also basic
      blocks, so they should flush record_actions when done.
      (c_unparser_assignment::c_assignop): Set last_stmt for /= and %=.
      (c_unparser::visit_foreach_loop): Set last_stmt for agg. overflow.
      (c_unparser::visit_print_format): Set last_stmt for %m errors.
Comment 3 Josh Stone 2010-03-06 21:26:04 UTC
We really should assert that ->tok is always set, which we sometimes have
forgotten in manually synthesized elements.  I've fixed those I found in SDT,
but I also made record_actions lenient for now.

commit 63ea4244cd378a756017c66085bdf45d4ce95df4