Bug 6954 - ++ operations not triggering automatic global printing
Summary: ++ operations not triggering automatic global printing
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
: 11854 (view as bug list)
Depends on:
Blocks: 11695
  Show dependency treegraph
 
Reported: 2008-10-09 13:13 UTC by Frank Ch. Eigler
Modified: 2011-07-12 13:41 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
simple patch (317 bytes, patch)
2010-01-27 08:03 UTC, Wenji Huang
Details | Diff
initial patch (1.39 KB, patch)
2010-08-16 19:36 UTC, Stan Cox
Details | Diff
detect unread rw global vars (1.62 KB, patch)
2011-06-24 20:04 UTC, Stan Cox
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frank Ch. Eigler 2008-10-09 13:13:50 UTC
Further to bug #5686, code such as this should also trigger the automatic
global value printing logic:

  probe foo { var ++ }
  global var

but it doesn't, because the tree analysis visitor used for this purpose
treats the "++" operation as both read and write of the variable.  This
may be true for locking or some other purposes, but given that the value
is thrown away, it really is just a "write".  Therefore it should trigger
automagic printing.
Comment 1 Wenji Huang 2010-01-27 08:03:42 UTC
Created attachment 4556 [details]
simple patch

Simple patch. Regards ++/-- as pure writes.

$stap  -e 'global var probe begin{ var++  exit() }'
var=0x1
Comment 2 Wenji Huang 2010-01-28 06:26:09 UTC
commit 982b88bd950381434a8325e550eff9df59f59dcd
Comment 3 Frank Ch. Eigler 2010-03-01 01:39:40 UTC
Committed.
FWIW, I wonder why we didn't just go this route earlier.  I worry
that pretending that ++ only writes to its argument may mess up
some optimization, I can't think of one right now.  We'll keep
this special case in mind.
Comment 4 Mark Wielaard 2010-05-12 12:44:26 UTC
This caused a regression as noted in PR11590. I added a new test for that case,
and extracted the testcase for this PR into its own test, that now xfails.

commit d1cf0b627cfc9002b54c2ad40507de9ee6a2ce90
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 12 14:39:55 2010 +0200

    Add testcase for PR11590 optimized pre/postfix operators
    
    * testsuite/systemtap.base/prepost_optim.stp: New test.
    * testsuite/systemtap.base/prepost_optim.exp: Harness.

commit ad8c38ffc62200d6ba6ebc0b0339967cdd84669a
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 12 14:31:59 2010 +0200

    Add testcase for PR6954 ++ operations not triggering automatic global printing
    
    * testsuite/systemtap.base/global_end_var.stp: New test.
    * testsuite/systemtap.base/global_end.exp: Add xfail.

commit 980759f3da625c5697440efc1bfda046c247866e
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 12 14:16:26 2010 +0200

    PR11590 Revert "PR6954: make ++/-- operation trigger automatic global printing"
    
    This reverts commit 982b88bd950381434a8325e550eff9df59f59dcd.
    
    Pretending that pre/postfix ++/-- operations only writes to its argument
    messes up the optimization of a variable that is assigned and then
    only used through post/prefix ++/--operators.
Comment 5 Josh Stone 2010-05-12 17:16:27 UTC
We should include other forms of RMW too, like:

  probe foo { var *= 2 }
  global var
Comment 6 Frank Ch. Eigler 2010-07-29 12:15:38 UTC
*** Bug 11854 has been marked as a duplicate of this bug. ***
Comment 7 Stan Cox 2010-08-03 02:19:30 UTC
This seems to fix it:

* staptree.h (varuse_collecting_visitor): Add readmodwrite.
* staptree.cxx (varuse_collecting_visitor::visit_symbol): Set readmodwrite.
* elaborate.cxx (add_global_var_display): Take into account readmodwrite.

diff --git a/elaborate.cxx b/elaborate.cxx
@@ -1357,4 +1357,5 @@ void add_global_var_display (systemtap_session& s)
       vardecl* l = s.globals[g];
-      if (vut.read.find (l) != vut.read.end()
-          || vut.written.find (l) == vut.written.end())
+      if ((vut.read.find (l) != vut.read.end()
+	   || vut.written.find (l) == vut.written.end())
+	  && vut.readmodwrite.find (l) == vut.readmodwrite.end())
 	continue;
diff --git a/staptree.cxx b/staptree.cxx
index 37416e3..0056795 100644
--- a/staptree.cxx
+++ b/staptree.cxx
@@ -2049,3 +2049,3 @@ varuse_collecting_visitor::visit_symbol (symbol *e)
 
-  if (current_lvalue == e || current_lrvalue == e)
+  if (current_lvalue == e)
     {
@@ -2054,2 +2054,7 @@ varuse_collecting_visitor::visit_symbol (symbol *e)
     }
+  else if (current_lrvalue == e)
+    {
+      written.insert (e->referent);
+      readmodwrite.insert (e->referent);
+    }
   if (current_lvalue != e || current_lrvalue == e)
diff --git a/staptree.h b/staptree.h
index 647aa0e..9680b55 100644
--- a/staptree.h
+++ b/staptree.h
@@ -836,2 +836,3 @@ struct varuse_collecting_visitor: public
functioncall_traversing_visitor
   std::set<vardecl*> read;
+  std::set<vardecl*> readmodwrite;
   std::set<vardecl*> written;
Comment 8 Frank Ch. Eigler 2010-08-04 17:41:30 UTC
I think we need an extension to the varuse_collecting_visitor that
communicates whether the current expression is being evaluated for
side-effects only, or whether its value is being used as a real
rvalue.  So, for a 'global g', 

(a)   probe foo { println (g ++)) }
(b)   probe bar { g ++ }

only (b) should result in g auto-printing.  It doesn't relate exactly to
whether the ++ was read-modify-write.

Also remember stap -u (unoptimized mode) should probably
work identically with the global auto-printing, which means that
at translate time, we can't assume that the only remaining expression
nodes are useful-or-sideeffecty.
Comment 9 Stan Cox 2010-08-16 19:36:39 UTC
Created attachment 4935 [details]
initial patch
Comment 10 Stan Cox 2010-08-16 19:38:14 UTC
attachment doesn't answer #8 but answers the initial issue.  Posted for future
reference.
Comment 11 Stan Cox 2011-06-24 20:04:52 UTC
Created attachment 5819 [details]
detect unread rw global vars
Comment 12 Stan Cox 2011-06-24 20:06:06 UTC
Attachment #11 [details] detects 
 gs1 = gs1 + 2;
 gs2 += 2;
 ++ gs3;
 gs4 ++;
Comment 13 Stan Cox 2011-06-24 20:08:58 UTC
Hmm, didn't realize bugzilla would do that, I actually meant attachment 5819 [details]
Comment 14 Frank Ch. Eigler 2011-06-25 12:46:09 UTC
It's a good start, but my guess is more & different code
will be needed.

lvalue_read is probably the right sort of conditional, but
it needs to be protected against nesting - like last_lvalue
etc. are.

    foo = (baz ++) & (noo = (zoo ++))
or such crazy stuff.

Assignment and printing are not that special cases for using
the results of an A++ type operation.  Anything that reads the
result from a read-modify-write is relevant, e.g.

    foo = bar + baz++;
    factorial (baz++);
Comment 15 Stan Cox 2011-06-28 18:17:34 UTC
Can unary expressions, that are components of the rhs of an assignment, be candidates for global display?  Seems like they will always candidates for getting their value read.  Statement expressions where the expression is a unary expression, seem like the primary candidates for global display.
stmt := assgn | for | if | return | while | printf | expr
assgn := operand assgnop expr
for := for (expr; expr; expr) stmt
if = if (expr) stmt [ else stmt ]
return expr
while (expr) stmt
printf (operand[, operand, ...])

expr := terexpr | binexpr | unexpr1 | unexpr2
terexpr := cond ? expr : expr
binexpr := operand binop operand 
unexpr1 := unop operand
unexpr2 := operand unop
operand := funcall | variable | numlit
funcall := fn ([ arg1, arg2, ... ])
Comment 16 Frank Ch. Eigler 2011-07-08 11:18:48 UTC
OK, the code still looks a bit unclear to me as to why it avoids
the earlier optimizer problems (bug #11590), or whether it may
create new ones.  Perhaps some more comments & tests could help
clarify.

For example: the varuse_collecting_visitor::visit_symbol simply
must produce correct results in the read/write sets, for the script
to be translated correctly.  It would be good to spell out 
a variety of before/after-patch cases as comment blocks in the code
or in email.

As to the (symbol*)current_lvalue construct, it would be much better
not to rely on that.  (There's no explanation why that's even used, 
and I'm worried it hides a problem.)  Or if one absolutely must, at
least type-check it at run time with something like
   assert(dynamic_cast<symbol*>(current_lvalue));

The test cases should include cases such as

   println(foo++)          # no printing
   script_function(bar++)  # no printing
Comment 17 Stan Cox 2011-07-08 18:38:31 UTC
11590 patch
 ++x x++ changes use of current_lvalue to instead use current_lrvalue
     	 (in visit_pre_crement, visit_post_crement)
6954 patch
 ++x x++ no change to current_lvalue and current_lrvalue
     	 instead adds current_last_lvalue to assist setting read/write sets

The purpose of the following check:
 (current_lvalue == 0 || ((symbol*)current_lvalue)->referent != e->referent)
is to check "is this the same as the lhs symbol?" to catch situations like:
 x += 1; x = x + 1; 
This is in reference to comment 5 and is indeed a bit baroque.

Here are a few comparisons of read/written sets before and after the 6954 patch:

gs1 = gs1 + 2;
  gs1 read set: gs0 
  gs1 written set: gs0_save gs1 temp 
- gs1 read set: gs0 gs1 
+ gs1 read set: gs0 
  gs1 written set: gs0_save gs1 temp 
gs2 += 2;
- gs2 read set: gs0 gs1 gs2 
+ gs2 read set: gs0 
  gs2 written set: gs0_save gs1 gs2 temp 
++ gs3;
- gs3 read set: gs0 gs1 gs2 gs3 
+ gs3 read set: gs0 
  gs3 written set: gs0_save gs1 gs2 gs3 temp 
gs4 ++;
- gs4 read set: gs0 gs1 gs2 gs3 gs4 
+ gs4 read set: gs0 
  gs4 written set: gs0_save gs1 gs2 gs3 gs4 temp 
gs5 ++;
- gs5 read set: gs0 gs1 gs2 gs3 gs4 gs5 
+ gs5 read set: gs0 
  gs5 written set: gs0_save gs1 gs2 gs3 gs4 gs5 temp 
gs6 = gs6 + temp;
- gs6 read set: gs0 gs1 gs2 gs3 gs4 gs5 
+ gs6 read set: gs0 
  gs6 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp 
- gs6 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 
+ gs6 read set: gs0 
  gs6 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp 
- temp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 temp 
+ temp read set: gs0 temp 
  temp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp 
printf ("gs7=%d ",fna(gs7)) 
- gs7 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 temp 
+ gs7 read set: gs0 gs7 temp 
  gs7 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp 
- a read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 a temp 
+ a read set: gs0 gs7 a temp 
  a written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp 
gtemp = (gs8++) & (gtemp = (gs9++))
- gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 a temp 
+ gtemp read set: gs0 gs7 a temp 
  gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gtemp temp 
- gs8 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 a temp 
+ gs8 read set: gs0 gs7 gs8 a temp 
  gs8 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gtemp temp 
- gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 a temp 
+ gtemp read set: gs0 gs7 gs8 a temp 
  gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gtemp temp 
- gs9 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 a temp 
+ gs9 read set: gs0 gs7 gs8 gs9 a temp 
  gs9 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp 
ls0 = gtemp + gs10++;
- ls0 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 a temp 
+ ls0 read set: gs0 gs7 gs8 gs9 a temp 
  ls0 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp ls0 
- gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gtemp a temp 
+ gtemp read set: gs0 gs7 gs8 gs9 gtemp a temp 
  gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp ls0 
- gs10 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gs10 gtemp a temp 
+ gs10 read set: gs0 gs7 gs8 gs9 gs10 gtemp a temp 
  gs10 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gs10 gtemp temp ls0 
println(gs11++)
- gs11 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gs10 gs11 gtemp a temp 
+ gs11 read set: gs0 gs7 gs8 gs9 gs10 gs11 gtemp a temp 
  gs11 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gs10 gs11 gtemp temp ls0
Comment 18 Stan Cox 2011-07-12 13:41:57 UTC
commit: 9d8a6ba36b
++x x++ x+=N are now supported