Further to bug #5686, code such as this should also trigger the automatic global value printing logic: probe foo { var ++ } global var but it doesn't, because the tree analysis visitor used for this purpose treats the "++" operation as both read and write of the variable. This may be true for locking or some other purposes, but given that the value is thrown away, it really is just a "write". Therefore it should trigger automagic printing.
Created attachment 4556 [details] simple patch Simple patch. Regards ++/-- as pure writes. $stap -e 'global var probe begin{ var++ exit() }' var=0x1
commit 982b88bd950381434a8325e550eff9df59f59dcd
Committed. FWIW, I wonder why we didn't just go this route earlier. I worry that pretending that ++ only writes to its argument may mess up some optimization, I can't think of one right now. We'll keep this special case in mind.
This caused a regression as noted in PR11590. I added a new test for that case, and extracted the testcase for this PR into its own test, that now xfails. commit d1cf0b627cfc9002b54c2ad40507de9ee6a2ce90 Author: Mark Wielaard <mjw@redhat.com> Date: Wed May 12 14:39:55 2010 +0200 Add testcase for PR11590 optimized pre/postfix operators * testsuite/systemtap.base/prepost_optim.stp: New test. * testsuite/systemtap.base/prepost_optim.exp: Harness. commit ad8c38ffc62200d6ba6ebc0b0339967cdd84669a Author: Mark Wielaard <mjw@redhat.com> Date: Wed May 12 14:31:59 2010 +0200 Add testcase for PR6954 ++ operations not triggering automatic global printing * testsuite/systemtap.base/global_end_var.stp: New test. * testsuite/systemtap.base/global_end.exp: Add xfail. commit 980759f3da625c5697440efc1bfda046c247866e Author: Mark Wielaard <mjw@redhat.com> Date: Wed May 12 14:16:26 2010 +0200 PR11590 Revert "PR6954: make ++/-- operation trigger automatic global printing" This reverts commit 982b88bd950381434a8325e550eff9df59f59dcd. Pretending that pre/postfix ++/-- operations only writes to its argument messes up the optimization of a variable that is assigned and then only used through post/prefix ++/--operators.
We should include other forms of RMW too, like: probe foo { var *= 2 } global var
*** Bug 11854 has been marked as a duplicate of this bug. ***
This seems to fix it: * staptree.h (varuse_collecting_visitor): Add readmodwrite. * staptree.cxx (varuse_collecting_visitor::visit_symbol): Set readmodwrite. * elaborate.cxx (add_global_var_display): Take into account readmodwrite. diff --git a/elaborate.cxx b/elaborate.cxx @@ -1357,4 +1357,5 @@ void add_global_var_display (systemtap_session& s) vardecl* l = s.globals[g]; - if (vut.read.find (l) != vut.read.end() - || vut.written.find (l) == vut.written.end()) + if ((vut.read.find (l) != vut.read.end() + || vut.written.find (l) == vut.written.end()) + && vut.readmodwrite.find (l) == vut.readmodwrite.end()) continue; diff --git a/staptree.cxx b/staptree.cxx index 37416e3..0056795 100644 --- a/staptree.cxx +++ b/staptree.cxx @@ -2049,3 +2049,3 @@ varuse_collecting_visitor::visit_symbol (symbol *e) - if (current_lvalue == e || current_lrvalue == e) + if (current_lvalue == e) { @@ -2054,2 +2054,7 @@ varuse_collecting_visitor::visit_symbol (symbol *e) } + else if (current_lrvalue == e) + { + written.insert (e->referent); + readmodwrite.insert (e->referent); + } if (current_lvalue != e || current_lrvalue == e) diff --git a/staptree.h b/staptree.h index 647aa0e..9680b55 100644 --- a/staptree.h +++ b/staptree.h @@ -836,2 +836,3 @@ struct varuse_collecting_visitor: public functioncall_traversing_visitor std::set<vardecl*> read; + std::set<vardecl*> readmodwrite; std::set<vardecl*> written;
I think we need an extension to the varuse_collecting_visitor that communicates whether the current expression is being evaluated for side-effects only, or whether its value is being used as a real rvalue. So, for a 'global g', (a) probe foo { println (g ++)) } (b) probe bar { g ++ } only (b) should result in g auto-printing. It doesn't relate exactly to whether the ++ was read-modify-write. Also remember stap -u (unoptimized mode) should probably work identically with the global auto-printing, which means that at translate time, we can't assume that the only remaining expression nodes are useful-or-sideeffecty.
Created attachment 4935 [details] initial patch
attachment doesn't answer #8 but answers the initial issue. Posted for future reference.
Created attachment 5819 [details] detect unread rw global vars
Attachment #11 [details] detects gs1 = gs1 + 2; gs2 += 2; ++ gs3; gs4 ++;
Hmm, didn't realize bugzilla would do that, I actually meant attachment 5819 [details]
It's a good start, but my guess is more & different code will be needed. lvalue_read is probably the right sort of conditional, but it needs to be protected against nesting - like last_lvalue etc. are. foo = (baz ++) & (noo = (zoo ++)) or such crazy stuff. Assignment and printing are not that special cases for using the results of an A++ type operation. Anything that reads the result from a read-modify-write is relevant, e.g. foo = bar + baz++; factorial (baz++);
Can unary expressions, that are components of the rhs of an assignment, be candidates for global display? Seems like they will always candidates for getting their value read. Statement expressions where the expression is a unary expression, seem like the primary candidates for global display. stmt := assgn | for | if | return | while | printf | expr assgn := operand assgnop expr for := for (expr; expr; expr) stmt if = if (expr) stmt [ else stmt ] return expr while (expr) stmt printf (operand[, operand, ...]) expr := terexpr | binexpr | unexpr1 | unexpr2 terexpr := cond ? expr : expr binexpr := operand binop operand unexpr1 := unop operand unexpr2 := operand unop operand := funcall | variable | numlit funcall := fn ([ arg1, arg2, ... ])
OK, the code still looks a bit unclear to me as to why it avoids the earlier optimizer problems (bug #11590), or whether it may create new ones. Perhaps some more comments & tests could help clarify. For example: the varuse_collecting_visitor::visit_symbol simply must produce correct results in the read/write sets, for the script to be translated correctly. It would be good to spell out a variety of before/after-patch cases as comment blocks in the code or in email. As to the (symbol*)current_lvalue construct, it would be much better not to rely on that. (There's no explanation why that's even used, and I'm worried it hides a problem.) Or if one absolutely must, at least type-check it at run time with something like assert(dynamic_cast<symbol*>(current_lvalue)); The test cases should include cases such as println(foo++) # no printing script_function(bar++) # no printing
11590 patch ++x x++ changes use of current_lvalue to instead use current_lrvalue (in visit_pre_crement, visit_post_crement) 6954 patch ++x x++ no change to current_lvalue and current_lrvalue instead adds current_last_lvalue to assist setting read/write sets The purpose of the following check: (current_lvalue == 0 || ((symbol*)current_lvalue)->referent != e->referent) is to check "is this the same as the lhs symbol?" to catch situations like: x += 1; x = x + 1; This is in reference to comment 5 and is indeed a bit baroque. Here are a few comparisons of read/written sets before and after the 6954 patch: gs1 = gs1 + 2; gs1 read set: gs0 gs1 written set: gs0_save gs1 temp - gs1 read set: gs0 gs1 + gs1 read set: gs0 gs1 written set: gs0_save gs1 temp gs2 += 2; - gs2 read set: gs0 gs1 gs2 + gs2 read set: gs0 gs2 written set: gs0_save gs1 gs2 temp ++ gs3; - gs3 read set: gs0 gs1 gs2 gs3 + gs3 read set: gs0 gs3 written set: gs0_save gs1 gs2 gs3 temp gs4 ++; - gs4 read set: gs0 gs1 gs2 gs3 gs4 + gs4 read set: gs0 gs4 written set: gs0_save gs1 gs2 gs3 gs4 temp gs5 ++; - gs5 read set: gs0 gs1 gs2 gs3 gs4 gs5 + gs5 read set: gs0 gs5 written set: gs0_save gs1 gs2 gs3 gs4 gs5 temp gs6 = gs6 + temp; - gs6 read set: gs0 gs1 gs2 gs3 gs4 gs5 + gs6 read set: gs0 gs6 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp - gs6 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 + gs6 read set: gs0 gs6 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp - temp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 temp + temp read set: gs0 temp temp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp printf ("gs7=%d ",fna(gs7)) - gs7 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 temp + gs7 read set: gs0 gs7 temp gs7 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp - a read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 a temp + a read set: gs0 gs7 a temp a written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 temp gtemp = (gs8++) & (gtemp = (gs9++)) - gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 a temp + gtemp read set: gs0 gs7 a temp gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gtemp temp - gs8 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 a temp + gs8 read set: gs0 gs7 gs8 a temp gs8 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gtemp temp - gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 a temp + gtemp read set: gs0 gs7 gs8 a temp gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gtemp temp - gs9 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 a temp + gs9 read set: gs0 gs7 gs8 gs9 a temp gs9 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp ls0 = gtemp + gs10++; - ls0 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 a temp + ls0 read set: gs0 gs7 gs8 gs9 a temp ls0 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp ls0 - gtemp read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gtemp a temp + gtemp read set: gs0 gs7 gs8 gs9 gtemp a temp gtemp written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gtemp temp ls0 - gs10 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gs10 gtemp a temp + gs10 read set: gs0 gs7 gs8 gs9 gs10 gtemp a temp gs10 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gs10 gtemp temp ls0 println(gs11++) - gs11 read set: gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 gs10 gs11 gtemp a temp + gs11 read set: gs0 gs7 gs8 gs9 gs10 gs11 gtemp a temp gs11 written set: gs0_save gs1 gs2 gs3 gs4 gs5 gs6 gs8 gs9 gs10 gs11 gtemp temp ls0
commit: 9d8a6ba36b ++x x++ x+=N are now supported