10466 – stap -L misrepresents variables available to multi-probe aliases

Bug 10466 - stap -L misrepresents variables available to multi-probe aliases

Summary: stap -L misrepresents variables available to multi-probe aliases

Status:	RESOLVED FIXED

Alias:	None

Product:	systemtap
Classification:	Unclassified
Component:	tapsets (show other bugs)
Version:	unspecified

Importance:	P2 normal
Target Milestone:	---
Assignee:	Unassigned

URL:
Keywords:

Depends on:
Blocks:

Reported:	2009-07-31 01:04 UTC by Josh Stone
Modified:	2009-11-04 19:21 UTC (History)
CC List:	0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
patch (743 bytes, patch) 2009-10-29 08:53 UTC, Wenji Huang	Details \| Diff
Updated patch (2.35 KB, patch) 2009-11-03 05:37 UTC, Wenji Huang	Details \| Diff
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Josh Stone 2009-07-31 01:04:25 UTC

The listing mode attempts to clean up the output by only printing probe points
once, even if that name is an alias for several probe points underneath. 
However, -L only considers the variables from the first underlying probe point,
which is an issue if the multiple probe points don't have the same available
variables.

Usually the tapset writers try to define the same local variables for each
branch, but the recent addition of dwarf variables to -L exposes the difference
in the raw probe points.

$ stap -L signal.send
signal.send name:string shared:long sig:long task:long sinfo:long
send2queue:long sig_name:string sig_pid:long pid_name:string si_code:string
$sig:int $info:struct siginfo* $t:struct task_struct* $group:int $pending:struct
sigpending* $q:struct sigqueue*

Some of the $vars are only present in one part of signal.send, so they're not
actually usable in a signal.send probe body.

Rather than showing the variables of just one branch, -L should show the
set-intersection of each branch, so that every variable listed is actually useful.

Comment 1 Wenji Huang 2009-10-29 08:53:42 UTC

Created attachment 4336 [details]
patch

It's difficult to discriminate between multi probes like signal.* and one name
with several expanded probe alias like signal.send. Moreover the variable list
is in string stream, to make intersection is a little arduous.

Maybe the following way can be a good choice.

$ ./stap -L signal.send
signal.send /* kernel.function("send_signal@kernel/signal.c:816") */
name:string shared:long sig:long task:long sinfo:long send2queue:long
sig_name:string sig_pid:long pid_name:string si_code:string $sig:int
$info:struct siginfo* $t:struct task_struct* $group:int $pending:struct
sigpending*

signal.send /* kernel.function("send_sigqueue@kernel/signal.c:1309") */ 
name:string task:long sig:long sinfo:long shared:long send2queue:long
sig_name:string sig_pid:long pid_name:string si_code:string $q:struct sigqueue*
$t:struct task_struct* $group:int $sig:int $flags:long unsigned int

Comment 2 Josh Stone 2009-10-29 23:29:56 UTC

(In reply to comment #1)
> Created an attachment (id=4336)
> patch
> 
> It's difficult to discriminate between multi probes like signal.* and one name
> with several expanded probe alias like signal.send. Moreover the variable list
> is in string stream, to make intersection is a little arduous.

It doesn't have to stay as a string stream though.  The probes could instead
keep a set<string> of "name:type", to be reconciled at the higher level.

> Maybe [listing expansions separately] can be a good choice.

I don't think it's right to make the user figure out the common variables...

Comment 3 Wenji Huang 2009-11-03 05:37:06 UTC

Created attachment 4353 [details]
Updated patch

As jistone suggested, keep the variables/arguments as set<string>
of "name:type" and make set-intersection before printing.

$ ./stap -L signal.send
signal.send name:string pid_name:string send2queue:long shared:long
si_code:string sig:long sig_name:string sig_pid:long sinfo:long task:long
$group:int $sig:int $t:struct task_struct*

Now, only those items existing in all branch will be printed.

Comment 4 Josh Stone 2009-11-03 19:42:30 UTC

(In reply to comment #3)
> Created an attachment (id=4353)
> Updated patch

Great, that's exactly the output I was hoping for!  I have some comments for
cleaning up the code, and then please go ahead and commit this.

> map<string,set<unsigned>&> probe_list;

The reference value here seems dubious, and I also wonder why not just keep the
probe pointer directly instead of indexes?

  map<string, set<derived_probe*> > probe_list;

You can also take advantage that the default set constructor is an empty set, so
this:

> if (probe_list.find (pp) == probe_list.end())
>   {
>     set<unsigned> *probe_id = new set<unsigned>();
>     (*probe_id).insert(i);
>     probe_list.insert(pair<string,set<unsigned>&>(pp,*probe_id));
>   }
> else
>   probe_list.find(pp)->second.insert(i);

becomes:

  probe_list[pp].insert(p);

This same idiom will work for updating the counts in var_list and arg_list.

  var_list[tmps.str()]++;
...
  arg_list[*ia]++;

> set<string> *arg_set = new set<string>();
> p->getargs(*arg_set);

There's no reason for this to be heap allocated, which you're then leaking. 
Just create it directly:

  set<string> arg_set;
  p->getargs(arg_set);

>         var_list.clear();
>         arg_list.clear();
>       }
>     o << endl;
>   }
> probe_list.clear();

These clear() calls are redundant, as the maps are already going out of scope
there and will be deconstructed.

> dwarf_derived_probe::getargs(std::set<std::string> &arg_set) const
> {
>   for (set<string>::iterator it = args.begin(); it != args.end(); it++)
>     arg_set.insert(*it);
> }

Note that set::insert can take iterators:

  arg_set.insert(args.begin(), args.end());


Thanks!

Comment 5 Wenji Huang 2009-11-04 03:09:46 UTC

Thanks for the comments.

commit	c39cdd5565f718302057242bbfe50e71b69c4f4d

Comment 6 Josh Stone 2009-11-04 19:21:25 UTC

Fix confirmed.