This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
[Bug runtime/19799] New: deleting from array of aggregate unreliable
- From: "raeburn at permabit dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sourceware dot org
- Date: Wed, 09 Mar 2016 17:21:08 +0000
- Subject: [Bug runtime/19799] New: deleting from array of aggregate unreliable
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=19799
Bug ID: 19799
Summary: deleting from array of aggregate unreliable
Product: systemtap
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: runtime
Assignee: systemtap at sourceware dot org
Reporter: raeburn at permabit dot com
Target Milestone: ---
I've got a SystemTap script that updates entries in an array of aggregates and
occasionally deletes entries, but the deletion doesn't seem to work reliably.
I'm using a function probe that collects some timing data and updates stats in
an array. Periodically (with a "timer.ms" probe) we pick a range of indices and
print out the values accumulated so far, and (try to) clear them.
  global stats[500000];
  probe module(...).function(...) {
    ...stats[a,1] <<< value1;...stats[a,2] <<< value2;...etc...
  }
  probe timer.ms(NNN) {
    for (...) {
      printf(...stats[x,y]...);
      delete stats[x,y];
    }
  }
As I understand it, the delete should get rid of the array entry, effectively
resetting the counter for the key-pair to zero. What I'm seeing instead is that
often the array entry doesn't get deleted; if I use:
  delete stats[thisIndex,1];
  if ([thisIndex,1] in stats) {
    printf("eek! [%d,1] in stats after deletion??\n",
           thisIndex);
  }
then the error message fires pretty often, but not always, with my script. And
the values output are clearly continuing to accumulate data from one report to
the next.
This happens with "version 2.7/0.161, rpm 2.7-2.el6" on RHEL6, version 2.9 from
the web site, and git rev d3aa622.
Looking at pmap-gen.c in git (which could use a few more comments maybe?), it
looks to me like the data is stored in per-CPU maps, and collected from all of
them when read out, but _stp_pmap_del appears to operate only on the per-CPU
map for the current CPU.
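To illustrate the suspected behavior, here is a toy model in plain C. It is not SystemTap code: NCPUS, NKEYS, and every toy_* name are made up for this sketch, and it ignores hashing and locking entirely. It only assumes what is described above, namely that updates go into the map of the CPU they run on, that reads merge all CPUs, and that the delete touches just one CPU's map.

```c
#include <stdbool.h>

/* Toy model only: NCPUS, NKEYS and every toy_* name are hypothetical
   and are not taken from pmap-gen.c. */
#define NCPUS 4
#define NKEYS 8

struct toy_stat { bool used; long count; long sum; };
struct toy_pmap { struct toy_stat percpu[NCPUS][NKEYS]; };

/* Models "stats[key] <<< value" running on `cpu`: only that CPU's map changes. */
static void toy_add(struct toy_pmap *p, int cpu, int key, long value)
{
    struct toy_stat *s = &p->percpu[cpu][key];
    s->used = true;
    s->count++;
    s->sum += value;
}

/* Models "[key] in stats": the key exists if any CPU's map holds it. */
static bool toy_contains(const struct toy_pmap *p, int key)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (p->percpu[cpu][key].used)
            return true;
    return false;
}

/* Models reading the aggregate: sums are merged across all CPUs. */
static long toy_sum(const struct toy_pmap *p, int key)
{
    long total = 0;
    for (int cpu = 0; cpu < NCPUS; cpu++)
        total += p->percpu[cpu][key].sum;
    return total;
}

/* Models a delete that clears only the map of the CPU it runs on. */
static void toy_del_current_cpu(struct toy_pmap *p, int cpu, int key)
{
    p->percpu[cpu][key] = (struct toy_stat){0};
}

/* Probe hits land on CPUs 0 and 2; the timer probe deletes on CPU 1.
   Returns the sum still visible for the key after the delete. */
static long toy_demo_sum_after_delete(void)
{
    struct toy_pmap p = {0};
    toy_add(&p, 0, 3, 10);
    toy_add(&p, 2, 3, 20);
    toy_del_current_cpu(&p, 1, 3);
    return toy_contains(&p, 3) ? toy_sum(&p, 3) : 0;
}
```

In this model, toy_demo_sum_after_delete() still sees the key and its full sum, because the delete ran on a CPU that never held the entry. That would match the "fires pretty often, but not always" symptom: the delete works only when the timer probe happens to run on the one CPU that holds all of the data for that key.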
A quick experiment putting a for_each_possible_cpu loop into _stp_pmap_del
seems to fix the problem for me, on initial testing; the error message above
doesn't fire, and the counters reported are often smaller than on the previous
iteration. I won't bother sending my patch, as it seems to be functional but
isn't very good -- it recomputes the hash value for every per-CPU map, and I
overlooked the aggregate map, but I assume the entry should probably be removed
there too.
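For what a cleaner version of that experiment might look like, here is a sketch in plain C against a toy data structure. It is not a patch for pmap-gen.c: the fix_* names, NCPUS, NBUCKETS, the hash function, and the one-entry-per-bucket layout are all invented for illustration, and real code would need the runtime's own locking and its own iteration over possible CPUs. The sketch only shows the two points mentioned above: hash the key once, then remove the entry from every per-CPU map and from the aggregate map.

```c
#include <stdbool.h>

/* Toy model only: all names and sizes here are hypothetical. */
#define NCPUS 4
#define NBUCKETS 16

struct fix_entry { bool used; long key1, key2; long count, sum; };
struct fix_map { struct fix_entry bucket[NBUCKETS]; }; /* one entry per bucket, no chaining */
struct fix_pmap { struct fix_map percpu[NCPUS]; struct fix_map agg; };

/* Hypothetical hash for a two-part key. */
static unsigned fix_hash(long key1, long key2)
{
    unsigned long h = (unsigned long)key1 * 31UL + (unsigned long)key2;
    return (unsigned)(h % NBUCKETS);
}

/* Add a value to one map (a colliding key simply overwrites, since this is a toy). */
static void fix_map_add(struct fix_map *m, long key1, long key2, long value)
{
    struct fix_entry *e = &m->bucket[fix_hash(key1, key2)];
    if (!e->used || e->key1 != key1 || e->key2 != key2)
        *e = (struct fix_entry){ .used = true, .key1 = key1, .key2 = key2 };
    e->count++;
    e->sum += value;
}

static bool fix_map_contains(const struct fix_map *m, unsigned h, long key1, long key2)
{
    const struct fix_entry *e = &m->bucket[h];
    return e->used && e->key1 == key1 && e->key2 == key2;
}

/* Remove the key from one map, using a hash computed by the caller. */
static void fix_map_del_hashed(struct fix_map *m, unsigned h, long key1, long key2)
{
    if (fix_map_contains(m, h, key1, key2))
        m->bucket[h] = (struct fix_entry){0};
}

/* Delete from every per-CPU map and the aggregate map, hashing only once. */
static void fix_del_all_cpus(struct fix_pmap *p, long key1, long key2)
{
    unsigned h = fix_hash(key1, key2);
    for (int cpu = 0; cpu < NCPUS; cpu++)
        fix_map_del_hashed(&p->percpu[cpu], h, key1, key2);
    fix_map_del_hashed(&p->agg, h, key1, key2);
}

/* True if the key is present in any per-CPU map or in the aggregate map. */
static bool fix_contains(const struct fix_pmap *p, long key1, long key2)
{
    unsigned h = fix_hash(key1, key2);
    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (fix_map_contains(&p->percpu[cpu], h, key1, key2))
            return true;
    return fix_map_contains(&p->agg, h, key1, key2);
}

/* Returns true when the deleted key is gone everywhere and a neighbor key survives. */
static bool fix_demo(void)
{
    struct fix_pmap p = {0};
    fix_map_add(&p.percpu[0], 7, 1, 10);
    fix_map_add(&p.percpu[3], 7, 1, 20);
    fix_map_add(&p.percpu[3], 7, 2, 5);
    fix_map_add(&p.agg, 7, 1, 30);   /* stands in for a result merged by an earlier read */
    fix_del_all_cpus(&p, 7, 1);
    return !fix_contains(&p, 7, 1) && fix_contains(&p, 7, 2);
}
```

Passing the precomputed hash into the per-map delete is the design choice that avoids rehashing for each CPU. Whether the real maps allow a lookup by precomputed hash is an assumption here.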