This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Updated patch adding line number enumeration support to statement probe
- From: Jonathan Lebon <jlebon at redhat dot com>
- To: BR Chrisman <brchrisman at gmail dot com>
- Cc: systemtap at sourceware dot org
- Date: Tue, 3 Jun 2014 14:44:20 -0400 (EDT)
- Subject: Re: Updated patch adding line number enumeration support to statement probe
- Authentication-results: sourceware.org; auth=none
- References: <CAN4=B2kFBfT3WKsyaVshcco-2=7dikggEGt0O7X=e+J3E_qGAw at mail dot gmail dot com> <1844543191 dot 19591707 dot 1401808203426 dot JavaMail dot zimbra at redhat dot com> <CAN4=B2=7Nofy4cGu+Vo2+yG4PBOFfrHRw_6570DjvTVGWAby7Q at mail dot gmail dot com>
Hi Brian,
> I'll verify my mailing before sending up the next patch. I've had
> issues with this before, my apologies.
No worries.
> I cleaned up all the RANGE stuff and will include in another patch.
Great!
> > You seed the linenos vector with two '0' elements, but then never add
> > the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> > the user entered lineno 0. I suspect that's what's causing all the failures
> > in statement.exp.
>
> The seed values were intended to mimic the removed line:
> linenos[0] = linenos[1] = 0;
>
> I tried testing with those seeds removed, which failed as well, but
> I'm confused about how/where ABSOLUTE/RELATIVE specifications are
> parsed.
The lines that were removed which took care of this were these:
- dash_pos = spec.find('-', line_pos + 1);
- if (dash_pos == string::npos)
- linenos[0] = linenos[1] = lex_cast<int>(spec.substr(line_pos + 1));
> The two statement.exp test cases which are failing are:
> FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got
> 0)
> FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
> ...
> === Summary ===
> # of expected passes 32
> # of unexpected failures 2
Strange, I had the following failures when applying your patch:
Running /home/yyz/jlebon/codebase/systemtap/systemtap/testsuite/systemtap.base/statement.exp ...
FAIL: statement (ABSOLUTE - single func - expected 1 probes, got 0)
FAIL: statement (ABSOLUTE - wild func - expected 1 probes, got 0)
FAIL: statement (RELATIVE - single func - expected 1 probes, got 0)
FAIL: statement (RELATIVE - wild func - expected 2 probes, got 1)
FAIL: statement (ABSOLUTE - error for no lines - single func - no semantic error)
> > For parsing the spec, I think it would be easier to tokenize on commas,
> > and then check the resulting tokens one by one for either a single lineno
> > or a range (maybe even factor that part out into its own function). There's
> > tokenize() in util.cxx which is useful for this.
>
> I noticed that I misinterpreted what the code was doing with the range
> before.
> This patch simply translates it into an enumeration.
> Even using binary_search on an enumeration, a search of a large range
> would be inefficient.
So what I meant here was that it might be easier to parse the a,b-c,d,e,...
by doing something like this (pseudo-code):
vector<string> tokens;
tokenize("a,b-c,d,e", tokens, ","); // available from util.h
for (token in tokens) {
if token is single number
add single number to linenos vector
if token is a range
for each number in range
add number to linenos vector
}
Regarding binary searching, I think that's OK. It's still pretty
fast. :)
> I'll have to reformulate this. I'm not sure it helps to go as far as
> interval tree stuff, but tracking ranges properly, rather than
> smashing them into enumerations, makes sense to me.
> Yeah, it seems like we need to check against a series of items which
> could be either individual line numbers or ranges. This will also
> solve the output case as we can just parrot out the items.
> I kind-of see where this can change. Enumerating the range explicitly
> was my gut response, but it could be a surprise performance reduction
> for a large range in a large file of source.
> I'll try to get this all cleaned up in the next day or two.
I think it would be easier to keep it simple and let the vector just hold
the individual linenos (like you originally designed). Since we'll need to
sort the final vector after all the parsing is done anyway (for later
binary searching if necessary), we can easily print it out in the
"a,b-c,d,e,..." form by going through the vector and detecting jumps >1.
> thanks for your feedback,
> Brian
Looking forward to your next patches!
Cheers,
Jonathan