Bug 23598 - The function duplication removal optimizer might lead to confusing source locations in runtime error messages
Summary: The function duplication removal optimizer might lead to confusing source loc...
Status: UNCONFIRMED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-30 22:49 UTC by agentzh
Modified: 2018-08-30 22:49 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description agentzh 2018-08-30 22:49:05 UTC
Consider the following minimal example derived from real world stap tools:

```
function foo(p) {
    printf("foo: %#x\n", @cast(p, "unsigned char", "/path/to/a.out")[0]);
}

function bar(p) {
    printf("bar: %#x\n", @cast(p, "unsigned char", "/path/to/a.out")[0]);
}

probe process.function("main") {
    bar(&@var("a"));
    foo(0);
    foo(&@var("a"));
    exit();
}
```

And the target C program is like this:

```
unsigned char a = 7;
int main(void) {
    return 0;
}
```

Running the stap script like this:

```
$ stap -c ./a.out a.stp
ERROR: read fault [man error::fault] at 0x0 near operator '@cast' at test.stp:6:26
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/stap-old/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]
```

Note the line number is 6 inside function bar() while the real read fault happens in function foo(). The function foo() and bar() might look completely different and the only reason for the line number messing-up is because they both have the same `@cast(...)` expression.

Disabling the optimization altogether "fixes" the issue but who doesn't want to enable optimizations online? But online read faults thus will become horribly undebuggable.

One workaround is to skip synthetic function definitions in the semantic_pass_opt6() C++ function in elaborate.cxx:

```
diff --git a/elaborate.cxx b/elaborate.cxx
index 2c08568ac..b7a0fcae2 100644
--- a/elaborate.cxx
+++ b/elaborate.cxx
@@ -5201,6 +5201,9 @@ void semantic_pass_opt6 (systemtap_session& s, bool& relaxed_p)
   for (map<string,functiondecl*>::iterator it = s.functions.begin(); it != s.functions.end(); it++)
     {
       functiondecl *fd = it->second;
+      if (fd->synthetic)
+        continue;
+
       string functionsig = get_functionsig(fd);

       if (functionsig_map.count(functionsig) == 0)
```

But I agree this workaround is more overkill than desired and we might want to avoid many duplications in synthetic functions.

So I'm just recording this problem for now. It has wasted me a couple of hours trying to pinpoint a read fault in a completely wrong place while debugging a very complicated stap tool. Alas.