Trying the following script:
    #!/usr/bin/stap
    # usage: task-histogram.stap process_name
    global hist
    probe process.mark("reactor_run_tasks_single_start") {
        ++hist[tid(), $arg1]
    }
    probe end {
        foreach ([tid, addr] in hist) {
            printf("%10d %8d 0x%x\n", hist[tid, addr], tid, addr)
        }
    }
(trying to collect a histogram of tasks)
With this command line:
    stap --dyninst -x $(pgrep -x httpd) ./debug/task-histogram.stap
Crashes with
    WARNING: /usr/bin/stapdyn exited with signal: 11 (Segmentation fault)
systemtap-4.1-1.fc30.x86_64
Hi, Avi, sorry for not noticing this earlier. Some questions to assist in the local reproduction of this problem:
- arch: x86-64?
- does the same script work in lkm (non-dyninst) mode?
- have you tried stap -p4 --dyninst FOO.stp ; gdb -args stapdyn FOO.so so as to get a gdb backtrace at the crash site?
- what level of traffic is the httpd process absorbing during this time? (thus: how many thread / child-process changes?)
- have you tried targeting a program other than this httpd?
Sorry for noticing _your_ comment so late. I retested with systemtap-4.1-2.fc30.x86_64, and it appears to work. (x86_64, don't remember if I tried lkm, httpd has no forks/pthread_creates at all)
The performance impact is horrendous however, 5X slower (251k req/sec without the script, 47k with the script). Does dyninst rewrite the entire program or just the entry points to the probe?
And now I get segmentation faults again.
#0 int_process::removeAllBreakpoints (this=0x55bf9c39eab8) at /usr/include/c++/9/bits/stl_tree.h:208
#1 0x00007f0c1737589f in linux_process::preTerminate (this=0x55bf9c39e820) at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/proccontrol/src/linux.C:1740
#2 0x00007f0c1733a4d1 in Dyninst::ProcControlAPI::ProcessSet::terminate (this=0x55bfb12f09e0) at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/proccontrol/src/procset.C:1644
#3 0x00007f0c172d95db in Dyninst::ProcControlAPI::Process::terminate (this=<optimized out>) at /usr/include/boost/smart_ptr/shared_ptr.hpp:732
#4 0x00007f0c17da7ab5 in PCProcess::terminateProcess (this=0x55bfa0f37cb0) at /usr/include/boost/smart_ptr/shared_ptr.hpp:732
#5 PCProcess::terminateProcess (this=0x55bfa0f37cb0) at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/dyninstAPI/src/dynProcess.C:1027
#6 0x00007f0c17db377c in PCProcess::attachProcess (progpath=..., pid=16581, analysisMode=BPatch_normalMode) at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/dyninstAPI/src/dynProcess.C:162
#7 0x00007f0c17cf269a in BPatch_process::BPatch_process(char const*, int, BPatch_hybridMode) () at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/dyninstAPI/src/BPatch_process.C:328
#8 0x00007f0c17ccf027 in BPatch::processAttach (this=<optimized out>, path=0x0, pid=16581, mode=BPatch_normalMode) at /usr/src/debug/dyninst-10.0.0-7.fc30.x86_64/dyninst-10.0.0/dyninstAPI/src/BPatch.C:1260
#9 0x000055bf9abb4519 in ?? ()
#10 0x000055bf9c362440 in ?? ()
#11 0x00007ffd51a63ec0 in ?? ()
#12 0x00000000000040c5 in probe_13397 ()
#13 0x0000000000000000 in ?? ()
I think the trigger for the crash is re-attaching to a process after detaching from it.
And the cause for the slowness is lock contention. With only two threads. Please please please add thread-local storage to the language.
(In reply to Avi Kivity from comment #6)
> And the cause for the slowness is lock contention. With only two threads.
> Please please please add thread-local storage to the language.
It's more of a runtime issue than a language issue, but yeah. Surely there are some optimization opportunities in what we emit for:
    probe process.mark("reactor_run_tasks_single_start") { ++hist[tid(), $arg1] }
I imagine that if you notice that a key component is always tid() (except in an end probe), then you can rewrite the global map as a thread-local map with extra magic for the end probe. But it seems fragile: as soon as you violate one of the constraints even a tiny bit, it stops working with no feedback to the user about what went wrong. And when it stops working, it's likely to have a huge impact on the running workload.
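To make the idea concrete, here is a rough C++ sketch of a per-thread histogram that is merged once for the end probe. It is only an illustration of the technique: `Shard`, `Merged`, `collect` and the two task addresses are made-up names and values, not SystemTap or Dyninst runtime code, and whether the real runtime could be restructured this way is an open question.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// All names here are hypothetical; this is not stapdyn runtime code.
using Key = std::pair<int, std::uintptr_t>;      // (thread id, task address)
using Histogram = std::map<Key, std::uint64_t>;

// Shared result, touched only when a thread flushes.
struct Merged {
    std::mutex lock;
    Histogram hist;
};

// One shard per thread: the hot path takes no lock and shares no state.
class Shard {
public:
    Shard(int tid, Merged& out) : tid_(tid), out_(out) {}

    // Equivalent of ++hist[tid(), $arg1] for the owning thread.
    void hit(std::uintptr_t addr) { ++counts_[addr]; }

    // The "extra magic": fold local counts into the shared map once.
    void flush() {
        std::lock_guard<std::mutex> guard(out_.lock);
        for (const auto& entry : counts_)
            out_.hist[{tid_, entry.first}] += entry.second;
        counts_.clear();
    }

private:
    int tid_;
    Merged& out_;
    std::map<std::uintptr_t, std::uint64_t> counts_;
};

// Runs nthreads workers, each recording `iterations` hits that alternate
// between two made-up task addresses, and returns the merged histogram.
Histogram collect(int nthreads, int iterations) {
    Merged merged;
    std::vector<std::thread> threads;
    for (int tid = 1; tid <= nthreads; ++tid) {
        threads.emplace_back([tid, iterations, &merged] {
            Shard shard(tid, merged);
            for (int i = 0; i < iterations; ++i)
                shard.hit(0x1000 + (i % 2) * 0x10);
            shard.flush();
        });
    }
    for (auto& t : threads) t.join();
    return merged.hist;
}
```

The lock is taken once per thread rather than once per probe hit, which is where the contention would go away. The fragility mentioned above shows up here too: the scheme only holds as long as no other probe reads or writes another thread's entries before the flush.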
Is there more information I can supply to help fix the segmentation fault?
Stan might be able to help with the dyninst segv up in comment #4. OTOH, there is a dyninst 10.1 build in stable updates, which would be worth retesting against.
Created attachment 12120: dyninst static marker test
I'll try with httpd; meanwhile a synthetic looping test using static markers seems to work fine:
    stap --dyninst -x $(pgrep -x tstgetline.x) ./tstgetline.stp
    3 29151 0x20a1260
with:
    dyninst-10.1.0-4.fc30.x86_64
    systemtap-4.2-1.fc30.x86_64
| I think the trigger for the crash is re-attaching to a process after detaching from it.
That sounds similar to bug 23513.
The httpd in question is not Apache httpd. I can provide a binary (and source of course) if needed. Meanwhile I'm following the detach bug.
And please^19, do make it possible to attach probes to tracepoints that are hit with very high frequency.
> I can provide a binary (and source of course)
Yes please, that would be helpful.