Summary: | Dyninst runtime kills the target process when attaching to the target process for a second time | ||
---|---|---|---|
Product: | systemtap | Reporter: | agentzh <agentzh> |
Component: | dyninst | Assignee: | Unassigned <systemtap> |
Status: | ASSIGNED --- | ||
Severity: | normal | CC: | avi, fche, scox |
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | 2019-04-10 00:00:00 | |
Attachments: | stap-report output on the same box |
The second dyninst-based probe's failure is expected: there can be only one concurrent "mutator" for a given target process. It is not right that the process should be killed though, that sounds like a bug. But the first `stap` command run already exits before the second one starts (please note the `exit()` function call in the probe handler). So there is still only *one* concurrent mutator here. I notice if the first dyninst run is via stap: % stap -e 'probe process("/lib64/libc.so.6").function("usleep") { println("hi") exit() }' -x `pgrep -f tst.x` --dyninst -k -vv and if subsequent runs were directly via stapdyn then it worked okay: % stapdyn -v -x 8955 /tmp/staprclqlD/stap_8970.so stapdyn: found target "/usr/lib64/libc-2.28.so" in pid 8955, inserting 1 probes hi % stapdyn -v -x 8955 /tmp/staprclqlD/stap_8970.so stapdyn: found target "/usr/lib64/libc-2.28.so" in pid 8955, inserting 1 probes hi Running with DYNINST_DEBUG_STARTUP=1 it seems the problem is dyninst fails to parse the stap so (not sure why) and intentionally terminates the process. [140306531820032]image.C[1461]: opening file /tmp/stapSmTcdi/stap_2ecef162cf9d3c769f9c52398beab57b_1423.so [140306531820032]image.C[1226]: created image [140306531820032]image.C[1229]: processing shared object [140306531820032]image.C[1247]: error in processing, deleting image and returning [140306531820032]mapped_object.C[157]: failed to parseImage [140306531820032]dynProcess.C[618]: failed to create mapped object for library /tmp/stapSmTcdi/stap_2ecef162cf9d3c769f9c52398beab57b_1423.so [140306531820032]dynProcess.C[361]: bootstrap failed while creating mapped objects Failed to bootstrap process 9255: terminating... The problem seems intermittant but I do see it and the culprit is that the stap shared object is not always detached when stapdyn completes. (gdb session after failing stapdyn) (gdb) info sharedlibrary From To Syms Read Shared Object Library 0x00007f3bd7f64670 0x00007f3bd80afc3f Yes (*) /lib64/libc.so.6 0x00007f3bd8139110 0x00007f3bd81582b4 Yes (*) /lib64/ld-linux-x86-64.so.2 0x00007f3bd6eb2320 0x00007f3bd6eb4845 Yes (*) /usr/lib64/dyninst/libdyninstAPI_RT.so 0x00007f3bd6eab270 0x00007f3bd6eac039 Yes (*) /lib64/libdl.so.2 No /tmp/stapUaA2KQ/stap_4e52f2023ce6edaabf143f6d5214b684_1361.so 0x00007f3bd8124710 0x00007f3bd8127a80 Yes (*) /lib64/librt.so.1 0x00007f3bd6e60b50 0x00007f3bd6e6f025 Yes (*) /lib64/libpthread.so.0 (*): Shared library is missing debugging information. So whenever stapdyn subsequently runs, dyninst tries to load that nonexistant library: /tmp/stapUaA2KQ/stap_4e52f2023ce6edaabf143f6d5214b684_1361.so |
Created attachment 11178 [details] stap-report output on the same box Consider the following simple forever-running C program: ```C #include <unistd.h> int main(void) { while (1) { usleep (1000); } return 0; } ``` Compile it and generate `./a.out`: ``` gcc -g -Wall a.c ``` Then start it in one terminal (let's call it terminal A): ``` $ ./a.out ``` It hangs as expected. Then try running the stap dyninst command like below in another terminal (let's say, terminal B): ``` $ /opt/stap/bin/stap -e 'probe process("/lib64/libc.so.6").function("usleep") { println("hi") exit() }' -x `pgrep -f './a.out'` --dyninst hi ``` It works as expected. And in terminal A, the `./a.out` program is still running. So far so good. But now as soon as we run the `stap` command above again, we got the following error on terminal B: ``` $ /opt/stap/bin/stap -e 'probe process("/lib64/libc.so.6").function("usleep") { println("hi") exit() }' -x `pgrep -f './a.out'` --dyninst --FATAL-- #68: Dyninst was unable to attach to the specified process --FATAL-- #68: BPatch.C[1047]: no process 40758 defined in procsByPid stapdyn: ERROR: Couldn't attach to the target process WARNING: /opt/stap/bin/stapdyn exited with status: 1 Pass 5: run failed. [man error::pass5] ``` And in terminal A, the `./a.out` process is already killed too: ``` agentzh@work2 ~ $ ./a.out Killed ``` This can also be reproduced by a `probe begin` one-liner: ``` agentzh@work2 ~ $ /opt/stap/bin/stap -e 'probe begin { println("hi") exit() }' -x `pgrep -f './a.out'` --dyninst hi agentzh@work2 ~ $ /opt/stap/bin/stap -e 'probe begin { println("hi") exit() }' -x `pgrep -f './a.out'` --dyninst --FATAL-- #68: Dyninst was unable to attach to the specified process --FATAL-- #68: BPatch.C[1047]: no process 40816 defined in procsByPid stapdyn: ERROR: Couldn't attach to the target process WARNING: /opt/stap/bin/stapdyn exited with status: 1 Pass 5: run failed. [man error::pass5] ``` I'm using the 3.3 release tag of systemtap on Fedora 27 x86_64: ``` $ /opt/stap/bin/stap --version Systemtap translator/driver (version 3.3/0.173, commit release-3.3-0-g48867d1cface) Copyright (C) 2005-2018 Red Hat, Inc. and others This is free software; see the source for copying conditions. tested kernel versions: 2.6.18 ... 4.18-rc0 enabled features: BOOST_STRING_REF DYNINST BPF NLS NSS ``` Also attached the stap-report output for the same box. Am I doing anything wrong here?