Here's a situation where including the uprobes code as part of the SystemTap-generated module messes us up. Execute a program like this: #include <unistd.h> #include <stdio.h> int sleeper() /* set the retprobe here */ { sleep(1); return 1; } main() { int ret; for (;;){ ret = sleeper(); printf("sleeper returns %d\n",ret); } } and then probe it with a stap module such as this: probe begin { log("Probing...") } probe process($1).statement($2).absolute.return { log (pp()) } with a command such as # stap sleeper.stp <pid> <sleeper_vaddr> If you CTRL-C out of stap while sleeper() is running, you'll get an oops. That's because unregister_uretprobe(), called by the module's cleanup function, doesn't wait for the uretprobed function to return. (It could be a LONG wait, after all.) Instead, it leaves the uprobe_process (and utrace_attached_engine) in place until sleeper() returns and hits the breakpoint at the uretprobe trampoline; uprobes's report_signal callback should then clean up. (This is pretty much how kretprobes works, too.) Unfortunately, by that time, the uprobes code no longer exists -- it disappeared with the module -- so utrace calls a nonexistent callbck. We could work around this on i386 and x86_64 by remembering the location, on the stack, of the return address, and stuffing the real return address back into the stack as part of unregister_uretprobe(). (That's how kretprobes was originally implemented.) For the other architectures, though, this won't work, I'm told.
Would packaging this as a stacked module fix this? (It could be built & insmod'd separately, from sources still kept within the systemtap install tree (even its current location), then insmod'd and referred-to from systemtap modules.)
> We could work around this on i386 and x86_64 by remembering the location, on the > stack, of the return address, and stuffing the real return address back into the > stack as part of unregister_uretprobe(). (That's how kretprobes was originally > implemented.) For the other architectures, though, this won't work, I'm told. I investigated Jim's approach for the s390x (64-bit application only). It could possible to use Jim's approach however, I there are some special cases that my not be possible to handle. Here is a typical preamble/prolog on s390 (64-bit) -When the function is called the return address is in GR14. -The first instruction of the function pushes GR14 along with any other registers that need to be saved onto the stack. - <function body goes here.> -The prolog pops the return address off the stack in to any available general register. -Branch the to the address in the GR above. It should be a simple matter for uprobes to calculate the address on the stack where the return address will be saved. Therefor the original return address can be restored when removing the return probe. However, the are several exception cases we will need to handle: 1) The gcc may optimize out the saving of the RA because GR14 is never overwritten by the function body. We can handle this case by flagging that the RA is stored in GR14 not on the stack. When the return is removed we then place the original RA directly in GR14. 2) When removing the return probe there is a small window where the return address has already been popped from the stack. Uprobes will have no way of knowing this has occurred other than to search for the address of the trampoline in all the GRs. But this may be problematic. If a GR contains the address of the trampoline it could be coincidental or a left over from some other operation. I don't feel that case #2 can be handled safely.
(In reply to comment #1) > Would packaging this as a stacked module fix this? > (It could be built & insmod'd separately, from sources still kept > within the systemtap install tree (even its current location), then > insmod'd and referred-to from systemtap modules.) > [Here's the gist of my response from 9/30, which didn't make it into the bugzilla.] Yes, that would simplify things greatly for us. Uprobes-as-module has been working for 3 months or so. We also wouldn't need the uprobes_data hook. I'm working on that approach now.
Here's a proposed design for modifying stap to optionally build and use a uprobes module whose source resides in runtime/uprobes. Currently, runtime/uprobes contains code that is included directly into the stap-generated module. This suffers from problems, as reported in this bugzilla. It will be modified to be built as a kernel module that staprun can insert, as needed, before inserting the stap-generated module. If stap emits calls to the uprobes API, it will also attempt to build uprobes as a module, using the source and (new) Makefile in runtime/uprobes. Stap will also pass the -u ("uprobes") flag to staprun. To account for the possibility of uprobes being part of the kernel build (either as a module or as a built-in component), stap will generate code that looks something like: #if defined(CONFIG_UPROBES) || defined(CONFIG_UPROBES_MODULE) #include <linux/uprobes.h> #else #include "uprobes/uprobes.h" #endif When passed the -u flag, staprun will check to see whether (say) unregister_uprobe already shows up in /proc/kallsyms. If it doesn't, staprun will try the following before loading the SystemTap-generated module: 1. modprobe uprobes 2. Load uprobes.ko from runtime/uprobes. Seems like we have to do #1 first because of the way the above #includes work. If the user wants to use the stap-runtime version of uprobes.ko instead of the version in /lib/modules, then I guess he/she should copy the stap-runtime version into /lib/modules. Questions: - Should we avoid the make if (say) unregister_uprobe already shows up in /proc/kallsyms? What would be the implications for a "stap now, staprun later" scenario? - Should we avoid using /lib/modules/.../uprobes.ko, and instead use only the stap runtime version? If so, we'd take the "defined(CONFIG_UPROBES_MODULE)" clause out of the above code; and we'd skip the modprobe.
Created attachment 2035 [details] Here's a patch that implements the fix described in comment #4. I'd appreciate it if stap/staprun maintainers could take a look at this before I check it in. It's been tested both on a kernel that include uprobes and on one that doesn't. (Your kernel still needs utrace and the usual exports -- access_process_vm needs to be added to RHEL5.1's exports, and __put_task_struct needs to be added for kernel.org kernels.)
Checked in the changes from Comment #5 today.