This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: uprobes and empty functions
- From: Josh Stone <jistone at redhat dot com>
- To: David Smith <dsmith at redhat dot com>
- Cc: Srikar Dronamraju <srikar at linux dot vnet dot ibm dot com>, Systemtap List <systemtap at sources dot redhat dot com>
- Date: Fri, 29 Oct 2010 13:45:34 -0700
- Subject: Re: uprobes and empty functions
- References: <4CC9F377.7050606@redhat.com>
I reproduced the segfault with this:
$ cat empty.c
void empty() { asm("rep\nret"); }
int main() { empty(); return 0; }
$ gcc -g -O1 empty.c -o empty
(-O1 or greater is needed, else my manual ret breaks the stack frame.)
$ stap -ve 'probe process("./empty").function("empty"){ println(pp()) }'
(Leave this running, then in another window...)
$ ./empty
Segmentation fault (core dumped)
I wrote a uprobes patch, attached, which deals with "rep ret" by
treating it exactly like ret in the x86 uprobe_post_ssout (defined in
four places, sheesh...)
I only matched "f3 c3", but I'm not sure if we need bother with other
rep/ret variants for this special case. Or further, we might even want
to be prepared for any supported prefix on any of the these "special"
opcodes in post_ssout, as validate_insn is pretty lenient in that.
But for now, this minimal fix seems to do the trick for my test.
Josh
PS- For the curious, Mark found a reference to why "rep ret" is used:
http://sourceware.org/ml/libc-alpha/2004-12/msg00022.html
commit 14702b51b95df2fdaa2a08d8d74cc69c6f1d5578
Author: Josh Stone <jistone@redhat.com>
Date: Fri Oct 29 13:23:11 2010 -0700
uprobes: Fix post_ssout handling of "rep ret"
That odd sequence is apparently used to coerce better behavior from the
branch predictor on AMD K8. GCC does this, so we need to be prepared to
deal with it. So on all the x86 uprobes, uprobe_post_ssout now has a
special case to treat "rep ret" just like a normal "ret" for ip fixup.
diff --git a/runtime/uprobes/uprobes_i386.c b/runtime/uprobes/uprobes_i386.c
index 589393d..bf98225 100644
--- a/runtime/uprobes/uprobes_i386.c
+++ b/runtime/uprobes/uprobes_i386.c
@@ -224,6 +224,13 @@ void uprobe_post_ssout(struct uprobe_task *utask, struct uprobe_probept *ppt,
insn++;
switch (insn[0]) {
+ case 0xf3:
+ if (insn[1] != 0xc3)
+ break;
+ /*
+ * "rep ret" is an AMD kludge that's used by GCC,
+ * so we need to treat it like a normal ret.
+ */
case 0xc3: /* ret/lret */
case 0xcb:
case 0xc2:
diff --git a/runtime/uprobes/uprobes_x86.c b/runtime/uprobes/uprobes_x86.c
index 9e0bc11..956788b 100644
--- a/runtime/uprobes/uprobes_x86.c
+++ b/runtime/uprobes/uprobes_x86.c
@@ -634,6 +634,13 @@ void uprobe_post_ssout(struct uprobe_task *utask, struct uprobe_probept *ppt,
*/
switch (*insn) {
+ case 0xf3:
+ if (insn[1] != 0xc3)
+ break;
+ /*
+ * "rep ret" is an AMD kludge that's used by GCC,
+ * so we need to treat it like a normal ret.
+ */
case 0xc3: /* ret/lret */
case 0xcb:
case 0xc2:
diff --git a/runtime/uprobes/uprobes_x86_64.c b/runtime/uprobes/uprobes_x86_64.c
index 3ad40ea..1234a98 100644
--- a/runtime/uprobes/uprobes_x86_64.c
+++ b/runtime/uprobes/uprobes_x86_64.c
@@ -623,6 +623,13 @@ void uprobe_post_ssout(struct uprobe_task *utask, struct uprobe_probept *ppt,
*/
switch (*insn) {
+ case 0xf3:
+ if (insn[1] != 0xc3)
+ break;
+ /*
+ * "rep ret" is an AMD kludge that's used by GCC,
+ * so we need to treat it like a normal ret.
+ */
case 0xc3: /* ret/lret */
case 0xcb:
case 0xc2:
diff --git a/runtime/uprobes2/uprobes_x86.c b/runtime/uprobes2/uprobes_x86.c
index 02ec76b..21cbaaf 100644
--- a/runtime/uprobes2/uprobes_x86.c
+++ b/runtime/uprobes2/uprobes_x86.c
@@ -639,6 +639,13 @@ void uprobe_post_ssout(struct uprobe_task *utask, struct uprobe_probept *ppt,
*/
switch (*insn) {
+ case 0xf3:
+ if (insn[1] != 0xc3)
+ break;
+ /*
+ * "rep ret" is an AMD kludge that's used by GCC,
+ * so we need to treat it like a normal ret.
+ */
case 0xc3: /* ret/lret */
case 0xcb:
case 0xc2: