This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
[Bug uprobes/5509] uprobe booster thoughts
- From: "jkenisto at us dot ibm dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sources dot redhat dot com
- Date: 13 Mar 2009 18:23:40 -0000
- Subject: [Bug uprobes/5509] uprobe booster thoughts
- References: <20071218172927.5509.jkenisto@us.ibm.com>
- Reply-to: sourceware-bugzilla at sourceware dot org
------- Additional Comments From jkenisto at us dot ibm dot com 2009-03-13 18:23 -------
(In reply to comment #1)
...
> > The above instruction sequence takes 14 bytes: 6 bytes for the jmpq
> > (always ff 25 00 00 00 00) and 8 bytes for the address. For x86_64,
> > MAX_UINSN_BYTES=16, which doesn't leave much room for the actual
> > instruction copy. We seem to have the following choices:
> > a) Boost only 1-byte and 2-byte instructions. (Ick)
> > b) Make MAX_UINSN_BYTES larger.
>
> How much larger would make it feasible? Would going from the existing 16 bytes
> to 24 be good enough?
Yes. Looking at a variety of 64-bit a.outs, it appears that ~99% of
instructions are 10 bytes or less. A 24-byte slot would leave room for a
10-byte instruction + the 14-byte jump.
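
To make the slot arithmetic concrete, here is a minimal sketch of how a boosted slot could be laid out: the instruction copy, then the 6-byte jmpq (ff 25 00 00 00 00), then the 8-byte target address. The helper names max_boostable_len() and build_boosted_slot() are hypothetical and not taken from the uprobes code, and the sketch assumes a little-endian host and an instruction that needs no fixups after being copied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 6-byte "jmpq *0(%rip)" plus the 8-byte absolute target that follows it. */
#define JMP_INDIRECT_BYTES 14

/* Hypothetical helper: longest instruction that still leaves room for the
 * jump in a slot of slot_bytes; 0 if the slot cannot hold the jump at all. */
static size_t max_boostable_len(size_t slot_bytes)
{
    return slot_bytes > JMP_INDIRECT_BYTES ? slot_bytes - JMP_INDIRECT_BYTES : 0;
}

/* Hypothetical helper: copy the instruction into the slot and append the
 * jump back to resume_addr. Returns the number of bytes written, or 0 if
 * the instruction plus the jump does not fit in the slot. */
static size_t build_boosted_slot(uint8_t *slot, size_t slot_bytes,
                                 const uint8_t *insn, size_t insn_len,
                                 uint64_t resume_addr)
{
    static const uint8_t jmpq_rip0[6] = { 0xff, 0x25, 0x00, 0x00, 0x00, 0x00 };

    if (insn_len == 0 || insn_len > max_boostable_len(slot_bytes))
        return 0;

    memcpy(slot, insn, insn_len);
    memcpy(slot + insn_len, jmpq_rip0, sizeof(jmpq_rip0));
    /* Assumes a little-endian host, so the in-memory byte order of
     * resume_addr matches what the x86_64 jmpq will read. */
    memcpy(slot + insn_len + sizeof(jmpq_rip0), &resume_addr, sizeof(resume_addr));
    return insn_len + JMP_INDIRECT_BYTES;
}
```

With these definitions, max_boostable_len(16) is 2 and max_boostable_len(24) is 10, which matches the figures above.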
>
> > c) Allocate 2 SSOL slots for a boostable instruction.
> > d) Allocate some big (boostable) slots and some little ones.
> >
...
>
> Now that we are looking at an instruction analysis layer, it would be possible to
> revisit option d, i.e.:
> A. Big slots for private and boostable instructions with instruction size
> greater than 2 bytes.
> B. Small slots for public or boostable instructions with instruction size of
> 2 bytes or less.
Typically, 15-25% of instructions are 1-2 bytes long (but that figure may be
inflated by things like nop padding).
>
> How much additional complexity would this add?
Hard to say. You could wind up with something more complicated than malloc if
you get too cute, but having just 2 slot pools (big/private and small/public)
wouldn't be much more complicated than what I prototyped for #5275.
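
As a rough illustration of the two-pool idea, the choice of slot could look something like the sketch below. This is not the #5275 prototype: the names slot_pool, slot_choice and choose_slot() are hypothetical, the 16- and 24-byte sizes are simply the existing and proposed values discussed above, and the public/private distinction is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

#define JMP_BYTES        14   /* 6-byte jmpq + 8-byte address */
#define SMALL_SLOT_BYTES 16   /* existing MAX_UINSN_BYTES on x86_64 */
#define BIG_SLOT_BYTES   24   /* proposed larger slot (assumption) */

enum slot_pool { POOL_SMALL, POOL_BIG };

struct slot_choice {
    enum slot_pool pool;
    bool boosted;             /* true if the jump fits after the copy */
};

/* Hypothetical policy: boost in the smallest pool that has room for the
 * instruction plus the jump; otherwise take a small slot and single-step. */
static struct slot_choice choose_slot(size_t insn_len, bool boostable)
{
    struct slot_choice c = { POOL_SMALL, false };

    if (boostable && insn_len + JMP_BYTES <= SMALL_SLOT_BYTES) {
        c.boosted = true;
    } else if (boostable && insn_len + JMP_BYTES <= BIG_SLOT_BYTES) {
        c.pool = POOL_BIG;
        c.boosted = true;
    }
    return c;
}
```

Under this policy a boostable 2-byte instruction lands in a small slot, a boostable 5-byte instruction in a big one, and anything longer than 10 bytes falls back to an unboosted small slot.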
> Would the performance gain that we get justify it?
Having multiple slot sizes would save memory (i.e., the size of the SSOL vma),
but I don't think it would otherwise help performance. As previously mentioned,
the performance gain from boosting should be close to 50%.
>
> Though it would not solve 9826 completely, the solution for this problem could
> act as a workaround for all cases where we can boost the instruction.
Well, it would reduce our exposure by reducing single-stepping. But I think
that fixing 9826 should be easier than implementing boosting.
--
http://sourceware.org/bugzilla/show_bug.cgi?id=5509