This is the mail archive of the
mailing list for the systemtap project.
Re: [PATCH] Linux Kernel Markers
- From: Martin Bligh <mbligh at google dot com>
- To: karim at opersys dot com
- Cc: "Frank Ch. Eigler" <fche at redhat dot com>, Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>, prasanna at in dot ibm dot com, Andrew Morton <akpm at osdl dot org>, Ingo Molnar <mingo at elte dot hu>, Mathieu Desnoyers <mathieu dot desnoyers at polymtl dot ca>, Paul Mundt <lethal at linux-sh dot org>, linux-kernel <linux-kernel at vger dot kernel dot org>, Jes Sorensen <jes at sgi dot com>, Tom Zanussi <zanussi at us dot ibm dot com>, Richard J Moore <richardj_moore at uk dot ibm dot com>, Michel Dagenais <michel dot dagenais at polymtl dot ca>, Christoph Hellwig <hch at infradead dot org>, Greg Kroah-Hartman <gregkh at suse dot de>, Thomas Gleixner <tglx at linutronix dot de>, William Cohen <wcohen at redhat dot com>, ltt-dev at shafik dot org, systemtap at sources dot redhat dot com, Alan Cox <alan at lxorguk dot ukuu dot org dot uk>
- Date: Wed, 20 Sep 2006 12:22:02 -0700
- Subject: Re: [PATCH] Linux Kernel Markers
Karim Yaghmour wrote:
> Martin Bligh wrote:
>> It's looking to me like it might still need djprobes to implement, in
>> order to get the atomic and safe switchover from the original function
>> into the traced one. All rather sad, but seems to be true from all the
>> CPU errata, etc. If anyone can see a way round that, I'd love to hear
>> it.
>
> But we don't need to fight the errata; fortunately, there are solutions
> that take care of it where it does exist (x86: djprobes/kprobes).
> What's more interesting, though, is that the method as it is proposed
> at this stage *seems* to be easily portable to other archs. And where
> such binary trickery is difficult to pull off, nothing precludes
> having a universally "portable" mechanism including something akin to
> switching between instrumented vs. normal function at function entry.
> Even such conditional ifs can be optimized by the CPU nowadays.
>
> The picture is, nevertheless, very bright at the moment (I think).
> Just have a 5-byte filler at function entry such as Hiramatsu-san
> suggested, and use djprobes to fork to the instrumented function. The
> unconditional jump in the filler will most likely be utterly
> unmeasurable, and benchmarks should confirm this.
>
> On x86: use 5-byte filler and djprobes.
> On "sane" archs: use filler and override as explained earlier.
> Elsewhere: use standard "if" or function pointer at function entry.

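The "Elsewhere" fallback above (a function pointer at function entry) could look roughly like the sketch below. Every name in it (do_work, do_work_plain, do_work_traced, tracing_enable, tracing_disable, tracing_hits) is hypothetical and chosen only for illustration; none comes from the markers patch. It is a plain userspace illustration and ignores the SMP and memory-ordering questions a real kernel implementation would have to answer.

```c
/* Hypothetical sketch: all names are illustrative, not from the patch. */

static unsigned long trace_hits;   /* stands in for a real trace buffer */

/* Normal, uninstrumented body. */
static int do_work_plain(int x)
{
    return x * 2;
}

/* Instrumented copy: same result, plus a trace side effect. */
static int do_work_traced(int x)
{
    trace_hits++;
    return x * 2;
}

/* The entry point dispatches through a pointer, so switching between
 * the two bodies is a single pointer store. */
static int (*do_work_impl)(int) = do_work_plain;

int do_work(int x)
{
    return do_work_impl(x);
}

void tracing_enable(void)        { do_work_impl = do_work_traced; }
void tracing_disable(void)       { do_work_impl = do_work_plain; }
unsigned long tracing_hits(void) { return trace_hits; }
```

The cost in the untraced case is one indirect call per invocation, which is the overhead the filler-plus-djprobes variant is meant to avoid on x86.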
Do we even need the filler padding? I thought we could insert kprobes
at the beginning of any function without that ... it was only a
requirement for mid-function (sometimes). If we copy the whole function,
we don't even need that any more ...
If kprobes can do it, I don't see why djprobes can't ... after all, it
just seems to use kprobes to insert a jump, AFAICS.
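
For reference, the jump under discussion is 5 bytes on x86 because the near `jmp rel32` instruction is one opcode byte (0xE9) followed by a 32-bit displacement measured from the end of the instruction. The sketch below only computes those bytes into a buffer, assuming 32-bit addresses; the function name encode_jmp_rel32 is hypothetical. It does not show how to patch live kernel text safely, which is the part the errata make hard and the reason djprobes/kprobes come up at all.

```c
#include <stdint.h>

#define JMP_REL32_OPCODE 0xE9
#define JMP_REL32_LEN    5

/* Hypothetical helper: fill buf with "jmp rel32" as if the instruction
 * lived at address `from` and should transfer control to `to`. */
void encode_jmp_rel32(uint8_t buf[JMP_REL32_LEN], uint32_t from, uint32_t to)
{
    /* The displacement is relative to the next instruction's address;
     * unsigned wraparound yields the two's-complement value for
     * backward jumps. */
    uint32_t rel = to - (from + JMP_REL32_LEN);

    buf[0] = JMP_REL32_OPCODE;
    buf[1] = (uint8_t)(rel & 0xff);          /* little-endian rel32 */
    buf[2] = (uint8_t)((rel >> 8) & 0xff);
    buf[3] = (uint8_t)((rel >> 16) & 0xff);
    buf[4] = (uint8_t)((rel >> 24) & 0xff);
}
```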