This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: boosting with preemptable kernel

From: Quentin Barnes <qbarnes at urbana dot css dot mot dot com>
To: Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>
Cc: Satoshi Oshima <soshima at redhat dot com>, Hideo Aoki <haoki at redhat dot com>, Yumiko Sugita <yumiko dot sugita dot yf at hitachi dot com>, SystemTAP <systemtap at sources dot redhat dot com>
Date: Sat, 27 Jan 2007 00:37:42 -0600
Subject: Re: boosting with preemptable kernel
References: <20061117235807.GB11523@urbana.css.mot.com> <45624C0D.3030007@hitachi.com> <20070103223912.GA13162@urbana.css.mot.com> <45A99FC3.5050805@hitachi.com>

Aside from the RT-related impact the GC causes, it also scales
poorly on SMP systems.  It forces every CPU in the system to stop
executing and be held in an idle state
simultaneously.  The reason for having multiple CPUs is so that a
thread tying up one CPU doesn't impact the performance (much) of
the rest of the system.  Now we have a single thread being able to
impact the performance across all CPUs system-wide simultaneously.
(If my understanding of the GC implementation is wrong, please
correct me.  It's only from my limited understanding of reading and
following the code, not from using it.)


I think you are a bit misreading the GC implementation.
First, this GC is never invoked when the kprobe is hit.
This GC may be invoked when you register/unregister a kprobe.
I think these operations are not done frequently; they happen
when a module is loading/unloading.

I had followed all that.

Next, this GC will be invoked if there ARE some garbage (dirty)
slots when you get/free an insn slot. Thus, if your kprobe cleans
its slot up before releasing it, your kprobe NEVER invokes the GC.


Ah, I did not follow this.  I see now.  The variable
"kprobe_garbage_slots" keeps track of the dirty count to avoid
invoking the GC when the count is zero.
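To make the gating concrete, here is a small user-space model of it.  This is only a sketch, not the kernel code: the pool size, the threshold, and the names get_slot(), free_slot(), collect_garbage() and gc_runs are all made up for illustration, and garbage_slots merely stands in for "kprobe_garbage_slots".

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS       8   /* hypothetical pool size */
#define GC_THRESHOLD 4   /* hypothetical: dirty slots tolerated before a GC */

enum slot_state { SLOT_FREE, SLOT_USED, SLOT_DIRTY };

static enum slot_state slots[NSLOTS];
static int garbage_slots;   /* stands in for kprobe_garbage_slots */
static int gc_runs;         /* how many times the expensive path ran */

/* The expensive step.  In the real kernel this is where every CPU
 * would have to be synchronized; here it only recycles dirty slots. */
static void collect_garbage(void)
{
    gc_runs++;
    for (int i = 0; i < NSLOTS; i++)
        if (slots[i] == SLOT_DIRTY)
            slots[i] = SLOT_FREE;
    garbage_slots = 0;
}

/* Returns a slot index, or -1 if the pool is exhausted. */
static int get_slot(void)
{
    for (int pass = 0; pass < 2; pass++) {
        for (int i = 0; i < NSLOTS; i++) {
            if (slots[i] == SLOT_FREE) {
                slots[i] = SLOT_USED;
                return i;
            }
        }
        /* Nothing free: a GC is only worth running if something is dirty. */
        if (garbage_slots == 0)
            break;
        collect_garbage();
    }
    return -1;
}

/* dirty == false models a caller that cleaned up before releasing. */
static void free_slot(int idx, bool dirty)
{
    assert(idx >= 0 && idx < NSLOTS && slots[idx] == SLOT_USED);
    if (!dirty) {
        slots[idx] = SLOT_FREE;
        return;
    }
    slots[idx] = SLOT_DIRTY;
    if (++garbage_slots > GC_THRESHOLD)
        collect_garbage();
}
```

In this model a caller that always passes dirty == false never increments the counter, so collect_garbage() is unreachable for it no matter how often slots are taken and released.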

What I would recommend is that the GC and its hooks be able to be
conditionally enabled or disabled with an ifdef, probably in the
arch kprobes.h file like insn slot and kretprobes are now.  That
way, architectures that don't need it because they have their own
alternate approach can choose not to use it.  Would that be a reasonable
suggestion?


Currently, only the slots used by boosted kprobes are dirty.
So, on the ARM architecture, if you'd like to disable the GC,
you just call free_insn_slot with dirty=0, as below.
    free_insn_slot(insn_slot, 0);


Yes, when the ARM code is updated to have the second parameter
to free_insn_slot(), it will always be 0.

On i386, you can prohibit boosting a kprobe by giving it an empty
post_handler. The GC will then no longer be triggered.

Having a post-handler defeats boosting which then "defeats" the GC.

For our platform use though, we're ARM only.
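The chain "post-handler, so no boost, so no dirty slot" can be written down as a tiny model.  This is a sketch under assumptions: struct probe_model, will_boost() and slot_is_dirty_on_free() are invented names, and the post_handler signature is simplified from the real kprobes one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for a kprobe. */
struct probe_model {
    bool boostable;               /* the probed insn is safe to boost */
    void (*post_handler)(void);   /* NULL when none is registered */
};

/* A probe is boosted only if the insn allows it and nothing needs to
 * run after the single-step, i.e. there is no post_handler. */
static bool will_boost(const struct probe_model *p)
{
    assert(p != NULL);
    return p->boostable && p->post_handler == NULL;
}

/* Only slots that were used by boosted probes are released as dirty. */
static bool slot_is_dirty_on_free(const struct probe_model *p)
{
    return will_boost(p);
}

/* An empty post_handler is enough to switch boosting off. */
static void empty_post_handler(void)
{
}
```

With post_handler set to empty_post_handler, will_boost() is false even for a boostable instruction, so the slot is never released dirty and the GC has nothing to collect.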

If you feel a strong need for disabling the GC at compile time,
I can add a compile flag for disabling the GC.


Since the GC will be inactive for ARM with only a trivial execution
charge to check and maintain the "kprobe_garbage_slots" variable,
it's not that big a deal.  The amount of dead code is pretty minor
too.  At this point I'd say it's not worth putting a compile time
flag around.

Or, could you write the trampoline routine and the
compile flag for switching between the trampoline and the GC?

In my approach, there's no single "trampoline" in the classic sense.

The kprobe'd instruction's effects always complete before
kprobe_handler() returns, so there is no system state to hold and
manage across re-entry for the same kprobe's continued processing.
This allows kprobe_handler() and its associated management functions
and saved data state to be greatly simplified and reduced.  There's
no way to just "throw a switch" to put all that complexity back.  It
would almost have to be two completely different implementations.

About your last bullet mentioning djprobes, the ARM implementation
of kprobes I'm working on does have support for kretprobes and
jprobes, but not djprobes.  I have read your post from Oct '05 on
djprobes and understand it in a general way, but haven't yet wrapped
my mind fully around it.  Could you explain why such an approach
doesn't work for djprobes on x86?  Do you imagine that assertion
would hold for other non-x86 architectures as well?


Djprobe has to rewrite multiple instructions on i386 (CISC), because
the size of the long jump instruction is bigger than many instructions.
Before rewriting those instructions, we must ensure that no other
processes are running/sleeping on the target instructions which will be
rewritten by the long jump. This case cannot be helped by the
trampoline routine.


Yes, that's a very, very messy problem on the x86.  Up to five(?)
instructions could be overwritten by the long jump instruction.
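The count follows from the size of the jump: the x86 near jump with a 32-bit displacement (opcode 0xE9) is 5 bytes, and the shortest x86 instructions are 1 byte.  The helper below is a sketch for working that out from a list of instruction lengths; insns_covered_by_jump() is a made-up name, and the lengths are assumed to come from some instruction decoder that is not shown.

```c
#include <assert.h>
#include <stddef.h>

#define JMP_REL32_LEN 5   /* opcode 0xE9 plus a 32-bit displacement */

/* lens[] holds the byte lengths of consecutive instructions starting
 * at the probe address.  Returns how many of them a 5-byte jump placed
 * at that address would overwrite (fully or partly), or 0 if the
 * listed instructions do not span 5 bytes. */
static size_t insns_covered_by_jump(const unsigned char *lens, size_t n)
{
    size_t covered = 0;

    for (size_t i = 0; i < n; i++) {
        /* x86 instructions are between 1 and 15 bytes long. */
        assert(lens[i] >= 1 && lens[i] <= 15);
        covered += lens[i];
        if (covered >= JMP_REL32_LEN)
            return i + 1;
    }
    return 0;
}
```

Five 1-byte instructions in a row give the worst case of five; a single instruction of 5 bytes or more gives the best case of one.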

I'm still not sure, though, that I see why a trampoline approach
wouldn't work for x86.  It would just have to iterate the trampoline up to
five times.  But I still don't have the model clear enough in my
head yet.  Maybe it will become clearer over time.

As long as the kernel text address space stays under 32MB, an
ARM djprobe implementation would be a one-for-one instruction
replacement.
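The 32MB figure matches the reach of the ARM-mode B instruction: a signed 24-bit word offset, taken relative to the branch address plus 8.  The sketch below checks that range and builds the unconditional encoding.  It assumes ARM mode (not Thumb), and arm_branch_in_range() and arm_encode_b() are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A B instruction holds a signed 24-bit word offset, measured from the
 * address of the branch plus 8.  In bytes that is -2^25 .. 2^25 - 4. */
static bool arm_branch_in_range(uint32_t from, uint32_t to)
{
    int64_t off = (int64_t)to - ((int64_t)from + 8);

    return (off & 3) == 0 &&
           off >= -((int64_t)1 << 25) &&
           off <= ((int64_t)1 << 25) - 4;
}

/* Encode an unconditional "B to" placed at address "from". */
static uint32_t arm_encode_b(uint32_t from, uint32_t to)
{
    assert(arm_branch_in_range(from, to));

    /* Unsigned wraparound yields the two's-complement byte offset. */
    uint32_t off = to - (from + 8);

    /* cond = 0xE (always), bits 27..24 = 1010, then imm24. */
    return 0xEA000000u | ((off >> 2) & 0x00FFFFFFu);
}
```

Because the whole branch fits in one 4-byte word, it can replace exactly one ARM instruction whenever arm_branch_in_range() holds for the probe address and its target.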

I'm still absorbing the djprobes explanation (djprobe-20051031.txt)
and perusing the patch you sent out Nov 21.  Sorry if the following
question has already been discussed.  If it has, just point me to
it.  Is there a reason djprobes needs its own, separate interface?
Could it just use the kprobes registration service and have the
kprobes code decide whether to implement a given probe as a kprobe
with an exception or djprobe with a direct jump?  Or is this a long
term goal after shaking out the djprobes model?

Best regards,

--
Masami HIRAMATSU
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

Quentin

