This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386


On Tue, Dec 20, 2005 at 11:01:39AM -0800, Keshavamurthy Anil S wrote:
> On Tue, Dec 20, 2005 at 10:21:14PM +0900, Masami Hiramatsu wrote:
> > Hi, Anil
> > 
> > Masami Hiramatsu wrote:
> > >>Issue No 1:
> > >>	In the above code patch, once you call reset_current_kprobes() and
> > >>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
> > >>in your fix to this code), you are not suppose to relay on
> > >>&p->ainsn.insn, the reason is that, on the other cpu technically
> > >>this memory can get freed due to unregister_kprobe() call.
> > >>In other words, by the time this cpu tries to execte the instruction
> > >>at regs->eip, that memory might have gone and system is bound to crash.
> > > 
> > > I think old djprobe has the same issue. So, it will be solved by using
> > > safety check routine after removing breakpoint. Current kprobe uses
> > > synchronize_sched() for waiting rcu processing in unregister_kprobe().
> > > In my opinion, the solution is that introducing safety check same as
> > > djprobe instead of synchronize_sched() in unregister_kprobe().
> > 
> > I found the rcu_barrier() function that works similar to djprobe's
> > safety check routine.
> > 
> > > The safety check routine of djprobe waits until all cpus execute the works,
> > > and the works are executed from the keventd.
> > 
> > The rcu_barrier() function waits until all cpus have gone through a
> > quiescent state.
> > 
> > - The CPU performs a process switch.
> > - The CPU starts executing in User Mode.
> > - The CPU executes the idle loop
> > In each of these cases, We say that the CPU has gone through
> > a quiescent state.
> > 
> > If we call rcu_barrier() after "int3" is removed, we can ensure
> > followings after the rcu_barrier() called.
> > - interrupt handlers are finished on all CPUs.
> > - p->ainsn.insn is not executed on all CPUs.
> > And we can remove a boosted kprobe safely.
> > 
> > Current kprobes uses synchronize_sched(). I think this function
> > checks safety of only current CPU. So, it may not ensure above
> > conditions. But rcu_barrier() checks safety of all CPUs. So, it
> > can ensure above conditions.
> > How would you think about this?
> 
> You are correct, we need to ensure safety of all CPUs.
> rcu_barrier() will work.

Unless I've understood RCU incorrectly, rcu_barrier() is not needed in
the kprobes case. As I read the code, you'll need to use rcu_barrier()
in cases where you have to run the completion routine on each online cpu.

All we care about in kprobes case is that the other cpus don't hold a
reference to the just removed kprobe, so we can complete the kprobe
unregistration. Return from synchronize_sched() ensures that all the
other cpus have passed through a quiescent state (one of the conditions
Masami has listed above).

Also remember we don't define our own callback function either.

Ananth


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]