This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386
On Tue, Dec 20, 2005 at 11:01:39AM -0800, Keshavamurthy Anil S wrote:
> On Tue, Dec 20, 2005 at 10:21:14PM +0900, Masami Hiramatsu wrote:
> > Hi, Anil
> >
> > Masami Hiramatsu wrote:
> > >>Issue No 1:
> > >> In the above code patch, once you call reset_current_kprobes() and
> > >>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
> > >>in your fix to this code), you are not suppose to relay on
> > >>&p->ainsn.insn, the reason is that, on the other cpu technically
> > >>this memory can get freed due to unregister_kprobe() call.
> > >>In other words, by the time this cpu tries to execte the instruction
> > >>at regs->eip, that memory might have gone and system is bound to crash.
> > >
> > > I think old djprobe has the same issue. So, it will be solved by using
> > > safety check routine after removing breakpoint. Current kprobe uses
> > > synchronize_sched() for waiting rcu processing in unregister_kprobe().
> > > In my opinion, the solution is that introducing safety check same as
> > > djprobe instead of synchronize_sched() in unregister_kprobe().
> >
> > I found the rcu_barrier() function that works similar to djprobe's
> > safety check routine.
> >
> > > The safety check routine of djprobe waits until all cpus execute the works,
> > > and the works are executed from the keventd.
> >
> > The rcu_barrier() function waits until all cpus have gone through a
> > quiescent state.
> >
> > - The CPU performs a process switch.
> > - The CPU starts executing in User Mode.
> > - The CPU executes the idle loop
> > In each of these cases, We say that the CPU has gone through
> > a quiescent state.
> >
> > If we call rcu_barrier() after "int3" is removed, we can ensure
> > followings after the rcu_barrier() called.
> > - interrupt handlers are finished on all CPUs.
> > - p->ainsn.insn is not executed on all CPUs.
> > And we can remove a boosted kprobe safely.
> >
> > Current kprobes uses synchronize_sched(). I think this function
> > checks safety of only current CPU. So, it may not ensure above
> > conditions. But rcu_barrier() checks safety of all CPUs. So, it
> > can ensure above conditions.
> > How would you think about this?
>
> You are correct, we need to ensure safety of all CPUs.
> rcu_barrier() will work.
Unless I've understood RCU incorrectly, rcu_barrier() is not needed in
the kprobes case. As I read the code, you'll need to use rcu_barrier()
in cases where you have to run the completion routine on each online cpu.
All we care about in kprobes case is that the other cpus don't hold a
reference to the just removed kprobe, so we can complete the kprobe
unregistration. Return from synchronize_sched() ensures that all the
other cpus have passed through a quiescent state (one of the conditions
Masami has listed above).
Also remember we don't define our own callback function either.
Ananth