This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: thoughts about exception-handling requirements for kprobes

- -
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072

systemtap-owner@sourceware.org wrote on 20/03/2006 08:44:00:

> systemtap-owner@sourceware.org wrote on 19/03/2006 17:24:54:
>
> > On Fri, Mar 17, 2006 at 01:50:57PM -0800, Keshavamurthy Anil S wrote:
> > > On Thu, Mar 09, 2006 at 07:57:18AM -0800, Richard J Moore wrote:
> > > >
> > > >    I've been thinking about the need for exception-handling and how the
> > > >    current implementation has become a little muddled.
> > >
> > > Here is my thinking on this kprobe fault handling...
> > > Ideally we want the ability to recover from all
> > > the page faults happening from either pre-handler
> > > or happening from post-handler transparently in the
> > > same way as the normal kernel would recover from
> > > do_page_fault() function. In order for this to happen,
> > > I think we should not be calling pre-handler/post-handler
> > > by disabling preempt which is a major design change.
> > > Also in the current code if fixup_exception() fails to
> > > fixup the exception then falling back on the normal
> > > do_page_fault() is a bad thing with preempt disabled.
> > >
> > > I was thinking on this issue for the past several days
> > > and I believe that currently we are disabling preempt
> > > before calling pre/post handler, because we don't
> > > want the process to get migrated to different CPU
> > > and we don't want another process to be scheduled
> > > while we are servicing kprobe as the newly scheduled
> > > process might trigger another probe and we don't
> > > have space to save the kprobe control block(kprobe_ctlbk)
> > > info, because we save kprobe_ctlbk in the per cpu structure.
> > >
> > > If we move this saving kprobe_ctlbk to task struct then
> > > I think we will have the ability to call pre/post-handler
> > > without having to disable preempt and thereby any faults
> > > happening from either pre/post handler can recover transparently
> > > in the same way as the normal kernel would recover.
> > >
> >
> > Kprobes user-specified pre/post handler are called within
> > the interrupt context and if we allow page faults while within
>
> Clarify what you mean by "allow"
>
> > user-specified pre/post handler, then it might sleep.
>
> Clarify what you mean by "it"
>


Now I understand the context to this remark (thanks Prasanna).

So yes, one cannot allow sleeping in a probe handler in general.
However, it might be possible to introduce a scheme to allow page-in under
limited circumstances.
I've seen kernel debuggers do this.
For example, if we took a probepoint in user-space then the probe handler
doesn't interrupt kernel processing,
so in theory it doesn't recurse back into the kernel and cause lock contention.
Under this circumstance it might be possible to read the swap space
synchronously, but that would prohibit having any probes in the code path
that did the page-in.
I have seen kernel debuggers do the following for the case where we do take
a breakpoint inside the kernel:
They effect a page-in by scheduling a kernel thread to do it and then have
that thread break in when the page-in is complete.
But essentially this is having two break-ins to handle a page-in, and we
might as well allow the normal page mgmt to do its stuff; when the
probed instruction is retired the same probe handler will fire and the
memory will now be present.
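
To make the "let normal paging do the work and take the probe again" idea
concrete, here is a minimal user-space sketch in C. It is a simulation, not
kernel code: sim_page, try_read_nofault(), normal_page_in() and
execute_probed_instruction() are all hypothetical names invented for
illustration, and it assumes the probed instruction itself touches the
missing page and is restarted once the fault has been serviced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page that may or may not be resident. */
struct sim_page {
    bool present;
    int  value;
};

/* What the handler managed to observe across firings. */
struct probe_state {
    int hits;        /* number of times the handler fired          */
    int misses;      /* firings where the data was not resident    */
    int last_value;  /* last value successfully read by the handler */
};

/* Non-sleeping read: reports failure instead of faulting the page in. */
static bool try_read_nofault(const struct sim_page *p, int *out)
{
    if (!p->present)
        return false;
    *out = p->value;
    return true;
}

/* The probe handler never sleeps; on a miss it simply gives up. */
static void probe_handler(struct probe_state *s, const struct sim_page *p)
{
    int v;

    assert(s && p);
    s->hits++;
    if (try_read_nofault(p, &v))
        s->last_value = v;
    else
        s->misses++;
}

/* Stand-in for the normal page management path (allowed to sleep). */
static void normal_page_in(struct sim_page *p)
{
    p->present = true;
}

/*
 * Simulate one probed instruction that touches the page: the probe fires,
 * the instruction faults if the page is absent, normal paging brings the
 * page in, and the restarted instruction hits the same probe again.
 */
static void execute_probed_instruction(struct probe_state *s,
                                       struct sim_page *p)
{
    probe_handler(s, p);
    if (!p->present) {
        normal_page_in(p);
        probe_handler(s, p);  /* second firing now finds the data */
    }
}
```

In this model a page that starts out absent costs one missed firing followed
by one successful firing, which is the same "two break-ins" cost as the
debugger scheme but without any extra machinery.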

So, the only saving we could make is if we can identify that the probe is in
a code path that does not own any kernel locks: user-space and some of
those "no man's land" areas of kernel space. Under these circumstances we
might be able to effect a synchronous page-in, but that depends very much
on the design of the Linux page/swap manager. We might have to make a
significant modification to that component to support doing this. We might
also have to modify some kernel locks - at least those which, when held,
would prohibit recursion into page/swap mgmt - to allow them to be
identified as being currently owned.
It's possible, but is it practicable?
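
As a rough illustration of what "identifying locks as currently owned" could
look like, here is a small user-space sketch in C. Every name in it
(task_ctx, tracked_lock(), tracked_unlock(), may_page_in_synchronously()) is
hypothetical and does not correspond to any existing kernel interface; it
only models the decision, assuming each task counts the paging-unsafe locks
it holds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task bookkeeping for locks that forbid page-in. */
struct task_ctx {
    int tracked_locks_held;  /* paging-unsafe locks currently owned */
};

/* Where the probepoint was taken. */
enum probe_origin { PROBE_USER, PROBE_KERNEL };

/* Wrappers a modified lock implementation would call (assumption). */
static void tracked_lock(struct task_ctx *t)
{
    t->tracked_locks_held++;
}

static void tracked_unlock(struct task_ctx *t)
{
    assert(t->tracked_locks_held > 0);  /* unlock must pair with a lock */
    t->tracked_locks_held--;
}

/*
 * Decide whether a probe handler could attempt a synchronous page-in:
 * a user-space probepoint did not interrupt kernel processing, and a
 * kernel probepoint qualifies only if no tracked lock is owned.
 */
static bool may_page_in_synchronously(const struct task_ctx *t,
                                      enum probe_origin origin)
{
    if (origin == PROBE_USER)
        return true;
    return t->tracked_locks_held == 0;
}
```

The check itself is trivial; the open cost is the one raised above, namely
changing the relevant locks so that ownership is recorded at all.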

Richard


>
> Richard
>
> .
> .
>
>
> > --
> > Prasanna S Panchamukhi
> > Linux Technology Center
> > India Software Labs, IBM Bangalore
> > Email: prasanna@in.ibm.com
> > Ph: 91-80-51776329
>

