This is the mail archive of the
mailing list for the systemtap project.
Re: Questions about kprobes implementation on SMP and preemptible kernels
- From: "Abhishek Sagar" <sagar dot abhishek at gmail dot com>
- To: "Quentin Barnes" <qbarnes at urbana dot css dot mot dot com>
- Cc: systemtap at sources dot redhat dot com, dwilder at us dot ibm dot com, grundym at us dot ibm dot com, ananth at in dot ibm dot com
- Date: Thu, 18 Jan 2007 23:01:45 +0530
- Subject: Re: Questions about kprobes implementation on SMP and preemptible kernels
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=angf2NejXpT/TekU2Y38P6ufj1P6zKiL8fn8w8LhGBSZjOBWf+OGwXk9yolyPd+LFs2v3g5BWr9MsTgFmVB7fRCIGygS0aRdAqUtD2SdA8CPsEHvvgmw7rLGg1P4li6nCcTvNHVi5i+jUlHnJVSkkBzZVxShDScnStVOYfTL7Pg=
- References: <20070116034826.GA22002@urbana.css.mot.com> <20070116053955.GA14386@in.ibm.com> <20070118042921.GA26784@urbana.css.mot.com>
It seems to me that the preempt_disable()/preempt_enable() calls in
kprobe_exceptions_notify() are either at best inert, or possibly
hiding an existing bug.
The decision to sandwich kprobe_exceptions_notify between
preempt_disable()/preempt_enable() is to 'explicitly' disable
preemption. Since interrupts are already disabled, and there is no way
of being preempted, you're spot on in pointing out that they're not
really needed. But it seems more like a matter of convention to use
explicit preemption locks around preemption sensitive regions. So
yes...their use here is kind of inert.
Quoting from Documentation/preempt-locking.txt:
"...But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any spin_unlock() decreasing the preemption count
to 0 might trigger a reschedule. A simple printk() might trigger a reschedule.
So use this implicit preemption-disabling property only if you know that the
affected codepath does not do any of this. Best policy is to use this only for
small, atomic code that you wrote and which calls no complex functions."
It goes on to add:
"It is possible to prevent a preemption event using local_irq_disable and
local_irq_save. Note, when doing so, you must be very careful to not cause
an event that would set need_resched and result in a preemption check. When
in doubt, rely on locking or explicit preemption disabling."
Seems like even though there are sections in the kernel which now seem
like non-preemptible (because they run with interrupts disabled), they
can be turned into preemptible sections in the future. The sections
which explicitly want to run in a preemption safe environment must
take proper preemption locks to safeguard themselves permanently.
kprobe_exceptions_notify is such a function because of its use of
per-cpu variables. So if in the future, the ARM undef exception
handling code becomes preemptibe, the per-cpu variable dependent
code-path in it (like kprobe_handler) would remain safe even then.
The confusion which now arises, is that the quote suggests that even
with local_irq_disable'd the current process can be rescheduled. But
having a look at preempt_schedule(), it shows the following check
right at the beginning:
if (likely(ti->preempt_count || irqs_disabled())) /* --->
notice the 'likely' optimization */
It doesn't seem like preemption can be triggered in scenario quoted
from preempt-locking.txt ...can somebody explain this?
I'm wondering if I found a bug on the ARM implementation of prefetch
and data abort exception handlers for SMP platforms with kernel
preemption enabled. Immediately after switching to SVC mode in
__pabt_svc and __dabt_svc, the handler re-enables IRQs (interrupts)
if they were enabled prior to the exception. If it re-enables
interrupts at this point, it seems to me that a preemptive kernel
(CONFIG_PREEMPT defined) could switch execution to another processor
breaking the chain of execution before it has a chance to note which
processor triggered the exception. Any ARM Linux kernel experts on
this list to comment, or do I need to bounce this to another list?
In the __irq_svc path (IRQ interrupt handling), the handler
increments the preempt count before reenabling IRQs (and later
restores its previous value after servicing the interrupt) when
built for a preemptive kernel.
The __irq_svc_ path explicitly disables preemption because
rescheduling is not an option in hard irq context. The abort mode
handling code-path however seems to have been opened up for preemption
(upon entry into SVC mode). In case of prefetch abort, the processor
sensitive information has already been gathered in registers r0 and r1
before which interrupts are enabled. So preempting them seems safe.
Similarly, in __dabt_svc, the processor specific data gathering
routines are called before interrupts are enabled. So preemption seems
safe there as well. I've just skimmed through that part of
entry-armv.S, so kindly correct me if I've missed something apparent.