This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: kprobe fault handling
On Thu, 2006-02-09 at 13:35 -0800, Jim Keniston wrote:
> > + /*
> > + * In case the user-specified fault handler returned zero,
> > + * try to fix up.
> > + */
> > +
> > + if (fixup_exception(regs))
> > + return 1;
>
> I think it's OK to call fixup_exception() here, but I believe it's
> redundant. I understood Suparna to say
> (http://sourceware.org/ml/systemtap/2006-q1/msg00423.html) that if we
> return 0, do_page_fault() will call fixup_exception() instead of trying
> to bring in the missing page (since it's a kernel instruction -- in a
> handler -- that faulted). Her explanation made sense to me.
But experimentally, things don't work the way they should.
I see lots of these:
Feb 2 23:15:53 monkey2 kernel: Debug: sleeping function called from invalid context at mm/page_alloc.c:618
Feb 2 23:15:53 monkey2 kernel: in_atomic():0[expected: 0], irqs_disabled():1
Feb 2 23:15:53 monkey2 kernel: [<c011df50>] __might_sleep+0x7d/0x89
Feb 2 23:15:53 monkey2 kernel: [<c014b802>] __alloc_pages+0x3a/0x2f7
Feb 2 23:15:53 monkey2 kernel: [<c0157a48>] do_no_page+0x55/0x3bf
Feb 2 23:15:53 monkey2 kernel: [<c011a19e>] pte_alloc_one+0x18/0x49
Feb 2 23:15:53 monkey2 kernel: [<c015553d>] pte_alloc_map+0x66/0x12d
Feb 2 23:15:53 monkey2 kernel: [<c0157f6d>] handle_mm_fault+0xb0/0x1fd
Feb 2 23:15:53 monkey2 kernel: [<c011a8ed>] do_page_fault+0x1ac/0x4dc
Feb 2 23:15:53 monkey2 kernel: [<c02ab2c1>] sock_aio_write+0x106/0x113
Feb 2 23:15:53 monkey2 kernel: [<c0119263>] kprobe_exceptions_notify+0xc6/0x123
Feb 2 23:15:53 monkey2 kernel: [<c011a741>] do_page_fault+0x0/0x4dc
Feb 2 23:15:53 monkey2 kernel: [<c030fa4f>] error_code+0x2f/0x38
Feb 2 23:15:53 monkey2 kernel: [<c01e6028>] __copy_from_user_ll+0x30/0x48
Feb 2 23:15:53 monkey2 kernel: [<e0b9dac7>] _stp_copy_from_user+0x2d/0x4f [copy]
Feb 2 23:15:53 monkey2 kernel: [<c0168211>] sys_read+0x0/0x62
Feb 2 23:15:53 monkey2 kernel: [<e0b9dc40>] inst_sys_read+0x15/0x45 [copy]
Feb 2 23:15:53 monkey2 kernel: [<c0119020>] kprobe_handler+0x1f0/0x230
Feb 2 23:15:53 monkey2 kernel: [<c01191ce>] kprobe_exceptions_notify+0x31/0x123
Feb 2 23:15:53 monkey2 kernel: [<c0130c59>] notifier_call_chain+0x17/0x2e
Feb 2 23:15:53 monkey2 kernel: [<c01076f7>] do_int3+0x3d/0xcf
Feb 2 23:15:53 monkey2 kernel: [<c0143260>] audit_syscall_entry+0x124/0x13d
Feb 2 23:15:53 monkey2 kernel: [<c030fbaf>] int3+0x1f/0x30
Feb 2 23:15:53 monkey2 kernel: [<c0168212>] sys_read+0x1/0x62
Feb 2 23:15:53 monkey2 kernel: [<c030f8cb>] syscall_call+0x7/0xb
Feb 2 23:15:53 monkey2 kernel: [<c030007b>] xfrm_policy_gc_kill+0x39/0x68
The above only happens on non-SMP machines. On SMP, I usually get
crashes instead. Adding the fixup_exception() call got rid of both the
messages and the crashes for me.
That's as far as I have investigated.
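
For anyone following along, here is a rough user-space model of the control
flow the patch is after: try the user-specified fault handler first, and if
it returns zero, look the faulting address up in an exception table and
resume at the fixup address. This is only a sketch, not kernel code. Every
name prefixed with model_ is hypothetical, the table contents are made up,
and the real fixup_exception() works on struct pt_regs and the kernel's own
exception tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved registers; only the IP matters here. */
struct model_regs {
    unsigned long ip;
};

/* Simplified exception table entry: faulting insn -> resume address. */
struct model_extable_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Made-up table contents, for illustration only. */
static const struct model_extable_entry model_extable[] = {
    { 0x1000UL, 0x2000UL },
    { 0x1010UL, 0x2010UL },
};

/*
 * Models the idea behind fixup_exception(): if the faulting address has
 * an entry, redirect execution to the fixup address and return 1.
 * Otherwise leave the registers alone and return 0.
 */
int model_fixup_exception(struct model_regs *regs)
{
    size_t i;

    assert(regs != NULL);
    for (i = 0; i < sizeof(model_extable) / sizeof(model_extable[0]); i++) {
        if (model_extable[i].insn == regs->ip) {
            regs->ip = model_extable[i].fixup;
            return 1;
        }
    }
    return 0;
}

typedef int (*model_fault_handler_t)(struct model_regs *regs);

/*
 * Models the patched fault path: the user-specified handler goes first;
 * if it is absent or returns zero, try the fixup. A return of 1 means
 * "handled, do not run the normal page fault path"; 0 means fall through.
 */
int model_kprobe_fault(model_fault_handler_t user_handler,
                       struct model_regs *regs)
{
    if (user_handler != NULL && user_handler(regs))
        return 1;
    if (model_fixup_exception(regs))
        return 1;
    return 0;
}
```

In this model, a fault at an address listed in the table returns 1 with the
IP rewritten to the fixup address, so the caller never reaches the normal
page fault path. That path is where the __alloc_pages() call in the trace
above comes from.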