This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: notify_page_fault() problem


On Tue, May 01, 2007 at 04:57:00AM +0200, Andi Kleen wrote:
However, vmalloc_sync_all() is i386 and x86_64 specific, as is
their change to register_page_fault_notifier(). I don't see
other platforms doing anything special in their
register_page_fault_notifier().

They probably just haven't tested this particular case yet.

I got it to happen most often when running the syscall.exp test, but even then it was very intermittent. I'm guessing it depends on what else got placed on the same page as the kprobe/kretprobe data structure (so the page would occasionally be coincidentally mapped already and things would work), on whether the system is running preempt-enabled, and on whether it is busy enough for another page fault to occur before the kprobes data structure has had its own translation fault. If the system is quiescent, this bug is not going to show up either. I'm currently running my lowly 64MB ARM board with network boot _and_ swap drives, so a lot of system pounding is going on nearly all the time.

x86 also did it originally to handle NMI notifiers, which is an
x86 special (a nested page fault in an NMI can lead to stack corruption
because NMIs are only blocked until the next IRET)

Ah, ok.


I have trouble believing that x86
and ARM are unique somehow with needing to address this problem.
Why doesn't anyone else hit this?  Is it a lurking problem or are
there other fixes in other forms out there?

The standard kprobes notifier is not modular so it won't hit this.

Are you saying the code in arch/*/kernel/kprobes.c and kernel/kprobes.c is not built as a module, so it won't hit this problem?

That doesn't matter.  Just having the kprobes and kretprobes data
structures in module memory is all that matters.  What happens is
that get_kprobe() and aggr_pre_handler() walk the kprobe_table[]
list and stumble across a kprobe or kretprobe data structure
referenced in the table that's not mapped in hardware yet.  That's
what's generating the recursive faults I was seeing.
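
To make the recursion concrete, here is a rough user-space model of it. This is only a sketch, not kernel code: every name with a sim_ prefix is made up for illustration, the "mapped" flag stands in for a hardware translation that has not been faulted in yet, and SIM_MAX_DEPTH is an arbitrary cap so the model terminates. It assumes, as described above, that the fault path calls the notifier before it installs the missing translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_DEPTH 8          /* arbitrary cap so the model terminates */

/* Hypothetical stand-in for a kprobe living in module memory. */
struct sim_probe {
    unsigned long addr;          /* probed address */
    bool mapped;                 /* false: translation not faulted in yet */
    struct sim_probe *next;      /* hash-chain link */
};

static struct sim_probe *sim_table;  /* one hash bucket, for brevity */
static int sim_depth;                /* current fault nesting */
static int sim_max_depth;            /* deepest nesting observed */

static struct sim_probe *sim_get_probe(unsigned long addr);

/* Models the fault path: notifier first, translation installed afterwards. */
static void sim_page_fault(struct sim_probe *p)
{
    sim_depth++;
    if (sim_depth > sim_max_depth)
        sim_max_depth = sim_depth;
    if (sim_depth < SIM_MAX_DEPTH)
        (void)sim_get_probe(0);  /* notifier walks the table again */
    p->mapped = true;            /* only now is the entry reachable */
    sim_depth--;
}

/* Models the table walk: touching an unmapped entry takes a fault. */
static struct sim_probe *sim_get_probe(unsigned long addr)
{
    struct sim_probe *p;

    for (p = sim_table; p != NULL; p = p->next) {
        if (!p->mapped)
            sim_page_fault(p);
        if (p->addr == addr)
            return p;
    }
    return NULL;
}

/* Models syncing all mappings up front, before any notifier can run. */
static void sim_sync_all(void)
{
    struct sim_probe *p;

    for (p = sim_table; p != NULL; p = p->next)
        p->mapped = true;
}

/* Runs one lookup and returns the deepest fault nesting it caused. */
static int sim_run(bool sync_first)
{
    static struct sim_probe probe;
    struct sim_probe *found;

    probe.addr = 0x1000;
    probe.mapped = false;
    probe.next = NULL;
    sim_table = &probe;
    sim_depth = 0;
    sim_max_depth = 0;

    if (sync_first)
        sim_sync_all();
    found = sim_get_probe(0x1000);
    assert(found == &probe);
    return sim_max_depth;
}
```

In this model sim_run(false) nests faults all the way to the cap, while sim_run(true) never faults at all, which is the effect a vmalloc_sync_all()-style call at registration time is meant to have.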

I guess part of the answer has to do with what people's expectations
are for intercepting faults with their kprobes fault handler though.

Yes, some have pretty broad expectations. It might be possible to move it to a kernel-address-only path, but then some debuggers seem to want to debug user mode too.

It would be nice to move those debugger hooks out of there and have the debuggers use kprobes so their needs don't negatively impact the system as a whole even when they're not in use.

But you're right there has been grumbling about the overhead
of the notifier call in the hot path.

I wasn't aware of any grumbling. I'm not on the main kernel mailing list. I was just disappointed to see that the fault handlers are being notified for every single fault in the system, user or kernel space. While tracking down this bug, my debug logs were huge just from having a userland app fault in some shared libraries. Just a few instructions in a system's fault handler path can have noticeable performance repercussions.
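
As a rough illustration of the kernel-address-only idea mentioned earlier, the user-space sketch below counts how often a notifier would run with and without an address check in front of it. All the sim_ names are hypothetical, and SIM_KERNEL_BASE assumes a 3G/1G user/kernel split purely for the sake of the example; real architectures differ. It also leaves out the debuggers that want user-mode faults, which such a filter would cut off.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed user/kernel boundary (3G/1G split); illustrative only. */
#define SIM_KERNEL_BASE 0xC0000000UL

static unsigned long sim_notify_calls;   /* how many times the notifier ran */

/* Hypothetical notifier: here it only counts invocations. */
static void sim_notify(unsigned long fault_addr)
{
    (void)fault_addr;
    sim_notify_calls++;
}

/* Models the fault entry, optionally skipping user-space addresses. */
static void sim_fault(unsigned long fault_addr, bool kernel_only)
{
    if (!kernel_only || fault_addr >= SIM_KERNEL_BASE)
        sim_notify(fault_addr);
}

/* Replays a list of faulting addresses and returns the notifier call count. */
static unsigned long sim_count_notifies(const unsigned long *addrs, size_t n,
                                        bool kernel_only)
{
    size_t i;

    sim_notify_calls = 0;
    for (i = 0; i < n; i++)
        sim_fault(addrs[i], kernel_only);
    return sim_notify_calls;
}
```

With two user-space addresses and one kernel address in the list, the unfiltered path calls the notifier three times and the filtered path calls it once.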

-Andi

Quentin


