This is the mail archive of the mailing list for the systemtap project.


function("*") probing

Hi -

Thanks to a well-working kexec/kdump setup on fc5, fc6, and rhel5,
I've made some progress in working out the reasons for the crashes we
encounter when indiscriminately probing kernel functions.  Often, the
stack traces include multiple nested faults, sometimes bottoming out
on some random error, sometimes on a hung lock.

One problem is an old bugaboo: reentrancy.  It turns out that many of
the locking primitives we sparingly use, which ideally should be
inlined, in fact turn into function calls.  The main bunch of problems
occurs when the locking-related kernel functions (_read_lock and many
pals) are themselves probed.  Putting these into the translator
blacklist makes a big difference.  I'm working on characterizing the
callees of increasingly complex probes, and am considering
blacklisting many of them.  I was under the impression that kprobes
tries to detect & prevent such reentrancy but perhaps that too needs
some work.
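
To make the blacklist idea concrete, here is a minimal sketch of filtering
a wildcard expansion against a list of blacklisted names.  It is not the
translator's actual code: the helper names and every pattern other than
_read_lock are assumptions made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical blacklist. Only _read_lock is named in the message above;
# the other patterns are illustrative assumptions about its "pals".
BLACKLIST_PATTERNS = ["_read_lock", "_write_lock", "_spin_lock*"]


def is_blacklisted(name, patterns=BLACKLIST_PATTERNS):
    """Return True if the function name matches any blacklist pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def expand_probe_targets(candidates, patterns=BLACKLIST_PATTERNS):
    """Keep only the candidate functions that are not blacklisted."""
    return [name for name in candidates if not is_blacklisted(name, patterns)]


# Stand-in for the functions a function("*") wildcard might resolve to.
candidates = ["sys_open", "_read_lock", "_spin_lock_irqsave", "vfs_read"]
targets = expand_probe_targets(candidates)
```

The point of filtering at expansion time is that a blacklisted function
never gets a probe at all, so a handler that takes a lock cannot re-enter
itself through that lock's own probe.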

Another problem is our lack of self-throttling.  It is related to bug
#2685 ("skip probes on insufficient stack") but is more like the Linux
nmi-watchdog looking at /proc/interrupts, and like what dtrace has: a
way of detecting excessive probing load, with consequent probe skipping
or outright session shutdown.  I just created bug #3545 for this part.
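
As a rough sketch of what such self-throttling could look like, assuming
a simple per-interval hit budget and a cap on skipped probes (the class,
its parameters, and the thresholds are all hypothetical, not anything
systemtap or dtrace implements):

```python
class ProbeThrottle:
    """Hypothetical load limiter: skip probes over budget, then shut down."""

    def __init__(self, max_hits_per_interval, max_skipped):
        self.max_hits = max_hits_per_interval  # handler runs allowed per interval
        self.max_skipped = max_skipped         # skips tolerated before shutdown
        self.hits = 0
        self.skipped = 0
        self.shut_down = False

    def new_interval(self):
        """Reset the hit budget, e.g. from a periodic timer."""
        self.hits = 0

    def should_run(self):
        """Decide whether the probe handler may run for this hit."""
        if self.shut_down:
            return False
        if self.hits >= self.max_hits:
            # Over budget: skip this probe and count the skip.
            self.skipped += 1
            if self.skipped > self.max_skipped:
                # Too many skips: give up on the whole session.
                self.shut_down = True
            return False
        self.hits += 1
        return True


throttle = ProbeThrottle(max_hits_per_interval=3, max_skipped=2)
results = [throttle.should_run() for _ in range(7)]
```

With these assumed thresholds the first three hits run, the rest are
skipped, and the third skip trips the session shutdown.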

- FChE
