This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC] Systemtap translator support for hardware breakpoints on
> If my understanding is correct, this is a suggestion that demands an
> 'overcommit' feature (ability to accept requests more than the available
> debug registers) in hw-breakpoints, right?
Actually, that is not at all what I was talking about. It's an
interesting subject to consider (I was the one who originally
proposed such complexity for hw_breakpoint at its inception).
But all I was talking about here is what a stap module can do when an
"allocate it now and forever" call fails. (This might be at script
startup time, or might be later during a module's lifetime if the
putative script-driven dynamic registration feature were used.)
> In its new form (post perf-events integration), hw-breakpoints can
> indeed accept new requests that far exceed the number of underlying
> debug registers. This can be achieved by making an 'un-pinned' breakpoint
> request, where every such request gets a chance to use the debug
> register in a round-robin fashion (all this is provided by perf-events
> infrastructure anyway).
I don't understand what "round-robin" means for breakpoints. When
you let the kernel execute normally again after registration, either
my breakpoint is enabled or it isn't. There is no meaningful sense
in which you can "time-share" a hardware breakpoint slot. (That is,
except for doing per-task registrations, which is a different
semantics entirely.)
> Presently, the breakpoint infrastructure does not provide callbacks that
> can be invoked whenever an 'un-pinned' breakpoint request is
> scheduled-in/out (analogous to .enabled and .disabled). We could pursue
> to get support for the same (of course, that would require a good
> in-kernel user to convince the community!).
I am having trouble imagining what any kind of "scheduled-in/out"
could possibly be useful for if you don't notify me about
whether or not my breakpoint is in place. A "best effort"
breakpoint, that might be caught and might not be and who knows
whether it's really installed, is just not useful. I must be
missing the essence of what you mean.
The thing I had talked about before was each hw_breakpoint
registration having a priority number. When another registration
comes along with a higher priority, it can boot yours and make a
callback so you know it's been stolen. Conversely, when a competing
registration goes away, the highest-priority registration-in-waiting
gets installed and gets a callback to tell you it's now active. (I
guess really there could just be a single notifier list that gets
called when a slot becomes free, so the suitors can try again to see
whose priority wins. Whatever.) But you've said this is not what
it does now.
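
As a rough illustration only, the priority scheme described above could be
modeled like this. It is a toy in Python, not the hw_breakpoint API:
SlotArbiter, Registration, and the on_installed/on_stolen callback names are
all made up for the sketch, and it assumes a booted registration simply
rejoins the waiting list.

```python
# Toy model of priority-based breakpoint slots. All names are hypothetical.

class Registration:
    def __init__(self, name, priority, on_installed, on_stolen):
        self.name = name
        self.priority = priority
        self.on_installed = on_installed  # called when the slot becomes ours
        self.on_stolen = on_stolen        # called when we get booted


class SlotArbiter:
    def __init__(self, nslots):
        self.nslots = nslots   # number of debug registers being modeled
        self.active = []       # registrations currently installed
        self.waiting = []      # registrations-in-waiting

    def _install(self, reg):
        self.active.append(reg)
        reg.on_installed(reg.name)

    def register(self, reg):
        if len(self.active) < self.nslots:
            self._install(reg)
            return
        victim = min(self.active, key=lambda r: r.priority)
        if victim.priority < reg.priority:
            # A higher priority boots the weakest holder and tells it so.
            self.active.remove(victim)
            self.waiting.append(victim)
            victim.on_stolen(victim.name)
            self._install(reg)
        else:
            self.waiting.append(reg)

    def unregister(self, reg):
        if reg in self.waiting:
            self.waiting.remove(reg)
            return
        self.active.remove(reg)
        if self.waiting:
            # The highest-priority suitor gets the freed slot.
            winner = max(self.waiting, key=lambda r: r.priority)
            self.waiting.remove(winner)
            self._install(winner)


events = []
installed = lambda name: events.append(("installed", name))
stolen = lambda name: events.append(("stolen", name))

arb = SlotArbiter(nslots=1)
low = Registration("low", 1, installed, stolen)
high = Registration("high", 5, installed, stolen)
arb.register(low)     # gets the only slot
arb.register(high)    # boots "low", which is notified
arb.unregister(high)  # "low" is reinstalled and notified
```

The point of the model is that the holder always knows whether its
breakpoint is in place, because every change of ownership comes with a
callback.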
Anyway, that kind of dynamism is not what I was talking about here.
If we had it, the script-language features might look rather similar
and so work the way you describe: the .enabled and
.disabled probes would fire due to an hw_breakpoint layer callback that is
"spontaneous" in the stap module's eyes, i.e. driven by the ebb and
flow of external demand on the scarce shared resource.
In the examples I gave, the .unavailable sub-probe in the simplest
form is just a translation-time (tapset sugar) way to do some implicit
script-level stuff at 'probe begin' time, but conditional on whether
the associated hw_breakpoint registration at startup succeeded.
In the example with dynamic (i.e. runtime script-controlled)
registration, the .begin and .end sub-probes are just tapset sugar for
some script-level stuff to do implicitly when script code uses "enable
watch_foo" and "disable watch_foo". For that use, you might have a
.unavailable sub-probe that runs when "enable" fails, or you might
just have "enable" return a testable success/failure to the script
code or whatnot.
Given all that, I can imagine wanting the tie-in for some kind of
"breakpoint scheduling" to be that "enable" has "enable-now-or-fail"
and "enable-now-or-later" variants. Then in the "now or later"
variant, those same .begin and .end probes might get run during the
lifetime of a script (due to an hw_breakpoint layer callback) rather
than only at "enable foo" and "disable foo" time.
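
To make the two variants concrete, here is a toy model in Python. It is only
a sketch under stated assumptions: BreakpointPool, Watch, and the method names
are hypothetical and are not SystemTap syntax or kernel API. The begin() and
end() methods stand in for the .begin and .end sub-probes, and the pool models
only a slot being freed by another "disable", not a slot being stolen.

```python
# Toy model of "enable-now-or-fail" vs "enable-now-or-later".
# All names are hypothetical.

class Watch:
    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.active = False

    def begin(self):  # stands in for the .begin sub-probe
        self.active = True
        self.log.append(self.name + ".begin")

    def end(self):    # stands in for the .end sub-probe
        self.active = False
        self.log.append(self.name + ".end")


class BreakpointPool:
    def __init__(self, nslots):
        self.free = nslots
        self.pending = []  # watches queued by enable_now_or_later

    def enable_now_or_fail(self, watch):
        # Testable success/failure; .begin runs only on success.
        if self.free == 0:
            return False
        self.free -= 1
        watch.begin()
        return True

    def enable_now_or_later(self, watch):
        # Returns whether the watch is active right now; if it is not,
        # .begin runs later, when a slot is freed.
        if not self.enable_now_or_fail(watch):
            self.pending.append(watch)
        return watch.active

    def disable(self, watch):
        if watch in self.pending:
            self.pending.remove(watch)
            return
        if not watch.active:
            return
        watch.end()
        self.free += 1
        if self.pending:
            # The freed slot goes to the oldest queued watch.
            self.enable_now_or_fail(self.pending.pop(0))


log = []
pool = BreakpointPool(nslots=1)
foo = Watch("watch_foo", log)
bar = Watch("watch_bar", log)

got_foo = pool.enable_now_or_fail(foo)       # slot is free
got_bar_now = pool.enable_now_or_fail(bar)   # no slot left, fails
bar_active = pool.enable_now_or_later(bar)   # queued instead
pool.disable(foo)                            # watch_bar.begin runs here
```

In the "now or later" case, watch_bar.begin runs at a time the script did not
choose, which is exactly the case where the sub-probes fire during the
script's lifetime rather than at "enable" time.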
Thanks,
Roland