This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC V2] Systemtap translator support for kernel hardware breakpoints
Prerna Saxena <prerna@linux.vnet.ibm.com> writes:
> [...] The basic constructs intorduced earlier still stand :
>
> probe kernel.data(ADDRESS).write
> probe kernel.data(ADDRESS).rw
> probe kernel.data(ADDRESS).length(LEN).write
> probe kernel.data(ADDRESS).length(LEN).rw
> probe kernel.data("SYMBOL_NAME").write
> probe kernel.data("SYMBOL_NAME").rw
OK.
> The patch is based on the mainline kernel. The above constructs do not
> depend on dwarf, and hence can be used even on kernels lacking
> debuginfo.
OK. The translator can detect the absence or presence of this
functionality, and should preclude translation without it. Use
systemtap_session.kernel_config["CONFIG_HAVE_HW_BREAKPOINTS"]
> 2. The script translation fails if the number of hardware breakpoint
> probes exceeds the number of debug registers for a given architecture.
OK, though I suggest making this part a warning only. This code has
no way to know what the actual limits are at the moment of execution.
They could easily be lower, and if perf etc. get some sort of magical
multiplexing, the limits could be higher. The runtime should (but
doesn't yet - see below) handle errors such as overreservations
correctly.
> 3. For now, probing of a kernel function's local variables is not
> possible. Also, with export of kallsyms_lookup_name(), systemtap
> generated module can directly decode symbol address at run-time. IMO,
> use of dwarf routines seems an overkill for now.
OK, assuming that this re-exported function stays re-exported, there
may be no need to do an autoconf-level check for it, if the code
already checks for CONFIG_HAVE_HW_BREAKPOINTS.
> [...]
> 3. Dynamic enabling / disabling a hardware breakpoint handler from a
> kprobe context. (This will in turn allow one to set breakpoints on
> local variables of kernel functions)
We hope to address this general area more comprehensively for systemtap-1.2.
> +void hwbkpt_derived_probe::join_group (systemtap_session& s)
> +{
> + if (! s.hwbkpt_derived_probes) {
> + s.hwbkpt_derived_probes = new hwbkpt_derived_probe_group ();
> + if (s.architecture == "i386" || s.architecture == "x86_64" )
> + s.hwbkpt_derived_probes->max_hwbkpt_probes_by_arch = 4;
> + else {
> + s.hwbkpt_derived_probes->max_hwbkpt_probes_by_arch = 0;
> + throw semantic_error ("Hardware Breakpoints not supported on arch " + s.architecture);
> + }
> + }
> + s.hwbkpt_derived_probes->enroll (this, s);
> +}
I suggest removing this error - or turning it into a (conditional) warning.
> + // Warn of misconfigured kernels
> + s.op->newline() << "#ifndef CONFIG_HAVE_HW_BREAKPOINT";
> + s.op->newline() << "#error \"Need CONFIG_HAVE_HW_BREAKPOINT!\"";
> + s.op->newline() << "#endif";
If the code also checks for the kernel_config[] above, this check
should never fail.
> +// Define macros for access type
> + s.op->newline() << "#define HWBKPT_READ 0";
> + s.op->newline() << "#define HWBKPT_WRITE 1";
> + s.op->newline() << "#define HWBKPT_RW 2";
Why, instead of just using the symbols already defined in the
kernel headers?
> + size_t pp_name_max = 0 , symbol_name_max = 0;
> + size_t pp_name_tot = 0 , symbol_name_tot = 0;
> + for (unsigned int it = 0; it < hwbkpt_probes_vector.size(); it++)
> + {
> + hwbkpt_derived_probe* p = hwbkpt_probes_vector.at(it);
> +#define DOIT(var,expr) do { \
> + size_t var##_size = (expr) + 1; \
> + var##_max = max (var##_max, var##_size); \
> + var##_tot += var##_size; } while (0)
> + DOIT(pp_name, lex_cast_qstring(*p->sole_location()).size());
> + DOIT(symbol_name, p->symbol_name.size());
> +#undef DOIT
> + }
> +
> +#define CALCIT(var) \
> + s.op->newline() << "const char " << #var << "[" << var##_name_max << "] ;";
> + CALCIT(pp);
> + CALCIT(symbol);
> +#undef CALCIT
Considering that we're likely to have a small handful of such probes
in a script, this is all unnecessary optimization. Just plop a char*
in there.
> + s.op->newline() << "#ifdef CONFIG_X86";
> + s.op->newline() << " switch(sdp->len) {";
> + s.op->newline() << " case 1:";
> + s.op->newline() << " hp->bp_len = HW_BREAKPOINT_LEN_1;";
> + s.op->newline() << " break;";
> + s.op->newline() << " case 2:";
> + s.op->newline() << " hp->bp_len = HW_BREAKPOINT_LEN_2;";
> + s.op->newline() << " break;";
> + s.op->newline() << " case 3:";
> + s.op->newline() << " case 4:";
> + s.op->newline() << " hp->bp_len = HW_BREAKPOINT_LEN_4;";
> + s.op->newline() << " break;";
> + s.op->newline() << "#ifdef CONFIG_X86_64";
> + s.op->newline() << " case 5:";
> + s.op->newline() << " case 6:";
> + s.op->newline() << " case 7:";
> + s.op->newline() << " case 8:";
> + s.op->newline() << " hp->bp_len = HW_BREAKPOINT_LEN_8;";
> + s.op->newline() << " break;";
> + s.op->newline() << "#endif /*CONFIG_X86_64*/";
Are you sure this CONFIG_arch* stuff is needed? Let the kernel
try 8-byte watchpoints, and let us handle the error.
> + s.op->newline() << "if ( !IS_ERR(stap_hwbkpt_ret_array[i]) ) {";
> + s.op->newline() << " stap_hwbkpt_ret_array[i] = register_wide_hw_breakpoint( hp, (void *)&enter_hwbkpt_probe );";
> + s.op->newline() << " if (IS_ERR(stap_hwbkpt_ret_array[i])) {";
> + s.op->newline() << " rc = PTR_ERR(stap_hwbkpt_ret_array[i]);";
> + s.op->newline() << " sdp->registered_p = 0;";
> + s.op->newline(1) << " _stp_warn(\" Hwbkpt Probe %s : Registration error rc= %d, Addr = %p, name = %s\", probe_point, rc, addr, hwbkpt_symbol_name);";
> + s.op->newline(-1) << " }";
> + s.op->newline() << " else sdp->registered_p = 1;";
> + s.op->newline() << "}";
> + s.op->newline(-1) << "}"; // for loop
I mentioned this in the previous review, but this is insufficient.
(Please review my earlier comments in case something else was missed.)
If the [i]th registration attempt fails, and you return an rc != 0 to
the wrapping code, this code right here must ensure that all
already-registered entries ([0..i-1]) have been unregistered.
> + s.op->newline() << " if ( IS_ERR(stap_hwbkpt_ret_array[i]) ) continue;";
> +// s.op->newline() << " if ( sdp->registered_p == 0 ) continue;";
Just use registered_p = 0 or 1. You don't need to archive the error
codes.
> + //Hwbkpt based probe
> + s.pattern_root->bind(TOK_KERNEL)->bind_num(TOK_HWBKPT)
> + ->bind(TOK_HWBKPT_WRITE)->bind(new hwbkpt_builder());
> + s.pattern_root->bind(TOK_KERNEL)->bind_str(TOK_HWBKPT)
> + ->bind(TOK_HWBKPT_WRITE)->bind(new hwbkpt_builder());
> + s.pattern_root->bind(TOK_KERNEL)->bind_num(TOK_HWBKPT)
> + ->bind(TOK_HWBKPT_RW)->bind(new hwbkpt_builder());
> + s.pattern_root->bind(TOK_KERNEL)->bind_str(TOK_HWBKPT)
> + ->bind(TOK_HWBKPT_RW)->bind(new hwbkpt_builder());
> + s.pattern_root->bind(TOK_KERNEL)->bind_num(TOK_HWBKPT)
> + ->bind_num(TOK_LENGTH)->bind(TOK_HWBKPT_WRITE)->bind(new hwbkpt_builder());
> + s.pattern_root->bind(TOK_KERNEL)->bind_num(TOK_HWBKPT)
> + ->bind_num(TOK_LENGTH)->bind(TOK_HWBKPT_RW)->bind(new hwbkpt_builder());
> + // length supported with address only, not symbol names
This is probably the easiest place to wrap with a kernel_config[...]
conditional to disable this probe point family on unsupporting kernels.
- FChE