This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC][PATCH 00/14] function_graph: Rewrite to allow multiple users
- From: Steven Rostedt <rostedt at goodmis dot org>
- To: Masami Hiramatsu <mhiramat at kernel dot org>
- Cc: linux-kernel at vger dot kernel dot org, Ingo Molnar <mingo at kernel dot org>, Andrew Morton <akpm at linux-foundation dot org>, Thomas Gleixner <tglx at linutronix dot de>, Peter Zijlstra <peterz at infradead dot org>, Josh Poimboeuf <jpoimboe at redhat dot com>, Frederic Weisbecker <frederic at kernel dot org>, Joel Fernandes <joel at joelfernandes dot org>, Andy Lutomirski <luto at kernel dot org>, Mark Rutland <mark dot rutland at arm dot com>, systemtap at sourceware dot org, Alexei Starovoitov <ast at kernel dot org>, Daniel Borkmann <daniel at iogearbox dot net>
- Date: Thu, 29 Nov 2018 22:24:35 -0500
- Subject: Re: [RFC][PATCH 00/14] function_graph: Rewrite to allow multiple users
- References: <20181122012708.491151844@goodmis.org> <20181126182112.422b914dd00ecb36e15f7b07@kernel.org> <20181126113215.4259d473@gandalf.local.home> <20181129232927.74ca5f294e97fc58b15bf8c6@kernel.org> <20181129114652.3696d6d7@gandalf.local.home> <20181130112658.337e92b79e243034973b6997@kernel.org>
On Fri, 30 Nov 2018 11:26:58 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:
> On Thu, 29 Nov 2018 11:46:52 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > On Thu, 29 Nov 2018 23:29:27 +0900
> > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > > > One way to solve this is to also have a counter array that gets updated
> > > > every time the index array gets updated. And save the counter to the
> > > > shadow stack index as well. This way, we only call the return if the
> > > > counter on the stack matches what's in the counter on the counter array
> > > > for the index.
> > >
> > > Hmm, but we already know the current stack "header" entry when calling
> > > handlers, don't we? I thought we just calculate it from curr_ret_stack.
> >
> > Basically we have this:
> >
> > array: | &fgraph_ops_1 | &fgraph_ops_2 | &fgraph_ops_stub | ...
> >
> > On entry of function we do:
> >
> push header (including original ret_addr) onto ret_stack
We can't put the ret_addr of the callback on the stack. What if that
ret_addr is in a module, and the module gets unloaded? We must not call it.
>
> > for (i = 0; i < array_entries; i++) {
> > if (array[i]->entryfunc(...)) {
> > push i onto ret_stack;
> > }
> > }
> >
> > On the return side, we do:
> >
> > idx = pop ret_stack;
> >
> > array[idx]->retfunc(...);
>
> So at this point we have the header on ret_stack, don't we? :)
>
> Anyway, I think we may provide an API for the unwinder to find the correct
> original return address from ret_stack. That is OK with me.
Yes. In fact, I have something that worked for that. I'll have to test
it some more.
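
A minimal userspace sketch of the scheme discussed above may help make it concrete. It is not the kernel implementation: it assumes a fixed array of ops slots, a per-slot counter that is bumped whenever a slot changes, and a shadow stack whose index entries record both the slot number and the counter seen at entry. The names fgraph_ops_model, set_slot, gen_array, function_entry, function_return and original_ret_addr are hypothetical, as is the tagged stack_entry layout.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS   4
#define STACK_MAX 64

/* Hypothetical stand-in for the fgraph_ops discussed in the thread. */
struct fgraph_ops_model {
	int  (*entryfunc)(unsigned long func); /* nonzero: call retfunc later */
	void (*retfunc)(unsigned long func);
};

enum entry_type { ENTRY_HEADER, ENTRY_INDEX };

struct stack_entry {
	enum entry_type type;
	unsigned long val; /* header: original ret_addr; index: slot number */
	unsigned long gen; /* index only: slot counter at push time */
};

static int  stub_entry(unsigned long func) { (void)func; return 0; }
static void stub_ret(unsigned long func)   { (void)func; }
static struct fgraph_ops_model stub_ops = { stub_entry, stub_ret };

/* Example user that simply counts its callbacks. */
static int entry_calls, ret_calls;
static int  count_entry(unsigned long func) { (void)func; entry_calls++; return 1; }
static void count_ret(unsigned long func)   { (void)func; ret_calls++; }
static struct fgraph_ops_model counting_ops = { count_entry, count_ret };

static struct fgraph_ops_model *ops_array[MAX_OPS] = {
	&stub_ops, &stub_ops, &stub_ops, &stub_ops
};
static unsigned long gen_array[MAX_OPS];
static struct stack_entry ret_stack[STACK_MAX];
static int curr_ret_stack;

/* Install ops in a slot (NULL installs the stub) and bump the slot counter. */
static void set_slot(int idx, struct fgraph_ops_model *ops)
{
	ops_array[idx] = ops ? ops : &stub_ops;
	gen_array[idx]++;
}

/* Push the header, then one (index, counter) entry per interested user. */
static int function_entry(unsigned long func, unsigned long ret_addr)
{
	if (curr_ret_stack + 1 + MAX_OPS > STACK_MAX)
		return -1; /* no room: this call is not traced */
	ret_stack[curr_ret_stack++] =
		(struct stack_entry){ ENTRY_HEADER, ret_addr, 0 };
	for (int i = 0; i < MAX_OPS; i++)
		if (ops_array[i]->entryfunc(func))
			ret_stack[curr_ret_stack++] = (struct stack_entry){
				ENTRY_INDEX, (unsigned long)i, gen_array[i] };
	return 0;
}

/* Pop index entries, skipping stale ones, then return the original ret_addr. */
static unsigned long function_return(unsigned long func)
{
	while (curr_ret_stack > 0 &&
	       ret_stack[curr_ret_stack - 1].type == ENTRY_INDEX) {
		struct stack_entry e = ret_stack[--curr_ret_stack];

		if (gen_array[e.val] == e.gen) /* slot unchanged since entry */
			ops_array[e.val]->retfunc(func);
	}
	assert(curr_ret_stack > 0);
	return ret_stack[--curr_ret_stack].val;
}

/* Unwinder helper: original ret_addr of the depth-th frame, 0 = innermost. */
static unsigned long original_ret_addr(int depth)
{
	for (int i = curr_ret_stack - 1; i >= 0; i--)
		if (ret_stack[i].type == ENTRY_HEADER && depth-- == 0)
			return ret_stack[i].val;
	return 0;
}
```

Because only a slot index and a counter are stored, nothing on the shadow stack points into a user's code. If a user goes away while a traced function is still in flight, the counter mismatch causes its return callback to be skipped, while the header still yields the original return address.
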
> > > I need only sizeof(unsigned long). If the kretprobe user requires more,
> > > it will fall back to the current method -- get an "instance" and store
> > > its address in the entry :-)
> >
> > Awesome, then this shouldn't be too hard to implement.
>
> Oops, anyway I noticed that I must store a value in each area so that we can
> identify which kretprobe is using it when there are several kretprobes on the same
> function. So, the kretprobe implementation will be something like below.
>
> kretprobe_retfunc(trace, regs)
> {
> 	struct kprobe *kp = get_kprobe(trace->func);
> 	struct kprobe *p;
> 	struct kretprobe *rp;
> 	struct kretprobe_instance *ri;
>
> 	// private is directly mapped to the current kprobe
> 	if (trace->private == to_kretprobe(kp)) {
> 		rp = to_kretprobe(kp);
> 		goto found_kretprobe;
> 	}
>
> 	// otherwise we need to search the multiple kretprobes on this function
> 	if (!list_empty(&kp->list)) {
> 		list_for_each_entry(p, &kp->list, list) {
> 			if (trace->private == to_kretprobe(p)) {
> 				rp = to_kretprobe(p);
> 				goto found_kretprobe;
> 			}
> 		}
> 	}
>
> 	// Or this must be an instance
> 	ri = trace->private;
> 	rp = ri->rp;
> 	if (valid_kretprobe(rp))
> 		rp->handler(ri, regs);
> 	kretprobe_recycle_instance(ri);
> 	goto out;
>
> found_kretprobe:
> 	{
> 		// on-stack instance, so nothing to recycle
> 		struct kretprobe_instance rii = {
> 			.rp = rp, .ret_addr = trace->ret, .task = current,
> 		};
> 		rp->handler(&rii, regs);
> 	}
>
> out:
> 	return 0;
> }
>
> I think we talked about pt_regs, which is redundant for a return probe, so it should
> be just a return value. (But how do we pass it? trace->retval?)
Yeah, we can add that.
> That is OK for ftrace (but the transition needs more code).
> And I would like to ask the eBPF and SystemTap people whether that is OK, since it will change
> the kernel ABI.
I agree.
-- Steve