This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [RFC][PATCH 00/14] function_graph: Rewrite to allow multiple users


On Thu, 29 Nov 2018 22:24:35 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 30 Nov 2018 11:26:58 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Thu, 29 Nov 2018 11:46:52 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Thu, 29 Nov 2018 23:29:27 +0900
> > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >   
> > > > > One way to solve this is to also have a counter array that gets updated
> > > > > every time the index array gets updated. And save the counter to the
> > > > > shadow stack index as well. This way, we only call the return if the
> > > > > counter on the stack matches what's in the counter on the counter array
> > > > > for the index.    
> > > > 
> > > > Hmm, but we already know the current stack "header" entry when calling
> > > > handlers, don't we? I thought we just calculate it from curr_ret_stack.
> > > 
> > > Basically we have this:
> > > 
> > >  array: | &fgraph_ops_1 | &fgraph_ops_2 | &fgraph_ops_stub | ...
> > > 
> > > On entry of function we do:
> > >   
> > 	push header(including original ret_addr) onto ret_stack
> 
> We can't put the ret_addr of the callback on the stack. What if that
> ret_addr is a module, and it gets unloaded? We must not call it.

But in that case, how can we recover the original return address that was
on the real (kernel) stack? I won't call that entry, but the kretprobe
handler needs the address so it can record it as the caller address.

> > 
> > > 	for (i = 0; i < array_entries; i++) {
> > > 		if (array[i]->entryfunc(...)) {
> > > 			push i onto ret_stack;
> > > 		}
> > > 	}
> > > 
> > > On the return side, we do:
> > > 
> > > 	idx = pop ret_stack;
> > > 
> > > 	array[idx]->retfunc(...);  
> > 
> > So at this point we have the header on ret_stack, don't we? :)
> > 
> > Anyway, I think we can provide an API for the unwinder to find the correct
> > original return address from ret_stack. That is OK with me.
> 
> Yes. In fact, I have something that worked for that. I'll have to test
> it some more.

Great! I think it will be enough for kretprobe.
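
For reference, the entry/return flow quoted above can be modeled in user space. This is only a sketch under stated assumptions, not the patch itself: the header entry, the counter array check, and every name here (ops_model, set_ops, function_entry, function_return, original_ret_addr, and the demo callbacks) are hypothetical stand-ins for what the thread describes. It pushes a header carrying the original return address, pushes one index per ops whose entryfunc accepted, skips slots whose counter changed while the frame was pending, and lets an unwinder look up the original return address.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS   4
#define STACK_MAX 32

/* Hypothetical model of an fgraph_ops slot; not the kernel structure. */
struct ops_model {
	int  (*entryfunc)(unsigned long func);	/* nonzero: also call retfunc */
	void (*retfunc)(unsigned long func);
};

enum entry_kind { ENT_HEADER, ENT_INDEX };

struct stack_entry {
	enum entry_kind kind;
	unsigned long func;	/* traced function */
	unsigned long ret_addr;	/* header only: original return address */
	int idx;		/* index only: slot in ops_array */
	unsigned long gen;	/* index only: slot counter at push time */
};

static struct ops_model *ops_array[MAX_OPS];
static unsigned long ops_gen[MAX_OPS];	/* counter array, bumped on every slot update */
static struct stack_entry ret_stack[STACK_MAX];
static int curr_ret_stack;		/* number of entries in use */

int demo_ret_calls;			/* demo state: how often demo_ret ran */
int demo_entry(unsigned long func) { (void)func; return 1; }
void demo_ret(unsigned long func) { (void)func; demo_ret_calls++; }

static void push(struct stack_entry e)
{
	assert(curr_ret_stack < STACK_MAX);
	ret_stack[curr_ret_stack++] = e;
}

/* Register or unregister (ops == NULL) the user of slot i. */
void set_ops(int i, struct ops_model *ops)
{
	ops_array[i] = ops;
	ops_gen[i]++;
}

/* Entry side: push the header, then one index per ops that accepted. */
void function_entry(unsigned long func, unsigned long ret_addr)
{
	struct stack_entry hdr = { .kind = ENT_HEADER, .func = func,
				   .ret_addr = ret_addr, .idx = -1 };
	push(hdr);
	for (int i = 0; i < MAX_OPS; i++) {
		if (ops_array[i] && ops_array[i]->entryfunc(func)) {
			struct stack_entry e = { .kind = ENT_INDEX, .func = func,
						 .idx = i, .gen = ops_gen[i] };
			push(e);
		}
	}
}

/* Return side: pop indexes, skip stale slots, hand back the original address. */
unsigned long function_return(void)
{
	for (;;) {
		assert(curr_ret_stack > 0);
		struct stack_entry e = ret_stack[--curr_ret_stack];
		if (e.kind == ENT_HEADER)
			return e.ret_addr;
		if (ops_array[e.idx] && ops_gen[e.idx] == e.gen)
			ops_array[e.idx]->retfunc(e.func);
	}
}

/* Unwinder helper: original return address of the depth-th pending frame
 * (0 = innermost), or 0 if there is no such frame. */
unsigned long original_ret_addr(int depth)
{
	for (int i = curr_ret_stack - 1; i >= 0; i--)
		if (ret_stack[i].kind == ENT_HEADER && depth-- == 0)
			return ret_stack[i].ret_addr;
	return 0;
}
```

In this model the callback address never goes on the stack; only the slot index and its counter do, so a slot that was reused while a frame was pending is simply skipped on return.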

> > > > I need only sizeof(unsigned long). If the kretprobe user requires more,
> > > > it will fall back to the current method -- get an "instance" and store
> > > > its address in the entry :-)
> > > 
> > > Awesome, then this shouldn't be too hard to implement.  
> > 
> > Oops, anyway I noticed that I must store a value in each area so that we can
> > identify which kretprobe is using it when there are several kretprobes on the
> > same function. So, the kretprobe implementation will be something like below.
> > 
> > kretprobe_retfunc(trace, regs)
> > {
> > 	kp = get_kprobe(trace->func);
> > 
> > 	if (trace->private == to_kretprobe(kp)) // this is directly mapped to current kprobe
> > 		goto found_kretprobe;
> > 
> > 	if (!list_empty(&kp->list)) {	// we need to find from multiple kretprobes
> > 		list_for_each_entry(p, &kp->list, list)
> > 			if (trace->private == to_kretprobe(p)) {
> > 				kp = p;
> > 				goto found_kretprobe;
> > 			}
> > 	}
> > 
> > 	// Or this must be an instance
> > 	struct kretprobe_instance *ri = trace->private;
> > 	rp = ri->rp;
> > 	if (valid_kretprobe(rp))
> > 		rp->handler(ri, regs);
> > 	kretprobe_recycle_instance(ri);
> > 	goto out;
> > 
> > found_kretprobe:
> > 	rp = to_kretprobe(kp);
> > 	struct kretprobe_instance rii = {.rp = rp,
> > 		.ret_addr = trace->ret, .task = current};
> > 	rp->handler(&rii, regs);
> > 
> > out:
> > 	return 0;
> > }
> > 
> > I think we talked about pt_regs, which is redundant for a return probe, so it
> > should be just a return value. (But how do we pass it? trace->retval?)
> 
> Yeah, we can add that.

OK, then I will start by making a fake pt_regs on the stack and calling the
handler, which will be something like:

	struct pt_regs regs = {};
	regs_set_return_value(&regs, trace->retval);
	rp->handler(ri, &regs);
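
As a rough user-space illustration of why the fake pt_regs keeps existing handlers working: a handler that only calls regs_return_value() sees the same value whether the registers are real or synthesized. The struct pt_regs below is a one-field stand-in modeled on x86-64 (where the return value lives in ax), and trace_model, demo_handler, last_seen, and call_with_fake_regs are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for x86-64 pt_regs; only the return-value register is modeled. */
struct pt_regs {
	unsigned long ax;
};

static inline unsigned long regs_return_value(struct pt_regs *regs)
{
	return regs->ax;
}

static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
{
	regs->ax = rc;
}

struct kretprobe_instance;	/* opaque in this sketch */
typedef int (*kretprobe_handler_t)(struct kretprobe_instance *, struct pt_regs *);

/* Hypothetical trace record; retval is the field proposed in this thread. */
struct trace_model {
	unsigned long func;
	unsigned long retval;
};

unsigned long last_seen;	/* demo state: value the handler observed */

/* An existing-style handler that only reads the return value. */
int demo_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	(void)ri;
	last_seen = regs_return_value(regs);
	return 0;
}

/* Glue: build a zeroed pt_regs, fill in the return value, call the handler. */
int call_with_fake_regs(kretprobe_handler_t handler,
			struct kretprobe_instance *ri,
			const struct trace_model *trace)
{
	struct pt_regs regs = { 0 };

	regs_set_return_value(&regs, trace->retval);
	return handler(ri, &regs);
}
```

A handler that inspects any other register would read zeros from such a pt_regs, which is the part of the ABI change that users would need to be aware of.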

Thank you,

> > That is OK for ftrace (but the transition needs more code).
> > And I would like to ask the eBPF and systemtap people whether that is OK,
> > since it will change the kernel ABI.
> 
> I agree.
> 
> -- Steve


-- 
Masami Hiramatsu <mhiramat@kernel.org>

