This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Design + prototype: Multiple kprobes at an address - take 2


Cool - much nicer and further along the direction that is aligned
with the original kprobes design philosophy of keeping it simple.
Good progress, Ananth !

But, hmm, its not quite there yet. I think it needs a few more tries to
get it right - the probe insertion path can be improved, for
example - the layering isn't right - core kprobes shouldn't know
about aggr_probe even in register probes path.

Since I'm haven't and am not going to be able to closely follow
and review all the new patches coming up or the discussions for
that matter, I should probably explain this aspect of the design
philosophy a little more, because in general, I'm getting a little
worried about new features and flags slowly creeping into and bloating
the core.

That the original base kprobes patch was small (< 500 LOC even with
single stepping out-of-line) was not by mere chance, nor did it
come easily ... believe me, it took a lot of careful programming
and design effort. I remember all those iterations back and forth
with Rusty :) encouraging us to ruthlessly throw away all but what
was absolutely necessarly in the quest for perfection to keep the
core really and truly simple and minimal. I understand the
need for the new requirements, but I'd really hate to see us losing
that hard earned compactness and layering simplicity.

And this is why, to me, it makes sense to go the extra mile to
think hard enough and try to implement new requirements as addon
features _over_ kprobes, rather than modify the core, to the 
greatest extent possible.

Keeping things simple is hard, of course. But it is worth it, 
and hard doesn't mean impossible :)

Regards
Suparna

On Wed, Apr 06, 2005 at 10:48:21AM -0400, Ananth N Mavinakayanahalli wrote:
> Hi,
> 
> Based on Suparna's and Prasanna's comments on the earlier design, here
> goes the new design and prototype for handling multiple kprobes at an
> address. This design has the advantage that it is completely
> architecture agnostic and imposes (almost) no penalty on the most common
> usage - one kprobe per address. And as you can see, the patch is also
> considerably smaller :-)
> 
> NOTE:
> 
> 1. I haven't yet added Frank's suggested enabled/disabled flag in
> struct kprobe. With the new design, this needs some thought.
> 2. It is not possible to return a value from unregister_kprobe() since
> list_del() returns void.
> 
> Patch is against 2.6.11-rc1-mm4, but also applies against 2.6.12-rc2-mm1
> 
> Thanks,
> Ananth

> Draft of 6 April 2005.
> 
> 1	Multiple kprobes at an address
> 
> 1.1	Overview
> 
> One of the requirements of SystemTAP is the capability of defining 
> multiple kprobes per probe address. Current design of kprobes does 
> not allow the registration of more than one kprobe at an address.
> 
> This is an attempt to come up with a design that will enable a user
> to register multiple kprobes at a given address. 
> 
> 1.2	Definitions
> 
> aggregate probe: The higher level structure that will incorporate 
> one kprobe with handlers that have the list walk logic and a list
> of kprobes registered at the given address.
> 
> 
> 2	Design
> 
> 2.1	The aggregate probe structure
> 
> The new aggregate probe structure will be:
> 
> struct aggr_probe {
> 	struct kprobe kp;
> 	struct list_head kprobes;
> };
> 
> 2.2	struct kprobe
> 
> Struct kprobe will now have an additional "list" field to 
> facilitate list.h usage.
> 
> struct kprobe {
> 	struct hlist_head hlist;
> 	struct list_head list;
> 	kprobe_opcode_t *addr;
> 	kprobe_pre_handler_t pre_handler;
> 	kprobe_post_handler_t post_handler;
> 	kprobe_fault_handler_t fault_handler;
> 	kprobe_break_handler_t break_handler;
> 	kprobe_opcode_t opcode;
> 	struct arch_specific_insn ainsn;
> };
> 
> NOTE: It is always a good idea to set the handler fields not being 
> used explicitly to NULL. The design determines if a probe is a 
> jprobe by virtue of the fact that only jprobes define break_handlers. 
> 
> 
> 2.3	Base kprobes infrastructure
> 
> 2.3.1	Registering a kprobe
> 
> 1. Kprobes are registered with a call to register_kprobe().
> 2. The kprobe infrastructure determines if a probe already exists at
>    the requested location. 
> 3. If this is a new kprobe, registration proceeds as it done
>    currently.
> 4. If a kprobe already exists at the requested address:
>    a. If the call is to register a jprobe, the new handler 
>       registration fails since we can't have a jprobe and a kprobe 
>       at the same address. Whether a registration is for a jprobe
>       or not is determined by the presence of a break_handler.
>    b. If the call is to register a kprobe and if the pre_handler
>       for the existing kprobe is not aggr_pre_handler, we then:
>       	1. Allocate a aggr_probe structure
> 	2. Populate the aggr_probe's kprobe structure with 
> 	   appropriate values for the addr, opcode, ainsn fields
> 	3. Set the handlers for the aggr_probe's kprobe to custom
> 	   handlers with list walk logic
> 	4. Add the existing kprobe to the aggr_probe's kprobes list
> 	5. Initialize the new kprobe
> 	6. Add the new kprobe to the aggr_probe's kprobes list
>    c. If the call is to register a kprobe and if the pre_handler
>       for the existing kprobe is aggr_pre_handler, it means that
>       we already have an aggr_probe defined at this address.
>       Unless the new probe to register is not a jprobe, it is 
>       initialized and added to the aggr_probe's "kprobes" list.
>       The registration fails if the new probe is a jprobe.
>       
> 2.3.2	Unregistering a kprobe
> 
> 1. Kprobes are unregistered by making a call to unregister_kprobe()
> 2. If a kprobe exists and its pre_handler is aggr_pre_handler(),
>    then, this is one of the possibly many kprobes at the address:
>    a. Delete the kprobe from the aggr_probe's kprobes list.
>    b. If the aggr_probe's kprobe list is empty, then we free the
>       aggr_probe structure.
> 3. If the kprobe to be unregistered is the only one at the address,
>    remove it as is done currently.
>  
> 
> 2.4	Handler invocation
> 
> When multiple kprobes are registered at an address, we encapsulate
> the list of kprobes in an aggr_probe structure. struct aggr_probe
> in turn has its own struct kprobe, whose handler routines have in 
> them the logic to walk the kprobes list and invoke the appropriate 
> handlers one at a time.
> 
> This is accomplished by aliasing the aggr_probe's kprobe handlers
> to be:
> 
>         ap->kp.pre_handler = aggr_pre_handler;
>         ap->kp.post_handler = aggr_post_handler;
>         ap->kp.fault_handler = aggr_fault_handler;
> 
> Please refer to the accompanying patch for details about these 
> handlers.
> 
> 
> 2.5	More points on handler invocation
> 
> 2.5.1	Pre handlers
> 
> We care about the return code for a pre_handler _only_ in case 
> of a jprobe. Return code from all other (read kprobe) handlers 
> are ignored. But, it is advised that all pre_handlers return 0 
> for compatibility sake with future enhancements.
> 
> For a multiple kprobes at an address, all pre handlers will be 
> called.
> 
> 2.5.2	Post handlers
> 
> All post handlers will be called.
> 
> 2.5.3	Fault handlers
> 
> Fault handlers are called in sequence until one of them handles
> the fault (returns non zero). Once a fault is handled, no other
> handlers are called.
> 
> NOTE: 
> 
> a. Fault handlers are meant not just to handle faults during
>    the execution of the handlers, but also in cases when we fault
>    while single-stepping out of line. This is the most common case
>    with user-space probes, but, the design has to keep this in 
>    mind so as to accomodate addition of user-space probes at a 
>    later date.
> b. In case we fault while executing a handler, only that kprobe's
>    fault handler should be invoked. This design doesn't yet do
>    that.
> 
> Suggestions are welcome.
> 
> 
> 2.6	Arch specific implementations
> 
> This design does not require any changes to the arch specific 
> kprobe implementations.
> 
> 
> 3	Caveats
> 
> - It is assumed that the handlers written behave well. :-)
> - Multiple kprobes cannot be registered at a location that already
>   has a jprobe registered. In other words, to register a jprobe,
>   no other kprobes must be registered at that address and 
>   vice-versa.
> - The current design builds on the existing kprobe locking 
>   infrastructure. It is hoped that this design (or a variant)
>   can be the base for the scalability changes envisaged. 
>   Suggestions are welcome.
> - We do not (yet) indicate that a kprobe is disabled due to a
>   call to disarm_kprobe().
>  
> 
> 4	Issues
> 
> Fault handling - What would be the sane way of handling a fault?
> 
> 
> 5	Questions
> 
> - Is the design implementable on all architectures kprobes is 
>   currently available?
>   A: Already done! :-)
>      With the new design, no changes are required for the arch
>      specific kprobe implementations.
> 

> 
>  include/linux/kprobes.h |   11 +++
>  kernel/kprobes.c        |  146 ++++++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 152 insertions(+), 5 deletions(-)
> 
> diff -Naurp temp/linux-2.6.12-rc1/include/linux/kprobes.h linux-2.6.12-rc1/include/linux/kprobes.h
> --- temp/linux-2.6.12-rc1/include/linux/kprobes.h	2005-03-17 20:34:07.000000000 -0500
> +++ linux-2.6.12-rc1/include/linux/kprobes.h	2005-04-04 14:14:05.000000000 -0400
> @@ -43,6 +43,9 @@ typedef int (*kprobe_fault_handler_t) (s
>  struct kprobe {
>  	struct hlist_node hlist;
>  
> +	/* multiple probe handler support */
> +	struct list_head list;
> +
>  	/* location of the probe point */
>  	kprobe_opcode_t *addr;
>  
> @@ -82,6 +85,14 @@ struct jprobe {
>  	kprobe_opcode_t *entry;	/* probe handling code to jump to */
>  };
>  
> +/*
> + * Special probe type to accomodate multiple handlers per probe
> + */
> +struct aggr_probe {
> +	struct kprobe kp;
> +	struct list_head kprobes;
> +};
> +
>  #ifdef CONFIG_KPROBES
>  /* Locks kprobe: irq must be disabled */
>  void lock_kprobes(void);
> diff -Naurp temp/linux-2.6.12-rc1/kernel/kprobes.c linux-2.6.12-rc1/kernel/kprobes.c
> --- temp/linux-2.6.12-rc1/kernel/kprobes.c	2005-04-04 14:52:24.000000000 -0400
> +++ linux-2.6.12-rc1/kernel/kprobes.c	2005-04-04 20:09:20.000000000 -0400
> @@ -73,18 +73,129 @@ struct kprobe *get_kprobe(void *addr)
>  	return NULL;
>  }
>  
> +int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
> +{
> +	struct aggr_probe *ap;
> +	struct kprobe *kp;
> +
> +	ap = container_of(p, struct aggr_probe, kp);
> +	list_for_each_entry(kp, &ap->kprobes, list) {
> +		if (kp->pre_handler)
> +			kp->pre_handler(kp, regs);
> +	}
> +	return 0;
> +}
> +
> +void aggr_post_handler(struct kprobe *p, struct pt_regs *regs, 
> +		unsigned long flags)
> +{
> +	struct aggr_probe *ap;
> +	struct kprobe *kp;
> +
> +	ap = container_of(p, struct aggr_probe, kp);
> +	list_for_each_entry(kp, &ap->kprobes, list) {
> +		if (kp->post_handler)
> +			kp->post_handler(kp, regs, flags);
> +	}
> +	return;
> +}
> +
> +int aggr_fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr)
> +{
> +	struct aggr_probe *ap;
> +	struct kprobe *kp;
> +
> +	ap = container_of(p, struct aggr_probe, kp);
> +	list_for_each_entry(kp, &ap->kprobes, list) {
> +		/* How do we handle faults??? Dicey! */
> +		if (kp->fault_handler &&
> +			kp->fault_handler(kp, regs, trapnr))
> +			return 1;
> +	}
> +	return 0;
> +}
> +
> +static void copy_kprobe(struct kprobe *src, struct kprobe *dst)
> +{
> +	dst->opcode = src->opcode;
> +	memcpy(&dst->ainsn, &src->ainsn, sizeof(struct arch_specific_insn));
> +}
> +
> +struct aggr_probe *register_aggr_probe(struct kprobe *old_p)
> +{
> +	struct aggr_probe *ap;
> +
> +	ap = kcalloc(1, sizeof(struct aggr_probe), GFP_ATOMIC);
> +	if (!ap)
> +		return NULL;
> +	
> +	ap->kp.addr = old_p->addr;
> +	copy_kprobe(old_p, &ap->kp);
> +
> +	ap->kp.pre_handler = aggr_pre_handler;
> +	ap->kp.post_handler = aggr_post_handler;
> +	ap->kp.fault_handler = aggr_fault_handler;
> +
> +	INIT_LIST_HEAD(&ap->kprobes);
> +	list_add(&old_p->list, &ap->kprobes);
> +	
> +	return ap;
> +}	
> +
>  int register_kprobe(struct kprobe *p)
>  {
>  	int ret = 0;
>  	unsigned long flags = 0;
> +	struct aggr_probe *ap;
> +	struct kprobe *old_p;
>  
>  	if ((ret = arch_prepare_kprobe(p)) != 0) {
>  		goto rm_kprobe;
>  	}
> +	
>  	spin_lock_irqsave(&kprobe_lock, flags);
>  	INIT_HLIST_NODE(&p->hlist);
> -	if (get_kprobe(p->addr)) {
> -		ret = -EEXIST;
> +	old_p = get_kprobe(p->addr);
> +	if (old_p) {
> +		if (old_p->break_handler) {
> +			/* jprobe present, can't register another probe */
> +			ret = -EEXIST;
> +			goto out;
> +		}
> +		if (old_p->pre_handler == aggr_pre_handler) {
> +			/* 
> +			 * Check if registration is for a jprobe
> +			 * Fail if so till we figure out how a jprobe 
> +			 * and a kprobe can coexist at the same addr
> +			 */
> +			if (p->break_handler) {
> +				ret = -EEXIST;
> +				goto out;
> +			}
> +			/* addr is already initialized */
> +			copy_kprobe(old_p, p);
> +			ap = container_of(old_p, struct aggr_probe, kp);
> +			list_add(&p->list, &ap->kprobes);
> +		} else {
> +			/* check for same reason as above */
> +			if (p->break_handler) {
> +				ret = -EEXIST;
> +				goto out;
> +			}
> +			ap = register_aggr_probe(old_p);
> +			if (!ap) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +			/* addr is already initialized */
> +			copy_kprobe(old_p, p);
> +			list_add(&p->list, &ap->kprobes);
> +			INIT_HLIST_NODE(&ap->kp.hlist);
> +			hlist_del(&old_p->hlist);
> +			hlist_add_head(&ap->kp.hlist, 
> +				&kprobe_table[hash_ptr(p->addr, 
> +						KPROBE_HASH_BITS)]);
> +		}
>  		goto out;
>  	}
>  	arch_copy_kprobe(p);
> @@ -99,20 +210,45 @@ int register_kprobe(struct kprobe *p)
>  out:
>  	spin_unlock_irqrestore(&kprobe_lock, flags);
>  rm_kprobe:
> -	if (ret == -EEXIST)
> -		arch_remove_kprobe(p);
>  	return ret;
>  }
>  
>  void unregister_kprobe(struct kprobe *p)
>  {
>  	unsigned long flags;
> -	arch_remove_kprobe(p);
> +	struct kprobe *kp;
> +	struct aggr_probe *ap;
> +	
>  	spin_lock_irqsave(&kprobe_lock, flags);
> +	kp = get_kprobe(p->addr);
> +	if (kp && (kp->pre_handler == aggr_pre_handler)) {
> +
> +		/* this is one of the possibly many kprobes at the address */
> +		ap = container_of(kp, struct aggr_probe, kp);
> +		list_del(&p->list);
> +
> +		/* 
> +		 * if we just unregistered the last probe at the address,
> +		 * its time to cleanup
> +		 */ 
> +		if (list_empty(&ap->kprobes)) {
> +			arch_remove_kprobe(p);
> +			*p->addr = p->opcode;
> +			hlist_del(&ap->kp.hlist);
> +			kfree(ap);
> +			goto flush_cache;
> +		} else
> +			goto out;
> +	}
> +	
> +	/* this is the only kprobe at the address */
> +	arch_remove_kprobe(p);
>  	*p->addr = p->opcode;
>  	hlist_del(&p->hlist);
> +flush_cache:
>  	flush_icache_range((unsigned long) p->addr,
>  			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
> +out:
>  	spin_unlock_irqrestore(&kprobe_lock, flags);
>  }
>  


-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]