This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [2/3] Userspace probes prototype-take2


>>-----Original Message-----
>>From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org] On Behalf Of Prasanna S Panchamukhi
>>Sent: 2006年2月8日 22:13
>>To: systemtap@sources.redhat.com
>>Subject: Re: [2/3] Userspace probes prototype-take2
>>
>>
>>This patch provides the feature to insert the probes on the pages
>>that are not present in the memory during registeration.
>>
>>To add readpage and readpages() hooks, two new elements
>>are added to uprobe_module object.
>>struct address_space_operations *ori_a_ops;
>>struct address_space_operations user_a_ops;
>>
>>User space probe also allows the probes to be inserted within the
>>pages even that are not present in the memory at the time of
>>registration. This is done by adding hooks to readpage and readpages
>>routines. During registration, the address space operation object is
>>modified by replacing with userspace probes's specific readpage and
>>readpages routines. So that when the pages are readinto the memory
>>through the readpage and readpages address space operations, the
>>probes can be automatically inserted into those pages. These user
>>space readpage and readpages routines internally call the original
>>readpage() and readpages() routines and then check if the probes are
>>to be added to these pages and then insert the probes on these pages.
After I double-checked it, I found a big problem. When readpage/readpages return,
it doesn't mean the page/pages are really already up-to-date. If the caller wants
to get up-to-date pages, it needs call lock_page or wait_on_page_locked to get the page lock.
I/O lower layer is asynchronous with the thread. Only when the device really finishes I/O,
Mostly, bio->bi_end_io is called to unlock the page and change its state to up-to-date.

If we use lock_page or wait_on_page_locked to wait the I/O complete in uprobe_readpage/uprobe_readpages,
It looks like we might break the implied rule that the caller needs call lock_page 
or wait_on_page_locked to wait the page up-to-date.


>>Overhead of adding these hooks are limited to the application on
>>which probes are inserted.
>>
>>While unregiteration, care should be taken to replace the readpage and
>>readpages() hooks by original routines if no probes exists on that
>>application.
>>
>>Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
>>
>> include/linux/namei.h |    1
>>
>>diff -puN kernel/kprobes.c~kprobes_userspace_probes_hook_readpage kernel/kprobes.c
>>
>>diff -puN kernel/kprobes.c~kprobes_userspace_probes_hook_readpage kernel/kprobes.c
>>
>>
>> kernel/kprobes.c |  116 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 116 insertions(+)
>>
>>diff -puN kernel/kprobes.c~kprobes_userspace_probes_hook_readpage kernel/kprobes.c
>>--- linux-2.6.16-rc1-mm5/kernel/kprobes.c~kprobes_userspace_probes_hook_readpage	2006-02-08 19:21:09.000000000 +0530
>>+++ linux-2.6.16-rc1-mm5-prasanna/kernel/kprobes.c	2006-02-08 19:21:09.000000000 +0530
>>@@ -769,6 +769,110 @@ static struct uprobe_module __kprobes *g
>> 	return NULL;
>> }
>>
>>+static inline void insert_readpage_uprobe(struct page *page,
>>+	struct address_space *mapping, struct uprobe *uprobe)
>>+{
>>+	if (find_page_probe(uprobe->offset >> PAGE_CACHE_SHIFT,
>>+				page->index << PAGE_CACHE_SHIFT)) {
The parameter of function find_page_probe looks incorrect. 
uprobe->offset >> PAGE_CACHE_SHIFT
should be:
uprobe->offset


>>+		map_uprobe_page(page, uprobe, insert_kprobe_user);
>>+		flush_vma(mapping, page, uprobe);
>>+	}
>>+}
>>+
>>+/**
>>+ *  This function hooks the readpages() of all modules that have active probes
>>+ *  on them. The original readpages() is called for the given
>>+ *  inode/address_space to actually read the pages into the memory. Then all
>>+ *  probes that are specified on these pages are inserted.
>>+ */
>>+static int __kprobes uprobe_readpages(struct file *file,
>>+				struct address_space *mapping,
>>+				struct list_head *pages, unsigned nr_pages)
>>+{
>>+	int retval = 0;
>>+	struct page *page;
>>+	struct uprobe_module *umodule;
>>+	struct uprobe *uprobe = NULL;
>>+	struct hlist_node *node;
>>+
>>+	mutex_lock(&uprobe_mutex);
>>+
>>+	umodule = get_module_by_inode(file->f_dentry->d_inode);
>>+	if (!umodule) {
I mentioned a race condition before. If there is no solution, I think at least we need
put the issue on the TODO list. How about using file->f_dentry->d_inode->i_mapping->a_ops->readpages/readpage
when umodule==NULL?


>>+		printk("uprobe_readpages: we don't have a module \
>>+				associated with this file.. Aborting\n");
>>+		retval = -EINVAL;
>>+		goto out;
>>+	}
>>+
>>+	/* call original readpages() */
>>+	retval = umodule->ori_a_ops->readpages(file, mapping, pages, nr_pages);
>>+	if (retval < 0)
>>+		goto out;
>>+
>>+	/*
>>+	 * TODO: Walk through readpages page list and get
>>+	 * pages with probes instead of find_get_page().
>>+	 */
>>+	mutex_lock(&kprobe_mutex);
>>+	hlist_for_each_entry(uprobe, node, &umodule->ulist_head, ulist) {
>>+		page = find_get_page(mapping,
>>+				uprobe->offset >> PAGE_CACHE_SHIFT);
>>+		if (!page)
>>+			continue;
>>+
>>+		if (!uprobe->kp.opcode)
>>+			insert_readpage_uprobe(page, mapping, uprobe);
>>+		page_cache_release(page);
>>+	}
>>+	mutex_unlock(&kprobe_mutex);
>>+
>>+out:
>>+	mutex_unlock(&uprobe_mutex);
>>+	return retval;
>>+}
>>+
>>+/**
>>+ *  This function hooks the readpage() of all modules that have active probes
>>+ *  on them. The original readpage() is called for the given inode/address_space
>>+ *  to actually read the pages into the memory. Then all probes that are
>>+ *  specified on this page are inserted.
>>+ */
>>+int __kprobes uprobe_readpage(struct file *file, struct page *page)
>>+{
>>+	int retval = 0;
>>+	struct uprobe_module *umodule;
>>+	struct uprobe *uprobe = NULL;
>>+	struct hlist_node *node;
>>+	struct address_space *mapping = file->f_dentry->d_inode->i_mapping;
>>+
>>+	mutex_lock(&uprobe_mutex);
>>+	umodule = get_module_by_inode(file->f_dentry->d_inode);
>>+	if (!umodule) {
>>+		printk("uprobe_readpages: we don't have a module \
>>+				associated with this file.. Aborting\n");
>>+		retval = -EINVAL;
>>+		goto out;
>>+	}
>>+
>>+	/* call original readpage() */
>>+	retval = umodule->ori_a_ops->readpage(file, page);
>>+	if (retval < 0)
>>+		goto out;
>>+
>>+	mutex_lock(&kprobe_mutex);
>>+	hlist_for_each_entry(uprobe, node, &umodule->ulist_head, ulist) {
>>+		if (!uprobe->kp.opcode)
>>+			insert_readpage_uprobe(page, mapping, uprobe);
>>+	}
>>+	mutex_unlock(&kprobe_mutex);
>>+
>>+out:
>>+	mutex_unlock(&uprobe_mutex);
>>+
>>+	return retval;
>>+}
>>+
>> /**
>>  * Gets exclusive write access to the given inode to ensure that the file
>>  * on which probes are currently applied does not change. Use the function,
>>@@ -799,9 +903,17 @@ static inline int ex_write_unlock(struct
>> static void __kprobes get_inode_ops(struct uprobe *uprobe,
>> 				   struct uprobe_module *umodule)
>> {
>>+	struct address_space *as;
>>+
>> 	INIT_HLIST_HEAD(&umodule->ulist_head);
>> 	hlist_add_head(&uprobe->ulist, &umodule->ulist_head);
>> 	list_add(&umodule->mlist, &uprobe_module_list);
>>+	as = umodule->nd.dentry->d_inode->i_mapping;
>>+	umodule->ori_a_ops = as->a_ops;
>>+	umodule->user_a_ops = *as->a_ops;
>>+	umodule->user_a_ops.readpage = uprobe_readpage;
>>+	umodule->user_a_ops.readpages = uprobe_readpages;
>>+	as->a_ops = &umodule->user_a_ops;
>> }
>>
>> int __kprobes remove_kprobe_user(struct uprobe *uprobe, unsigned long *address,
>>@@ -811,6 +923,7 @@ int __kprobes remove_kprobe_user(struct
>> 	int cleanup_p;
>>
>> 	p = &uprobe->kp;
>>+	 /*TODO: change it to spin lock, we enter here holding i_mmap_lock */
It could be resolved by moving below mutex_lock(&kprobe_mutex) to unregister_uprobe.


>> 	mutex_lock(&kprobe_mutex);
>> 	old_p = get_kprobe_user(uprobe->inode, uprobe->offset);
>> 	if (unlikely(!old_p)) {
>>@@ -887,6 +1000,9 @@ void __kprobes unregister_uprobe(struct
>> 	hlist_del(&uprobe->ulist);
>> 	if (hlist_empty(&umodule->ulist_head)) {
>> 		list_del(&umodule->mlist);
>>+		umodule->nd.dentry->d_inode->i_mapping->a_ops =
>>+							umodule->ori_a_ops;
>>+
>> 		ex_write_unlock(uprobe->inode);
>> 		path_release(&umodule->nd);
>> 		kfree(umodule);
>>
>>_
>>--
>>Prasanna S Panchamukhi
>>Linux Technology Center
>>India Software Labs, IBM Bangalore
>>Email: prasanna@in.ibm.com
>>Ph: 91-80-51776329


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]