Bug 3542 - request new batch registration/unregistration API
Summary: request new batch registration/unregistration API
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: kprobes (show other bugs)
Version: unspecified
: P2 enhancement
Target Milestone: ---
Assignee: Masami Hiramatsu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-19 20:23 UTC by Frank Ch. Eigler
Modified: 2008-04-23 14:19 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
kprobe fast unregistration (4.45 KB, patch)
2007-03-13 23:43 UTC, Masami Hiramatsu
Details | Diff
[stap] support fast unregistration (565 bytes, patch)
2007-03-13 23:46 UTC, Masami Hiramatsu
Details | Diff
[kprobe] batch registration take 2 (4.22 KB, application/octet-stream)
2007-05-17 07:24 UTC, Masami Hiramatsu
Details
[test] batch (un)registration test program (2.57 KB, application/octet-stream)
2007-05-17 07:25 UTC, Masami Hiramatsu
Details
[stap] support batch unregistration of kprobes (1.19 KB, patch)
2007-05-17 07:27 UTC, Masami Hiramatsu
Details | Diff
[kprobe] batch registration take 3 (4.84 KB, application/octet-stream)
2007-05-25 12:50 UTC, Masami Hiramatsu
Details
[test] batch (un)registration test program for take3 patchs. (2.61 KB, application/octet-stream)
2007-05-25 12:53 UTC, Masami Hiramatsu
Details
[kprobe] batch registration for 2.6.25-rc2-mm1 (4.88 KB, application/x-gzip)
2008-02-26 19:43 UTC, Masami Hiramatsu
Details
[stap] support batch unregistration of kprobes take 2 (1.09 KB, patch)
2008-02-26 19:46 UTC, Masami Hiramatsu
Details | Diff
[test] batch (un)registration test program for new patchs. (2.27 KB, application/x-gzip)
2008-02-26 19:47 UTC, Masami Hiramatsu
Details
[kprobe] batch registration for 2.6.25-rc3-mm1 (4.57 KB, application/x-gzip)
2008-03-08 00:25 UTC, Masami Hiramatsu
Details
[kprobe] batch registration for 2.6.25-rc5-mm1 (4.57 KB, application/x-gzip)
2008-03-11 20:59 UTC, Masami Hiramatsu
Details
[test] batch (un)registration test program for new patchs (for *register_*probes) (2.34 KB, application/x-gzip)
2008-03-11 21:06 UTC, Masami Hiramatsu
Details
[stap] support batch unregistration of kprobes take 3 (1.08 KB, patch)
2008-03-11 21:08 UTC, Masami Hiramatsu
Details | Diff
Fix testtool to run on powerpc also (846 bytes, patch)
2008-03-12 13:06 UTC, Ananth Mavinakayanahalli
Details | Diff
Fix testtool to run on powerpc also (846 bytes, patch)
2008-03-12 13:06 UTC, Ananth Mavinakayanahalli
Details | Diff
[kprobe] batch registration for 2.6.25-rc5-mm1 (fixed) (8.19 KB, application/x-gzip)
2008-03-12 15:31 UTC, Masami Hiramatsu
Details
[test-kprobe] add batch registration testcases to test_kprobes.c (1.95 KB, application/x-gzip)
2008-03-21 23:34 UTC, Masami Hiramatsu
Details
[test-kprobe] add batch registration testcases to test_kprobes.c take2 (1.81 KB, application/x-gzip)
2008-03-27 00:59 UTC, Masami Hiramatsu
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Frank Ch. Eigler 2006-11-19 20:23:32 UTC
Unregistering many thousands of kprobes can take several minutes.
(Registration seems to be quick.)

This could be because of extensive synchronization for each 
inserted/removed probe.  Please investigate whether such
synchronization could be done once for a batch of many kprobes.
Comment 1 Frank Ch. Eigler 2006-11-19 20:39:53 UTC
Specifically, it seems to be the rcu synchronization after every unregistration
that takes so long when multiplied by thousands.
Comment 2 Jim Keniston 2007-03-01 22:49:47 UTC
Assigning to Masami at Satoshi's request.
Comment 3 Masami Hiramatsu 2007-03-13 23:43:05 UTC
Created attachment 1612 [details]
kprobe fast unregistration

Here is a series of patches which introduce two kinds of interfaces
for speeding up unregistering process of kprobes.

Currently, the unregister_*probe() function takes a long time to 
unregister specified probe because it waits the rcu synchronization 
after each unregistration. So we need to introduce new method for
unregistering a lot of probes at once.

I'd like to suggest introducing following two interfaces for this issue.
The first interface is unregister_*probe_fast(). This removes 
breakpoint instruction from kernel code and adds specified probe 
to the unregistering list.
The second interface is commit_kprobes(). This waits the rcu 
synchronization and clean up each probe on the unregistering list.
This patch also adds a list_head member to the kprobe data structure 
for linking to the unregistering list.

If using these interfaces, the probe handlers of unregistering probes 
might be called after unregister_*probe_fast() is called. So, you MUST
call commit_kprobes() after calling unregisnter_*probe_fast() for all 
probes.

It is safe even if several threads unregister probes simultaneously, 
because the unregistration process is protected by kprobe_lock mutex.
If one thread calls commit_kprobes(), it will cleanup not only its probes 
but also the other thread's probes. And it is transparently done.
If there is no unregistering probes, commit_kprobes() do nothing.

Here is an example code.
--
struct kprobes *p;
for_each_probe(p) {
	unregister_kprobe_fast(p);
}
commit_kprobes();
--

Thanks,
Comment 4 Masami Hiramatsu 2007-03-13 23:46:08 UTC
Created attachment 1613 [details]
[stap] support fast unregistration

This patch add support fast unregistration to stap.

Here, I show how the waiting time is reduced by using this feature.

This is the test script.
--fastunreg.stp--
probe syscall.*, syscall.*.return { printf("test probe\n"); }
probe begin {exit();}
----

Without this patch, stap has to wait more than 7 seconds for 
unregistration all probes.

$ stap -v fastunreg.stp -o /dev/null
Pass 1: parsed user script and 54 library script(s) in 760usr/20sys/806real ms.

Pass 2: analyzed script: 579 probe(s), 1 function(s), 3 embed(s), 0 global(s)
in 2040usr/70sys/2174real ms.
Pass 3: translated to C into
"/tmp/stapwW3jDW/stap_dc844f44127e7d8537f8aebb8b404d5b_76493.c" in
20usr/0sys/33real ms.
Pass 4: compiled C into "stap_dc844f44127e7d8537f8aebb8b404d5b_76493.ko" in
2100usr/160sys/2323real ms.
Pass 5: starting run.
WARNING: Number of errors: 0, skipped probes: 32
Pass 5: run completed in 20usr/50sys/7169real ms.

With this patch, stap wait less than 0.2 seconds for unregistration 
all probes.

$ stap -v fastunreg.stp -o /dev/null
Pass 1: parsed user script and 54 library script(s) in 930usr/0sys/964real ms.
Pass 2: analyzed script: 579 probe(s), 1 function(s), 3 embed(s), 0 global(s)
in 1630usr/60sys/1737real ms.
Pass 3: translated to C into
"/tmp/stapPZkh0h/stap_e1a907cd3ec467ddb8d95221c1e4aae1_76493.c" in
20usr/0sys/33real ms.
Pass 4: compiled C into "stap_e1a907cd3ec467ddb8d95221c1e4aae1_76493.ko" in
2080usr/200sys/2310real ms.
Pass 5: starting run.
Pass 5: run completed in 20usr/40sys/145real ms.
Comment 5 Frank Ch. Eigler 2007-03-15 19:11:42 UTC
Thank you, looks really good.

Someone needs to send it upstream, at which point we can ask it be included in
RHEL/Fedora/etc.

From the point of view of the translator, it would be ideal if it could emit an
#ifdef/#else/#endif to generate both _fast and non-_fast kinds of unregistration
calls.  In other words, it would be helpful if the kernel patch added a #define
ARCH_KPROBES_UNREGISTER_FAST or somesuch, so the translator could work
transparently with unpatched and patched kernels.
Comment 6 Masami Hiramatsu 2007-03-23 14:15:26 UTC
(In reply to comment #5)
> Thank you, looks really good.
> 
> Someone needs to send it upstream, at which point we can ask it be included in
> RHEL/Fedora/etc.

OK, I'll do that.

> From the point of view of the translator, it would be ideal if it could emit an
> #ifdef/#else/#endif to generate both _fast and non-_fast kinds of unregistration
> calls.  In other words, it would be helpful if the kernel patch added a #define
> ARCH_KPROBES_UNREGISTER_FAST or somesuch, so the translator could work
> transparently with unpatched and patched kernels.

OK. I'll propose some kinds of new flag.
Kprobes have never had such kind of flags. So we should introduce
such flags or macros to not only for this but every interface changes.
Comment 7 Frank Ch. Eigler 2007-03-23 14:18:49 UTC
(In reply to comment #6)
> OK. I'll propose some kinds of new flag.

Actually, with the new autoconf hack in the module makefile, this is no longer
necessary.
Comment 8 Ananth Mavinakayanahalli 2007-03-30 06:36:03 UTC
I am not very comfortable with the current approach (yet). I guess the request
is to add a batch unregister facility.

- Would that mean, unregister *all* currently install kprobes?
- If not, we'll need to specify a subset. If it is indeed a subset, I like
Christoph's idea of a NULL terminated array supplied to the unregister functin.

Also, from an RCU perspective, "batching" unregistration of multiple kprobes is
a simple matter. You can chain multiple unregister requests and then do the
necessary stuff (put back original instruction, other housekeeping) for all the
kprobes and *then* call synchronize_sched(). Thus, you'll have just one grace
period for all of the probes and RCU will do the right thing. I see this being
done, but its more complex than required.

I agree that unregistration needs speedup, but it shouldn't be at the cost of
usability in the normal case - if we can preserve the current API and/or
simplicity, I am all for it. I wouldn't want users to make multiple calls to
unregister and assume that they'll do the right thing.

Thanks,
Ananth
Comment 9 Frank Ch. Eigler 2007-03-30 13:59:09 UTC
> - Would that mean, unregister *all* currently install kprobes?

In systemtap's case, we'd probably use it to unregister all probes on a
per-module or per-kernel basis.

> - If not, we'll need to specify a subset. If it is indeed a subset, I like
> Christoph's idea of a NULL terminated array supplied to the unregister functin.

Yes, that is a natural option, but it requires the client to allocate and
fill in an array just for purposes of shutdown.  (We don't already have arrays
of naked struct kprobes.)
Comment 10 Masami Hiramatsu 2007-04-02 10:32:45 UTC
(In reply to comment #8)
> I am not very comfortable with the current approach (yet). I guess the request
> is to add a batch unregister facility.
> 
> - Would that mean, unregister *all* currently install kprobes?
> - If not, we'll need to specify a subset. If it is indeed a subset, I like
> Christoph's idea of a NULL terminated array supplied to the unregister functin.

Basically, I agree with Frank's concern.
I think it costs high, because we have to allocate/manage another array and 
set probes which we want to unregister. Thus, the API itself becomes simple,
but the code of the user side will become a bit complex.
In contrast, my preparing/commit style interface just requires changing 
function name and  calling an additional function.

> Also, from an RCU perspective, "batching" unregistration of multiple kprobes is
> a simple matter. You can chain multiple unregister requests and then do the
> necessary stuff (put back original instruction, other housekeeping) for all the
> kprobes and *then* call synchronize_sched(). Thus, you'll have just one grace
> period for all of the probes and RCU will do the right thing. I see this being
> done, but its more complex than required.
> 
> I agree that unregistration needs speedup, but it shouldn't be at the cost of
> usability in the normal case - if we can preserve the current API and/or
> simplicity, I am all for it. I wouldn't want users to make multiple calls to
> unregister and assume that they'll do the right thing.

Sure, IMHO, we'll not get rid of the current API, and user can use 
either it or unregister_fast style APIs. In this way, only if the user 
want to speedup unregistration of hundreds of probes, he can choose 
the new APIs.
Comment 11 Ananth Mavinakayanahalli 2007-04-02 10:47:49 UTC
But, if you were to unregister on a per-module basis, would it be too much to
ask that all kprobes be part of an array? Surely, systemtap can possibly be able
to transparently allocate all probes of a module from an array.

From a normal user perspective, I don't see this as a big problem. Normal
(non-systemtap) kprobe users would generally have, one, or at most a few kprobes
in one module.

Unless there is *no other way* to implement this feature with the current API
(or maybe *one* new API, say, kprobes_unregister_batch() or somesuch), I would
be very reluctant to go with any interface that requires multiple calls.
Comment 12 Ananth Mavinakayanahalli 2007-04-03 06:57:03 UTC
How about this for an idea for bulk unregistration (on a per-module basis):

- struct kprobe will have a new element - struct list_head chain;
- Modules can be made to list_add() every new kprobe registered.
- Modules also need to have a reference to the list_head
- We provide one new API unregister_kprobes_list() or somesuch where in the
unregister routine, list_for_each_entry() is used to iterate through the kprobes
chain and unregistered
- Call synchronize_sched() once this list is processed.

Does this sound ok? Masami?
Comment 13 Frank Ch. Eigler 2007-04-03 13:19:42 UTC
(In reply to comment #12)
> How about this for an idea for bulk unregistration (on a per-module basis):
> 
> - struct kprobe will have a new element - struct list_head chain;
> - Modules can be made to list_add() every new kprobe registered.
> - Modules also need to have a reference to the list_head
> [...]

This looks simple to the kernel, but again it requires the client to do more
pointer manipulation.  It's possible, but it doesn't make the process *overall*
simpler nor more robust.
Comment 14 Masami Hiramatsu 2007-04-03 16:57:50 UTC
(In reply to comment #12)
> How about this for an idea for bulk unregistration (on a per-module basis):
> 
> - struct kprobe will have a new element - struct list_head chain;
> - Modules can be made to list_add() every new kprobe registered.
> - Modules also need to have a reference to the list_head
> - We provide one new API unregister_kprobes_list() or somesuch where in the
> unregister routine, list_for_each_entry() is used to iterate through the kprobes
> chain and unregistered
> - Call synchronize_sched() once this list is processed.
> 
> Does this sound ok? Masami?

It seems good to me.
Frank, I think we can make user's code more simple if 
we provide additional API register_kprobes_list().
Here is an example(usage) code which I think.
---
struct list_head probe_list;
struct kprobe *kp;

INIT_LIST_HEAD(&probe_list);
for (i = 0; i < NR_PROBES; i++) { 
	kp = kmalloc(sizeof(struct kprobe), GFP_KERNEL);
	INIT_LIST_HEAD(&kp->chain);
	list_add(&probe_list, &kp->chain);
}
...
register_kprobes_list(&probe_list);
...
unregister_kprobes_list(&probe_list);
---
What do you think about this?

Comment 15 Frank Ch. Eigler 2007-04-06 17:52:32 UTC
(In reply to comment #14)
> (In reply to comment #12)
> Frank, I think we can make user's code more simple if 
> we provide additional API register_kprobes_list().
> [...]

That is, make the kprobe client maintain a linked list for both registration
and unregistration.  This seems like making things more rather than less
complicated.  We already minimize dynamic memory allocation in favour of
static (bss/data) structures.  We like to iterate over larger structures 
that contain kprobes.  We like to informally bunch kprobes on a per-module
basis since our compiled probes tolerate the absence or presence of probed
modules.  This will be even more useful when we allow module init/exit
code to be probed, and add/remove kprobes on-the-fly during module
loading and unloading.  At the same time, I don't want to have to thread
multiple linked lists.  While of course systemtap could be made to emit such
code, this is not making *its* job easier, and it certainly isn't making it
easier for the dinosaurs writing kprobes C code by hand. :-)

I hope we can think of another way to defer the slow system-wide
synchronization.  That was the only problem with the API.

How about a new pair of functions to bracket the existing unregister calls:
like a begin_ & end_registration_group?  Then, with a begin_* active,
registration/unregistration synch would be deferred until the end_* is called.

Or, how about implementing unregistration in a deferred way, but which allows
the client's struct kprobe memory to be reused or freed immediately?
Comment 16 Ananth Mavinakayanahalli 2007-04-09 05:40:51 UTC
Subject: Re:  request new batch registration/unregistration API

On Tue, Apr 03, 2007 at 12:19:42PM -0000, fche at redhat dot com wrote:
> 
> ------- Additional Comments From fche at redhat dot com  2007-04-03 13:19 -------
> (In reply to comment #12)
> > How about this for an idea for bulk unregistration (on a per-module basis):
> > 
> > - struct kprobe will have a new element - struct list_head chain;
> > - Modules can be made to list_add() every new kprobe registered.
> > - Modules also need to have a reference to the list_head
> > [...]
> 
> This looks simple to the kernel, but again it requires the client to do more
> pointer manipulation.  It's possible, but it doesn't make the process *overall*
> simpler nor more robust.

In practice, the client doesn't have to do any more than
list_add_tail_rcu(&new_kp, &list_head) while adding a probe and
list_del_rcu(&unreg_kp).

I don't see how it can be made more simpler than this.

Any reason why you think this process is less robust?
Comment 17 Frank Ch. Eigler 2007-05-08 19:52:14 UTC
Is there any upstreaming news on this patch?
Comment 18 Ananth Mavinakayanahalli 2007-05-09 05:22:18 UTC
Masami,

Any updates?

Aside, see what is done with global disable/enable of kprobes. All gets done at
one shot with just one synchronize_sched(). This is similar to the scheme I
suggested - while the global scheme works on the global list, we'll need a
module local list for the batch case.

Ananth
Comment 19 Masami Hiramatsu 2007-05-09 14:01:20 UTC
Sorry for not writing you soon.

There have been many suggestions. And now, I think we have two options.
(A) register_/unregister_kprobe_list()
(B) begin_/end_registration_group()
(Both methods need to introduce a list member to the kprobe data structure.)

I ported my orignal patch and developed (B) idea on 2.6.21-mm1.
And currently, I'm trying to developed (A) idea. 

So, which idea is more useful for you?

In my opinion, (A) seems more general way, and (B) is easier to implement
because it is similer to my original implementation...

Thanks,
Comment 20 Ananth Mavinakayanahalli 2007-05-10 05:03:26 UTC
Subject: Re:  request new batch registration/unregistration API

On Wed, May 09, 2007 at 01:01:22PM -0000, masami dot hiramatsu dot pt at hitachi dot com wrote:
> 
> ------- Additional Comments From masami dot hiramatsu dot pt at hitachi dot com  2007-05-09 14:01 -------
> Sorry for not writing you soon.
> 
> There have been many suggestions. And now, I think we have two options.
> (A) register_/unregister_kprobe_list()
> (B) begin_/end_registration_group()
> (Both methods need to introduce a list member to the kprobe data structure.)
> 
> I ported my orignal patch and developed (B) idea on 2.6.21-mm1.
> And currently, I'm trying to developed (A) idea. 
> 
> So, which idea is more useful for you?
> 
> In my opinion, (A) seems more general way, and (B) is easier to implement
> because it is similer to my original implementation...

I very much prefer (A). (B) (I feel) is an overengineered solution to a
problem that can be solved more simply by (A).

Ananth
Comment 21 Masami Hiramatsu 2007-05-10 12:56:52 UTC
(In reply to comment #20)
> > There have been many suggestions. And now, I think we have two options.
> > (A) register_/unregister_kprobe_list()
> > (B) begin_/end_registration_group()
> > (Both methods need to introduce a list member to the kprobe data structure.)
> > 
> > I ported my orignal patch and developed (B) idea on 2.6.21-mm1.
> > And currently, I'm trying to developed (A) idea. 
> > 
> > So, which idea is more useful for you?
> > 
> > In my opinion, (A) seems more general way, and (B) is easier to implement
> > because it is similer to my original implementation...
> 
> I very much prefer (A). (B) (I feel) is an overengineered solution to a
> problem that can be solved more simply by (A).

OK, I'm continuing to develop the patches which provides
register_/unregister_kprobe_list().
After debugging it, I'll post it as soon as possible.

Thanks,
Comment 22 Frank Ch. Eigler 2007-05-15 19:36:44 UTC
Another way may be the idea from this posting re. markers:
http://lkml.org/lkml/2007/5/10/475

That is: defer synchronization until the last probe in a given module is
unregistered.
Comment 23 Ananth Mavinakayanahalli 2007-05-16 06:51:11 UTC
Yes, that is possible if you ensure that all of the probes registered by the
said module are going to be unregistered at the same time. The RCU semantics
won't allow for postponing synchronize_sched() to later, if that is not the
case. In such a deferred/intermittant unregistration, results are unpredictable.

The one advantage of the list based approach is that it'll handle both register
and unregister with just one call, in contrast to 'n' calls in the refcounted
solution. The refcount solution doesn't handle the registration case (not that
it is a bottleneck).
Comment 24 Masami Hiramatsu 2007-05-17 07:24:10 UTC
Created attachment 1845 [details]
[kprobe] batch registration take 2

Here is a set of patches which introduce new list interfaces
for speeding up registration/unregistration process of kprobes.

Currently, the unregister_*probe() function takes a long time to
unregister specified probe because it waits the rcu synchronization
after each unregistration. So we need to introduce new method for
unregistering a lot of probes at a time.

So, these patches introduce batch registration/unregistration interfaces.
It also introduce a "chain" list_head structure as a new member of 
the kprobe structure. 

To use these interfaces, at first, add each "chain" member of all probes 
which you want to register/unregister at a time to a list_head structure.
And, invoke *register_*probes_list() to register/unregister all probes 
on the list.

Note that you MUST add only same type of probes(kprobe, kretprobe, 
and jprobe) into a list, because these registration/unregistration 
functions can't identify the type of probes on the list automatically.

If one of probes on a list gets an error, batch registration functions
unregister all registered probes on the list. And it returns
error code.

Here is an example code.
---
struct list_head probe_list;
struct kprobe *kp;

INIT_LIST_HEAD(&probe_list);
for (i = 0; i < NR_PROBES; i++) { 
	kp = kzalloc(sizeof(struct kprobe), GFP_KERNEL);
	INIT_LIST_HEAD(&kp->chain);
	list_add(&probe_list, &kp->chain);
}
...
register_kprobes_list(&probe_list);
...
unregister_kprobes_list(&probe_list);
---

Thanks,
Comment 25 Masami Hiramatsu 2007-05-17 07:25:50 UTC
Created attachment 1846 [details]
[test] batch (un)registration test program

Here is a test program for the batch registration interface.
You can check that interfaces work correctly as below.

$ make
$ sudo ./batch_test.sh
Comment 26 Masami Hiramatsu 2007-05-17 07:27:12 UTC
Created attachment 1847 [details]
[stap] support batch unregistration of kprobes


This patch add new batch unregistration support feature to systemtap.
This also includes autoconf support.
Comment 27 Ananth Mavinakayanahalli 2007-05-17 07:58:46 UTC
Very quick eyeballing over the kernel patches:

- __unregister_kprobes_list_part is calling synchronize_sched() with
kprobe_mutex held. That is incorrect.
- Some codingstyle issues:
   - if (num > 0 && ++count >= num) break; -> needs to be broken into two lines
here and other places
   - In newer routines added, please leave a blank line between local
declarations and code
   - +               if (p->post_handler){ -> need a space between ) and { here
and elsewhere

- Can some code reordering be done to eliminate forward declarations?

Anil, can you please do a review?
Comment 28 Masami Hiramatsu 2007-05-17 11:28:26 UTC
(In reply to comment #27)
> Very quick eyeballing over the kernel patches:

Thank you for reviewing.

> - __unregister_kprobes_list_part is calling synchronize_sched() with
> kprobe_mutex held. That is incorrect.
> - Some codingstyle issues:
>    - if (num > 0 && ++count >= num) break; -> needs to be broken into two lines
> here and other places
>    - In newer routines added, please leave a blank line between local
> declarations and code
>    - +               if (p->post_handler){ -> need a space between ) and { here
> and elsewhere
> 
> - Can some code reordering be done to eliminate forward declarations?

I fixed these issues except for is_kretprobe() forward declaration, 
because it depends on the pre_handler_kretprobe().
I think the pre_handler_kretprobe() can be moved just below the 
cleanup_rp_inst(). What would you think about it?

Comment 29 Frank Ch. Eigler 2007-05-17 11:46:23 UTC
One additional consideration may be a lost cause, but I'll still mention it:
by changing struct kprobe, it becomes difficult or impossible to backport
this work into an ABI-frozen release stream like RHEL.
Comment 30 Frank Ch. Eigler 2007-05-17 22:11:18 UTC
(In reply to comment #26)
> Created an attachment (id=1847)
> [stap] support batch unregistration of kprobes
> This patch add new batch unregistration support feature to systemtap.
> This also includes autoconf support.

Nicely done.
One issue: the linked list chaining may need to consider the rc
from the k[ret]probe_register call.  Do we want failed probes
on that list?
Comment 31 Frank Ch. Eigler 2007-05-17 22:15:09 UTC
(In reply to comment #29)
> One additional consideration may be a lost cause, but I'll still mention it:
> by changing struct kprobe, it becomes difficult or impossible to backport
> this work into an ABI-frozen release stream like RHEL.

Perhaps it's worth comparing this linked-list method to an ABI-compatible one
where the new k[ret]_unregister_list functions take an array of k[ret]probe
pointers.  Then the k[ret]probe structs don't need to be changed, and thus
backporting would be easy.  It would also simplify the translator changes.

(I argued against this approach earlier, hoping to make the kprobes layer
do all this automatically.  But if it can't, then at least we can try to
make it as pure an extension as possible.)
Comment 32 Ananth Mavinakayanahalli 2007-05-18 07:34:24 UTC
The refcount approach is good if all probe unregister calls happen in one go.
Sadly, Mathieu's refcount patch is marker specific and will need work to make it
generic (I think that can be done - make the refcount agnostic to
instrumentation - static or dynamic). Also, the patch is contained in module.c -
which means its only useful for probe "modules" - again, not that big a deal,
'cos we don't (currently) have non-modular probe users.

Yes, it'd be interesting to see if we can handle all of this at the module
level, without much duplication between markers and kprobes.
Comment 33 Masami Hiramatsu 2007-05-18 08:47:59 UTC
(In reply to comment #31)
> (In reply to comment #29)
> > One additional consideration may be a lost cause, but I'll still mention it:
> > by changing struct kprobe, it becomes difficult or impossible to backport
> > this work into an ABI-frozen release stream like RHEL.
> 
> Perhaps it's worth comparing this linked-list method to an ABI-compatible one
> where the new k[ret]_unregister_list functions take an array of k[ret]probe
> pointers.  Then the k[ret]probe structs don't need to be changed, and thus
> backporting would be easy.  It would also simplify the translator changes.

Good idea. I think that it is reasonable to use a pointer array of *probes. 

Since we can't identify the type of probes on the list before registering it, 
we have to make lists for each types of probes. If we take symmetry of API into 
consideration, we may not merge those list.
So, we don't need flexibility of those list. I mean, we may not add additional 
probes to this list, and may not delete some probes from it.
Comment 34 Masami Hiramatsu 2007-05-18 08:48:48 UTC
(In reply to comment #30)
> One issue: the linked list chaining may need to consider the rc
> from the k[ret]probe_register call.  Do we want failed probes
> on that list?

Thanks, I'll fix that.
Comment 35 Masami Hiramatsu 2007-05-25 12:50:56 UTC
Created attachment 1860 [details]
[kprobe] batch registration take 3

Hi Frank and Ananth,
 
I modified my patches to suit to Frank's suggestion and attached
here. 

However, I think it might become a bit complex for this purpose.
In my opinion, we might as well apply this patch only to the 
ABI-frozen kernels, and push the patches based on the take2 to
the upstream. What would you think about this idea?

Descriptions:
Here is a set of patches which introduce new list interfaces
for speeding up registration/unregistration process of kprobes.

Currently, the unregister_*probe() function takes a long time to
unregister specified probe because it waits the rcu synchronization
after each unregistration. So we need to introduce new method for
unregistering a lot of probes at a time.

So, these patches introduce batch registration/unregistration interfaces.
It also introduce "probe_container" data structure for listing up the 
probes which will be registered/unregistered.

To use these interfaces, at first, allocate probe_container structures for
each probes, and assign the address of the kprobe structure to the 

probe_container.kp. After that, add those probe_containers to a list_head
structure. Finally, invoke *register_*probes_list() to register/unregister 

all probes on the list.

Note that you MUST add only same type of probes(kprobe, kretprobe, 
and jprobe) into a list, because these registration/unregistration 
functions can't identify the type of probes on the list automatically.

If one of probes on a list gets an error, batch registration functions
unregister all registered probes on the list. And it returns
error code.

Here is an example code.
---
struct list_head probe_list;
struct probe_container *pc;

INIT_LIST_HEAD(&probe_list);
for (i = 0; i < NR_PROBES; i++) { 
	pc = kzalloc(sizeof(struct probe_container), GFP_KERNEL);
	pc->kp = kzalloc(sizeof(struct kprobe), GFP_KERNEL);
	INIT_LIST_HEAD(&pc->list);
	list_add(&probe_list, &pc->list);
}
...
register_kprobes_list(&probe_list);
...
unregister_kprobes_list(&probe_list);
---

Thanks,
Comment 36 Masami Hiramatsu 2007-05-25 12:53:31 UTC
Created attachment 1861 [details]
[test] batch (un)registration test program for take3 patchs.
Comment 37 Frank Ch. Eigler 2007-05-25 14:52:44 UTC
Thank you very much for your ongoing efforts on this.

I am sorry that I was not completely clear in what I was
proposing for the ABI-frozen case, which is the same thing
someone else on lkml said:

void unregister_kprobe_list (struct kprobe** array, size_t count);

That does not require any extra structures or lists, and does
not need to change how registration is done.  What do you think?
Comment 38 Masami Hiramatsu 2007-05-30 14:46:00 UTC
Thank you for your advice.

(In reply to comment #37)
> Thank you very much for your ongoing efforts on this.
> 
> I am sorry that I was not completely clear in what I was
> proposing for the ABI-frozen case, which is the same thing
> someone else on lkml said:
> 
> void unregister_kprobe_list (struct kprobe** array, size_t count);
> 
> That does not require any extra structures or lists, and does
> not need to change how registration is done.  What do you think?
> 

Would you mean introducing only unregister_kprobes_list() for batch
unregistration, and no need to introduce register_*probes_list() for
batch registration?

I think that is one idea. Since the type of probes could be
identified when we unregistering probes, we can integrate all probes
in an array.

However, from the viewpoint of symmetry of interfaces, I think 
we should introduce registration interfaces. What would you think?
Comment 39 Frank Ch. Eigler 2007-05-30 14:52:51 UTC
> Would you mean introducing only unregister_kprobes_list() for batch
> unregistration, and no need to introduce register_*probes_list() for
> batch registration?

I know of no performance requirement for a batch registration call,
but of course symmetry suggests that there be one.  It would have to
be carefully written to handle mid-array registration errors.  If it
existed, systemtap could use it, though it wouldn't accomplish much.
Comment 40 Frank Ch. Eigler 2007-10-02 20:57:57 UTC
Is there any progress on posting at least the batch-unregistration API on LKML?
Comment 41 Masami Hiramatsu 2008-02-26 19:43:56 UTC
Created attachment 2291 [details]
[kprobe] batch registration for 2.6.25-rc2-mm1

I updated a series of patches against for 2.6.25-rc2-mm1.
This patchset includes a bugfix of register_kretprobe.
Comment 42 Masami Hiramatsu 2008-02-26 19:46:30 UTC
Created attachment 2292 [details]
[stap] support batch unregistration of kprobes take 2

This patch adds batch unregistration support to systemtap.
Comment 43 Masami Hiramatsu 2008-02-26 19:47:52 UTC
Created attachment 2293 [details]
[test] batch (un)registration test program for new patchs.

Test tools for new batch registration/unregistration interface.
Comment 44 Frank Ch. Eigler 2008-02-26 19:55:14 UTC
Looks great (including the systemtap patch).
Please feel free to commit to systemtap as soon as upstream
kernel accepts the API.
Comment 45 Ananth Mavinakayanahalli 2008-02-27 07:16:47 UTC
Thanks Masami for pursuing this.

Overall, the changes look good. Some observations:
a. Andrew has taken in my patch to introduce CONFIG_KRETPROBES; your patches may
have to be updated to accomodate that (mostly cosmetic change)
b. I guess its good to indicate in the doc *how* one can use this interface -
probably indicate the array allocation and usage.
c. The first patch in this series can go upstream independent of the bulk
registration set.
Comment 46 Srinivasa DS 2008-02-27 13:49:46 UTC
Masami 
 I successfully tested your patch on x86_64, But Iam not able to run your
testcase on powerpc systems, as registering of probe fails. 
==================================================
root@llm27lp1 batchtest-20080226]# ./batch_test.sh
insmod: error inserting './batch_test.ko': -1 Operation not permitted
test failed (case 1)
<<test case 1>>
UNEXPECTED: probe install error: -22
===================================================
Comment 47 Masami Hiramatsu 2008-02-28 03:55:30 UTC
(In reply to comment #45)
> Thanks Masami for pursuing this.
> 
> Overall, the changes look good. Some observations:
> a. Andrew has taken in my patch to introduce CONFIG_KRETPROBES; your patches may
> have to be updated to accomodate that (mostly cosmetic change)

OK, but could you tell me how can I get the latest -mm tree which includes your
patches?

> b. I guess its good to indicate in the doc *how* one can use this interface -
> probably indicate the array allocation and usage.

Sure, I'll add it.

> c. The first patch in this series can go upstream independent of the bulk
> registration set.

OK, thanks.
Comment 48 Masami Hiramatsu 2008-02-28 04:05:30 UTC
(In reply to comment #46)
> Masami 
>  I successfully tested your patch on x86_64, But Iam not able to run your
> testcase on powerpc systems, as registering of probe fails. 
> ==================================================
> root@llm27lp1 batchtest-20080226]# ./batch_test.sh
> insmod: error inserting './batch_test.ko': -1 Operation not permitted
> test failed (case 1)
> <<test case 1>>
> UNEXPECTED: probe install error: -22
> ===================================================

Unfortunately, I don't have the powerpc system. But the test case #1 just use 
register_kprobe(), so I think it comes from some architecture differences.
For example, this testtool might not work on ia64, because kp.addr should 
be initialized by *(void **)func on ia64, instead of (void*)func.
I might better use symbol_name field instead of addr field...
Comment 49 Srinivasa DS 2008-02-28 15:18:03 UTC
Masami
 I built stap from systemtap patch attached here and then executed the systemtap
tests on powerpc system without any problem. 
  
Thanks
 Srinivasa Ds
Comment 50 Jim Keniston 2008-02-28 23:20:14 UTC
Here's a review.  A lot of these comments are nits, but I do have
some concerns around the RCU stuff.  See below.

General:
I like this approach.  In particular, arrays of pointers-to-probes
make the most sense to me.  The top/bottom division works nicely.

[un]register_bulk_*probes is kind of an ugly name.  Why not
just [un]register_*probes?

kprobe-fix-kretprobe-nullcheck.patch
===================================
OK.  I like the code cleanup, too.

kprobe-batch_registration.patch
===============================
> +	/* Initialize multiprobe list for __unregister_kprobe_bottom() */
> +	INIT_LIST_HEAD(&p->list);
No need for the comment.  Initializing a data structure shouldn't
require justification.

__unregister_kprobe_bottom:
Your code here propagates what seems to be a bug in the existing code,
related to locking and/or RCU.

Specifically, the existing unregister_kprobe() code calls
list_del_rcu() at one point that is AFTER the call to synchronize_sched()
(and no synchronization call follows) and is also not protected by
kprobe_mutex.  In your code, this is the list_del_rcu() call in
__unregister_kprobe_bottom().

According to my (admittedly intermittent) understanding of RCU, the
proper usage is:
	lock
	list_del_rcu(...)
	unlock
	synchronize
But what I see is
	lock
	...
	unlock
	synchronize
	if (cleanup) {
		...
		list_del_rcu(...)
	}
I think the list_del_rcu() call in question could be changed to a
plain old list_del() because old_p has already been disconnected
from the probe table at that point.  For that same reason, I think
it's OK that kprobe_mutex isn't held.  But the way it's coded,
it looks bad.  In any case, a comment is warranted.

> +		__unregister_kprobe_bottom(kp, list_empty(&kp->list) ||
> +					   (kp->list.next == kp->list.prev));
Once in the existing code and 3 times in the patches, there is this
check to see whether the probe being unregistered is the last probe
connected to an aggregate probe.  It might be clearer and simpler if
you abstract this check into a separate function -- e.g.:
static inline int singleton_list(const struct list_head *head)
{
	return (head->next == head->prev && !list_empty(head));
}
Then the above changes to
		__unregister_kprobe_bottom(kp, list_empty(&kp->list) ||
					   singleton_list(&kp->list));
and
	     p->list.next == &old_p->list && p->list.prev == &old_p->list)) {
changes to
	     singleton_list(&old_p->list)) {

In unregister_kprobe:
> +	if (cleanup >= 0) {
> +		synchronize_sched();
This looks wrong.  __unregister_kprobe_top() can call list_del_rcu()
AND return cleanup==0.  So shouldn't you ALWAYS call synchronize_sched()?

In register_bulk_kprobes:
> +	for (i = 0; i < num; i++) {
> +		ret = __register_kprobe(kprobes[i],
> +				(unsigned long)__builtin_return_address(0));
I don't know how much __builtin_return_address() costs on the
various architectures, but shouldn't you compute it just once,
outside the loop?

kretprobe-batch_registration.patch
==================================
> +	for (i = 0; i < num; i++) {
> +		ret = __register_kretprobe(rps[i],
> +				(unsigned long)__builtin_return_address(0));
Again, move the __builtin_return_address() call outside the loop.

jprobe-batch_registration.patch
===============================
> +	for (i = 0; i < num; i++) {
> ...
> +		ret = __register_kprobe(&jp->kp,
> +				(unsigned long)__builtin_return_address(0));
Again, move the __builtin_return_address() call outside the loop.

doc-batch_registration.patch
============================
> +There are also register_/unregsiter_bulk_*probes() functions for
s/unregsiter/unregister/

> +batch registration/unregistration of a group of *probes. These
> +functions can speed unregistration process up when you have to
functions can speed up the unregistration process when you have to

Suppose somebody calls
	unregister_bulk_kprobes(kps, 2);
What should happen if kps[1] is registered but kps[0] isn't?  Do you
stop because of the error on kps[0], or keep going?  This would be
a good thing to document.  kprobes.txt currently doesn't make any
promises about unregistering an unregistered probe, but users may
have different expectations regarding bulk unregistration.

> +Inserts specified probes into each specified place. If any error
> +occurs while registering, all registered probes on the array are
> +safely unregistered before returning.
How about this?
Registers each of the num probes in the specified array.  If any
error occurs during registration, all probes in the array, up to
the bad probe, are safely unregistered before the register_bulk_*
function returns.

Also, I think the first paragraph in the API Reference section
should be updated.  The first sentence is:
  The Kprobes API includes a "register" function and an "unregister"
  function for each type of probe.
I'd add the following sentence:
  The API also includes "register_bulk" and "unregister_bulk"
  functions for (un)registering arrays of probes.

batch_test.c:
=============
>  * boost probe bench
>  * Copyright (c) 2006 Hitachi,Ltd.,
Adjust copied-and-pasted comment.

> #define CALC_TIME(title, cmd) {cycles_t c; \
> 	c = get_cycles(); \
> 	cmd; \
> 	c = get_cycles() - c; \
> 	printk(title ":%lld mc\n", c);}
Typical kernel parlance (to eliminate the "};" at the end that results
from a typical invocation) would be:
#define CALC_TIME(title, cmd) do { \
	cycles_t c = get_cycles(); \
	cmd; \
	c = get_cycles() - c; \
	printk(title ":%lld mc\n", c); \
} while (0)

> 	/* one probe inserted in different places without batch function */
> 	printk("<<test case 1>>\n");
Fix comment.  It's just 1 probe in 1 place.


> /*---- two probes inserted in different places ----*/
> 	printk("<<test case 3>>\n");
...
> /*---- two probes inserted in different places ----*/
> 	printk("<<test case 6>>\n");
...
> /*---- two probes inserted in different places ----*/
> 	printk("<<test case 9>>\n");
These look like the same thing.  Maybe change "probes" to "kprobes"
or "kretprobes" or "jprobes" as appropriate.  Ditto elsewhere.

> /*---- two probes inserted in a same place ----*/
> 	printk("<<test case 10>>\n");
Clarify why you expect this to fail: because you can't insert
multiple jprobes at the same address.

BTW, English nit: s/a same place/the same place/
Comment 51 Ananth Mavinakayanahalli 2008-02-29 06:30:24 UTC
Subject: Re:  request new batch
	registration/unregistration API

> OK, but could you tell me how can I get the latest -mm tree which includes your
> patches?

Andrew is yet to release it; you'll probably have to wait for
2.6.25-rc3-mm1
Comment 52 Masami Hiramatsu 2008-03-01 00:46:07 UTC
Hi Jim,

(In reply to comment #50)
> Here's a review.  A lot of these comments are nits, but I do have
> some concerns around the RCU stuff.  See below.

Thank you for detailed comment!

> [un]register_bulk_*probes is kind of an ugly name.  Why not
> just [un]register_*probes?

I think it is not a noticeable change.
Users will be easily mistype and misread it.
If you have another good name, please let me know.

> kprobe-batch_registration.patch
> ===============================
> > +	/* Initialize multiprobe list for __unregister_kprobe_bottom() */
> > +	INIT_LIST_HEAD(&p->list);
> No need for the comment.  Initializing a data structure shouldn't
> require justification.

OK, I'll remove it.

> __unregister_kprobe_bottom:
> Your code here propagates what seems to be a bug in the existing code,
> related to locking and/or RCU.
> 
> Specifically, the existing unregister_kprobe() code calls
> list_del_rcu() at one point that is AFTER the call to synchronize_sched()
> (and no synchronization call follows) and is also not protected by
> kprobe_mutex.  In your code, this is the list_del_rcu() call in
> __unregister_kprobe_bottom().

Thank you for finding it. I'll fix that.

> I think the list_del_rcu() call in question could be changed to a
> plain old list_del() because old_p has already been disconnected
> from the probe table at that point.  For that same reason, I think
> it's OK that kprobe_mutex isn't held.  But the way it's coded,
> it looks bad.  In any case, a comment is warranted.

I'll check whether it is safety.

> > +		__unregister_kprobe_bottom(kp, list_empty(&kp->list) ||
> > +					   (kp->list.next == kp->list.prev));
> Once in the existing code and 3 times in the patches, there is this
> check to see whether the probe being unregistered is the last probe
> connected to an aggregate probe.  It might be clearer and simpler if
> you abstract this check into a separate function -- e.g.:
> static inline int singleton_list(const struct list_head *head)
> {
> 	return (head->next == head->prev && !list_empty(head));
> }

That is a good idea.

> In unregister_kprobe:
> > +	if (cleanup >= 0) {
> > +		synchronize_sched();
> This looks wrong.  __unregister_kprobe_top() can call list_del_rcu()
> AND return cleanup==0.  So shouldn't you ALWAYS call synchronize_sched()?

Please don't worry, I use ">=";-).
when cleanup == 0(or 1), it always calls synchronize_sched().
if cleanup < 0, there is an error in __unregister_kprobe_top().

> In register_bulk_kprobes:
> > +	for (i = 0; i < num; i++) {
> > +		ret = __register_kprobe(kprobes[i],
> > +				(unsigned long)__builtin_return_address(0));
> I don't know how much __builtin_return_address() costs on the
> various architectures, but shouldn't you compute it just once,
> outside the loop?

Indeed. I'll get it outside.

> doc-batch_registration.patch
> ============================
[snip]
> Suppose somebody calls
> 	unregister_bulk_kprobes(kps, 2);
> What should happen if kps[1] is registered but kps[0] isn't?  Do you
> stop because of the error on kps[0], or keep going?  This would be
> a good thing to document.  kprobes.txt currently doesn't make any
> promises about unregistering an unregistered probe, but users may
> have different expectations regarding bulk unregistration.

Actually, that case is not checked, so it will break the kernel.
(NOTE: register_kprobe also does not check it.)

> > +Inserts specified probes into each specified place. If any error
> > +occurs while registering, all registered probes on the array are
> > +safely unregistered before returning.
> How about this?
> Registers each of the num probes in the specified array.  If any
> error occurs during registration, all probes in the array, up to
> the bad probe, are safely unregistered before the register_bulk_*
> function returns.

Thanks, I'll update.

> 
> Also, I think the first paragraph in the API Reference section
> should be updated.  The first sentence is:
>   The Kprobes API includes a "register" function and an "unregister"
>   function for each type of probe.
> I'd add the following sentence:
>   The API also includes "register_bulk" and "unregister_bulk"
>   functions for (un)registering arrays of probes.

Sure.

> batch_test.c:
> =============
> >  * boost probe bench
> >  * Copyright (c) 2006 Hitachi,Ltd.,
> Adjust copied-and-pasted comment.

Oh, OK. I'll fix this test code.

Thank you again,
Comment 53 Frank Ch. Eigler 2008-03-01 00:57:05 UTC
(In reply to comment #52)

> > [un]register_bulk_*probes is kind of an ugly name.  Why not
> > just [un]register_*probes?
> 
> I think it is not a noticeable change.
> Users will be easily mistype and misread it.

But the signatures of the two functions are different,
so it is an easy compile-time error to mix them up.
Comment 54 Masami Hiramatsu 2008-03-03 19:12:47 UTC
(In reply to comment #53)
> (In reply to comment #52)
> 
> > > [un]register_bulk_*probes is kind of an ugly name.  Why not
> > > just [un]register_*probes?
> > 
> > I think it is not a noticeable change.
> > Users will be easily mistype and misread it.
> 
> But the signatures of the two functions are different,
> so it is an easy compile-time error to mix them up.

OK, there are already same kinds of naming
like as alloc_pages/alloc_page in the kernel.

BTW, I think it seems better to inline register_*probe as below;
---
static inline int register_kprobe(struct kprobe *p)
{
       return register_kprobes(&p, 1);
}
static inline void unregister_kprobe(struct kprobe *p)
{
       unregister_kprobes(&p, 1);
}
---

Thanks,
Comment 55 Ananth Mavinakayanahalli 2008-03-04 07:54:29 UTC
Subject: Re:  request new batch
	registration/unregistration API

> (In reply to comment #53)
> > (In reply to comment #52)
> > 
> > > > [un]register_bulk_*probes is kind of an ugly name.  Why not
> > > > just [un]register_*probes?
> > > 
> > > I think it is not a noticeable change.
> > > Users will be easily mistype and misread it.
> > 
> > But the signatures of the two functions are different,
> > so it is an easy compile-time error to mix them up.
> 
> OK, there are already same kinds of naming
> like as alloc_pages/alloc_page in the kernel.
> 
> BTW, I think it seems better to inline register_*probe as below;
> ---
> static inline int register_kprobe(struct kprobe *p)
> {
>        return register_kprobes(&p, 1);
> }
> static inline void unregister_kprobe(struct kprobe *p)
> {
>        unregister_kprobes(&p, 1);
> }

You can't EXPORT_SYMBOL(_GPL) an inline function.. can you?
:-)

Ananth
Comment 56 Masami Hiramatsu 2008-03-04 13:24:18 UTC
(In reply to comment #55)
> > BTW, I think it seems better to inline register_*probe as below;
> > ---
> > static inline int register_kprobe(struct kprobe *p)
> > {
> >        return register_kprobes(&p, 1);
> > }
> > static inline void unregister_kprobe(struct kprobe *p)
> > {
> >        unregister_kprobes(&p, 1);
> > }
> 
> You can't EXPORT_SYMBOL(_GPL) an inline function.. can you?
> :-)

Even though, register_*probes are exported as GPL symbol.
So, you can't use register_*probe from non-GPL module. 

Anyway, it just comes from code maintainability, so we
can deal with it separately(ex. as a cleanup patch etc.).

Thanks,
Comment 57 Ananth Mavinakayanahalli 2008-03-04 15:58:04 UTC
Subject: Re:  request new batch
	registration/unregistration API

> > 
> > You can't EXPORT_SYMBOL(_GPL) an inline function.. can you?
> > :-)
> 
> Even though, register_*probes are exported as GPL symbol.
> So, you can't use register_*probe from non-GPL module. 

Aside from whether they are _GPL or not, its not the right thing to do.

> Anyway, it just comes from code maintainability, so we
> can deal with it separately(ex. as a cleanup patch etc.).

Nope! I don't think we'd want to make currently exported functions
inline.
Comment 58 Masami Hiramatsu 2008-03-04 16:15:40 UTC
(In reply to comment #57)
> > Anyway, it just comes from code maintainability, so we
> > can deal with it separately(ex. as a cleanup patch etc.).
> 
> Nope! I don't think we'd want to make currently exported functions
> inline.

Would you concern about ABI compatibility?
I also concern about maintainability, because (un)register_*probes can
process 1 to n probes. I think, these would better be merged into
1 implementation internally.

Comment 59 Masami Hiramatsu 2008-03-04 16:33:20 UTC
(In reply to comment #55)
> You can't EXPORT_SYMBOL(_GPL) an inline function.. can you?

(I might mis-explain below...)
In my idea, those inline functions are defined in linux/kprobes.h.
So, you don't need to export those symbols.
However, this change breaks ABI compatibility.
Comment 60 Ananth Mavinakayanahalli 2008-03-05 04:32:28 UTC
Subject: Re:  request new batch
	registration/unregistration API

> > You can't EXPORT_SYMBOL(_GPL) an inline function.. can you?
> 
> (I might mis-explain below...)
> In my idea, those inline functions are defined in linux/kprobes.h.
> So, you don't need to export those symbols.
> However, this change breaks ABI compatibility.

1. I don't want to break the ABI unless its absolutely necessary.
2. Just the sheer size of the register/unregister_* functions makes them
   a bad candidate for inlining.
3. Just dropping the EXPORT_SYMBOLs without a lead time is frowned upon
   in the community. And removing them => KABI breakage.

There have been cases recently where bigger functions that were
initially inline, have been converted to calls; we seem to be doing it
backwards here.

We have *probes as an _interface_ to the kernel, and I'd like to leave
it that way, rather than make them just header defined inline functions.
Comment 61 Masami Hiramatsu 2008-03-05 05:03:10 UTC
(In reply to comment #60)
> 1. I don't want to break the ABI unless its absolutely necessary.
> 2. Just the sheer size of the register/unregister_* functions makes them
>    a bad candidate for inlining.
> 3. Just dropping the EXPORT_SYMBOLs without a lead time is frowned upon
>    in the community. And removing them => KABI breakage.
> 
> There have been cases recently where bigger functions that were
> initially inline, have been converted to calls; we seem to be doing it
> backwards here.
> 
> We have *probes as an _interface_ to the kernel, and I'd like to leave
> it that way, rather than make them just header defined inline functions.
> 

Thank you. so I'll leave these ABIs and symbols.
And, I'll change just the implementation of unregister_*probe to use
unregister_*probes for avoiding code duplication.

Thanks again,
Comment 62 Jim Keniston 2008-03-08 00:21:54 UTC
(In reply to comment #61)

I promised Ananth and Masami to comment on this, so here goes.

> (In reply to comment #60)
> > 1. I don't want to break the ABI unless its absolutely necessary.

I don't see how changing an exported function to an inline function in kprobes.h
breaks the ABI.  The function is still available.  Are you concerned that it is
now available to non_GPL modules?

The main danger (an unlikely one) that I see in changing a __kprobes function to
an inline function is that gcc may at some time decide NOT to inline it, which
would leave it open for probing.  Some time back, a kprobes maintainer had to
change a bunch of inline functions to __kprobes at akpm's instigation, for that
reason. 

> > 2. Just the sheer size of the register/unregister_* functions makes them
> >    a bad candidate for inlining.

But the register/unregister_* functions would become very small -- just one-line
calls to the corresponding bulk-registration functions.

> > 3. Just dropping the EXPORT_SYMBOLs without a lead time is frowned upon
> >    in the community. And removing them => KABI breakage.
> > 
> > There have been cases recently where bigger functions that were
> > initially inline, have been converted to calls; we seem to be doing it
> > backwards here.
> > 
> > We have *probes as an _interface_ to the kernel, and I'd like to leave
> > it that way, rather than make them just header defined inline functions.
> > 
> 
> Thank you. so I'll leave these ABIs and symbols.
> And, I'll change just the implementation of unregister_*probe to use
> unregister_*probes for avoiding code duplication.
> 
> Thanks again,

I think it's fine to leave the [un]register_*probe functions as __kprobe
symbols, and reimplement them as one-line calls to the bulk functions.  I'm
always in favor of code simplification and reduction of code duplication, which
I think will be served here.  But be sure we don't lose track of who the real
caller is -- i.e., remember to pass around the right value from
__builtinr_retunr_address().
Comment 63 Masami Hiramatsu 2008-03-08 00:25:19 UTC
Created attachment 2312 [details]
[kprobe] batch registration for 2.6.25-rc3-mm1

Here is the updated patchset.
- Change interfaces to (un)register_*probes.
- Add a patch for list_singleton
- Check failures in unregister_*probes.

Thanks
Comment 64 Masami Hiramatsu 2008-03-08 00:44:26 UTC
(In reply to comment #62)
> I think it's fine to leave the [un]register_*probe functions as __kprobe
> symbols, and reimplement them as one-line calls to the bulk functions.  I'm
> always in favor of code simplification and reduction of code duplication, which
> I think will be served here.  But be sure we don't lose track of who the real
> caller is -- i.e., remember to pass around the right value from
> __builtinr_retunr_address().

Hmm, so how about this?
---
static int __register_kprobes(struct kprobe **kps, int num, unsigned long addr) {
  ...(real implementation)
}
int __kprobes register_kprobe(struct kprobe *p)
{
    return __register_kprobes(&p, 1, __builtin_return_address(0));
}
int __kprobes register_kprobes(struct kprobe **kps, int num)
{
    return __register_kprobes(kps, num, __builtin_return_address(0));
}
---
(new patchset already reimplemented unregister_*probe.)
Comment 65 Jim Keniston 2008-03-08 01:09:03 UTC
(In reply to comment #64)
...
> Hmm, so how about this?
> ---
> static int __register_kprobes(struct kprobe **kps, int num, unsigned long addr) {
>   ...(real implementation)
> }
> int __kprobes register_kprobe(struct kprobe *p)
> {
>     return __register_kprobes(&p, 1, __builtin_return_address(0));
> }
> int __kprobes register_kprobes(struct kprobe **kps, int num)
> {
>     return __register_kprobes(kps, num, __builtin_return_address(0));
> }
> ---
> (new patchset already reimplemented unregister_*probe.)
> 

That looks fine to me.  Thanks.
Comment 66 Ananth Mavinakayanahalli 2008-03-10 06:53:24 UTC
Proposal in comment 64 is fine.

I don't like the idea of exporting a core kernel functionality through just a
header file; so removing the EXPORT_SYMBOL is out of the question. Wrapping the
main calls to call the same routine internally is OK.

Also, patch 1 of the series (add_list_singleton.patch) can just go upstream
independently.
Comment 67 Masami Hiramatsu 2008-03-11 20:59:29 UTC
Created attachment 2319 [details]
[kprobe] batch registration for 2.6.25-rc5-mm1

Updated patches against 2.6.25-rc5-mm1
 - reimplement register_*probe as wrapping register_*probes
 - fix a bug in register_jprobes.

Thanks,
Comment 68 Masami Hiramatsu 2008-03-11 21:06:27 UTC
Created attachment 2320 [details]
[test] batch (un)registration test program for new patchs (for *register_*probes)

Updated testtool for (un)register_*probes().
Comment 69 Masami Hiramatsu 2008-03-11 21:08:28 UTC
Created attachment 2321 [details]
[stap] support batch unregistration of kprobes take 3

Updated patch for systemtap.
Comment 70 Ananth Mavinakayanahalli 2008-03-12 11:49:36 UTC
Overall the patches look fine. Thanks!

Any reason why there is code duplication in jprobe-batch_registration.patch?
Basically the portions that assign entry, pre_ and break_handlers are duplicated.
Comment 71 Ananth Mavinakayanahalli 2008-03-12 13:06:26 UTC
Created attachment 2322 [details]
Fix testtool to run on powerpc also

Masami,
I needed this patch to get the test module to run seamlessly on powerpc
Comment 72 Ananth Mavinakayanahalli 2008-03-12 13:06:26 UTC
Created attachment 2323 [details]
Fix testtool to run on powerpc also

Masami,
I needed this patch to get the test module to run seamlessly on powerpc
Comment 73 Masami Hiramatsu 2008-03-12 15:31:49 UTC
Created attachment 2324 [details]
[kprobe] batch registration for 2.6.25-rc5-mm1 (fixed)

(In reply to comment #70)
> Overall the patches look fine. Thanks!
> 
> Any reason why there is code duplication in jprobe-batch_registration.patch?
> Basically the portions that assign entry, pre_ and break_handlers are
duplicated.

Oops, it was my fault. I uploaded an old patches...
Here is new (and correct) one.
Comment 74 Masami Hiramatsu 2008-03-21 23:34:43 UTC
Created attachment 2334 [details]
[test-kprobe] add batch registration testcases to test_kprobes.c

These patches add batch registration testcases to test_kprobes.c and introduce
indirect function calls instead of a tricky code for preventing gcc
optimization.
Comment 75 Ananth Mavinakayanahalli 2008-03-24 05:00:33 UTC
Masami,
Isn't it preferable to have a dummy inline asm like

asm ("")

in the probed function instead? Check 38332cb98772f5ea757e6486bed7ed0381cb5f98
in Linus' kernel for an example.
Comment 76 Masami Hiramatsu 2008-03-25 15:40:43 UTC
(In reply to comment #75)
> Masami,
> Isn't it preferable to have a dummy inline asm like
> 
> asm ("")
> 
> in the probed function instead? Check 38332cb98772f5ea757e6486bed7ed0381cb5f98
> in Linus' kernel for an example.

Ananth, I just worried about portability.
Is that inline asm supported on any arch? If so, I can use the idea.
(Anyway, it may require some comment for explaining its purpose)
Comment 77 Ananth Mavinakayanahalli 2008-03-26 03:10:22 UTC
Subject: Re:  request new batch
	registration/unregistration API

On Tue, Mar 25, 2008 at 03:40:43PM -0000, mhiramat at redhat dot com wrote:
> 
> Ananth, I just worried about portability.
> Is that inline asm supported on any arch? If so, I can use the idea.
> (Anyway, it may require some comment for explaining its purpose)

Yes, it will require a comment to explain its purpose, but if you see
the commit I cited in the earlier update, that change is in
include/linux/<file> which is arch agnostic. Basic asm directives
without reference to specific arch registers are supposedly portable.

Maybe its worth a try similar to the time.h change.
Comment 78 Masami Hiramatsu 2008-03-27 00:59:48 UTC
Created attachment 2338 [details]
[test-kprobe] add batch registration testcases to test_kprobes.c take2

Ananth, I updated the patches to use dummy asm("").
Could you test it on ppc64?
Comment 79 Ananth Mavinakayanahalli 2008-03-27 07:59:59 UTC
Masami,
With just kprobes-fix-test.patch the test passes on powerpc which means the asm
directive is working. However, with all the patches applied, I get:

Kprobe smoke test failed: kprobe pre_handler2 not called
Kprobe smoke test failed: jprobe handler2 not called
Kprobe smoke test failed: kretprobe handler2 not called
BUG: Kprobe smoke test: 3 error(s) running handlers

Haven't been able to dig further...
Comment 80 Ananth Mavinakayanahalli 2008-03-27 10:42:39 UTC
Re comment #79, I may have spoken too early. If kprobe_target*() is modified to
use the older get_kprobe() instead, tests pass. It looks like asm() works, but
not all the time :-(

Masami,
batchtest-20080321 that uses function pointers instead works fine. Lets go with
that. Sorry for having led you down the wrong path.
Comment 81 Masami Hiramatsu 2008-04-23 14:19:51 UTC
- kernel patch was already merged to 2.6.25-mm1 (still in the mm tree).
- systemtap patch was committed. 
We can use it on 2.6.25-mm1 with the latest systemtap.

Thanks,