Bug 5042 - procfs probe script not recreating /proc entries in some cases
Status: RESOLVED WORKSFORME
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-09-17 19:15 UTC by Mike Mason
Modified: 2010-01-20 19:01 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
possible fix (420 bytes, patch)
2007-09-17 21:20 UTC, David Smith

Description Mike Mason 2007-09-17 19:15:22 UTC
I just tried out the new procfs probe feature (very cool, BTW) and found a bug.
The bug can be reproduced as follows:

1. Run a procfs probe script that inserts an entry in /proc (e.g.,
/proc/systemtap/stap_d116efa3785a073ecd3b45ae46950a46_72240/foo).

2. CD to /proc/systemtap/stap_d116efa3785a073ecd3b45ae46950a46_72240 in another
xterm.

3. Ctrl-c out of the probe script.  You'll see messages like the following in
/var/log/messages:

Sep 17 11:51:20 localhost kernel: remove_proc_entry:
systemtap/stap_d116efa3785a073ecd3b45ae46950a46_72240 busy, count=1
Sep 17 11:51:20 localhost kernel: remove_proc_entry: /proc/systemtap busy, count=1

...meaning the deletions have been deferred.

4. Run the procfs probe script again, then cd to /proc.  There's no
/proc/systemtap entry, and I don't see any error messages indicating that
/proc/systemtap could not be added.  I had to reboot the system before I could
get the script to add /proc/systemtap again.
Comment 1 Frank Ch. Eigler 2007-09-17 20:18:50 UTC
This is just the sort of thing that the file_operations->owner
field was built for.
Comment 2 David Smith 2007-09-17 21:20:16 UTC
Created attachment 2010
possible fix

Can you try this patch and see what you think?
Comment 3 Mike Mason 2007-09-18 20:30:33 UTC
(In reply to comment #2)
> Created an attachment (id=2010)
> possible fix
> 
> Can you try this patch and see what you think?

The behavior is different with this patch, but still not correct.  Here's what I
did to test: 

1. Run "staprun <module>.ko" in an xterm.  The module creates /proc/systemtap,
/proc/systemtap/<module>, and /proc/systemtap/<module>/<value>.

2. In a second xterm, cd to /proc/systemtap/<module>.

3. Ctrl-c out of staprun.  The deferred deletion messages appear in
/var/log/messages as before, but staprun doesn't exit yet.

4. cd ../ in the second xterm.

5. The <module> directory is now deleted and staprun exits.  From a third xterm,
do "ls /proc".  /proc/systemtap does not appear.  However, pwd in the second
xterm indicates it's still there.  It must be in some interim state (deleted,
but not completely).

6. Reload the module again with "staprun <module>.ko".  path_lookup() finds the
existing /proc/systemtap and, thus, doesn't recreate it.  The module just links
<module> and <module>/<value> off of the existing /proc/systemtap.

7. /proc/systemtap can only be accessed in the second xterm.  As soon as I cd
out of /proc/systemtap, it's no longer accessible from a shell even though it
exists in some form.

One solution might be to simply not delete /proc/systemtap.  Let the first
module create it, then leave it around for other modules to use even if the
first module is removed.
  

Comment 4 Martin Hunt 2007-09-19 13:04:27 UTC
Checked in a fix for this.

I fixed several related problems, all involving directories that were awaiting
deletion but were getting reused.  I applied the attached ownership patch
because it has the effect of making the deletion finish before the module gets
removed.  I had thought this was annoying, but it removes problems caused by
running the same script again while the original script's path elements were
all marked as awaiting deletion.

The problem reported in this BZ was caused when /proc/systemtap was marked as
awaiting deletion while new scripts kept reusing it.  path_lookup() could see
it, while "ls" could not.
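
A toy user-space model may make this symptom easier to picture.  It is only a
sketch: ProcEntry, remove_entry(), listing() and lookup() are hypothetical
names invented for illustration, and the way the kernel really tracks busy
/proc entries is not modeled here.  It only shows how an entry marked as
awaiting deletion could be found by a lookup yet be missing from a listing.

```python
class ProcEntry:
    """Toy stand-in for a /proc directory entry (not a kernel structure)."""

    def __init__(self, name):
        self.name = name
        self.users = 0          # e.g. shells whose cwd is this directory
        self.deleted = False    # set when removal is requested while busy
        self.children = {}


def remove_entry(parent, name):
    """Request removal; defer it if the entry is still in use."""
    entry = parent.children[name]
    if entry.users > 0:
        entry.deleted = True    # awaiting deletion, but still allocated
    else:
        del parent.children[name]


def listing(parent):
    """What 'ls' would show in this model: entries not awaiting deletion."""
    return sorted(n for n, e in parent.children.items() if not e.deleted)


def lookup(parent, name):
    """What a path lookup would find in this model: any allocated entry."""
    return parent.children.get(name)


proc = ProcEntry("proc")
proc.children["systemtap"] = ProcEntry("systemtap")
proc.children["systemtap"].users = 1     # second xterm has cd'ed into it

remove_entry(proc, "systemtap")          # first script exits: removal deferred

# A second script looks for the directory before deciding to create it.
found = lookup(proc, "systemtap")
visible = listing(proc)
print(found is not None, visible)        # lookup succeeds, listing is empty
```

In this model the second script sees an existing directory and reuses it, so
it never creates a fresh, visible /proc/systemtap.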

There was no easy fix for this problem.  So the new behavior is for
/proc/systemtap to never be in a deferred deletion state.  If it is in use when
a module exits, it will not be deleted.  The next time a systemtap module
exits, it will be deleted if it is no longer in use.
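
The policy described above can be sketched as a small user-space model.  This
is not the runtime code that was checked in: SystemtapDir and its methods are
hypothetical names, "in use" is reduced to a simple counter, and the case of
several modules sharing the directory at the same time is left out.

```python
class SystemtapDir:
    """Toy model of the shared /proc/systemtap directory (hypothetical)."""

    def __init__(self):
        self.exists = False
        self.users = 0  # e.g. shells with their cwd inside the directory

    def module_start(self):
        # The first module to start creates the directory; later ones reuse it.
        self.exists = True

    def module_exit(self):
        # Never defer: delete only if nobody is using the directory right now.
        if self.exists and self.users == 0:
            self.exists = False


d = SystemtapDir()
d.module_start()
d.users = 1          # a shell cds into /proc/systemtap
d.module_exit()      # in use: left alone, not marked for deletion
still_there = d.exists

d.module_start()     # a second script reuses the same, still visible directory
d.users = 0          # the shell leaves
d.module_exit()      # not in use: deleted now
gone = not d.exists
```

Because the directory is either fully present or fully removed in this model,
a later script never finds a half-deleted /proc/systemtap to reuse.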

Comment 5 David Smith 2008-04-22 17:38:14 UTC
(In reply to comment #4)
> Checked in a fix for this.

Unfortunately, Fedora 8 x86 kernels don't really like this fix.  If you run a
systemtap script that uses procfs, then immediately kill it (no procfs
reads/writes are necessary), you will get the new warning about removal being
deferred.  Under Fedora 8, I'm not sure you can use procfs at all without
getting the warning.  Under RHEL5 x86_64, I don't see the warning.

(I'm not sure how much of a problem getting the warning is, but it is causing a
spurious test failure for systemtap.base/procfs.exp.)

The following script demonstrates getting the warning when we shouldn't.

# stap -e 'probe procfs("command").read { $value = "100" }' -m foo
Warning: using '-m' disables cache support.
[interrupt script here]
WARNING: Removal of /proc/systemtap/foo
is deferred until it is no longer in use.
Systemtap module removal will block.
Comment 6 David Smith 2010-01-20 19:01:45 UTC
I'm changing this one to WORKSFORME, since I can't duplicate it any more on
kernels 2.6.18-168.el5 or 2.6.32.3-21.fc13.x86_64.