Bug 9788 - permissions error in staprun
Summary: permissions error in staprun
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: David Smith
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-25 02:44 UTC by Eugeniy Meshcheryakov
Modified: 2016-03-15 18:53 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
test program 1 (640 bytes, text/plain)
2009-01-29 17:17 UTC, David Smith
test program 2 source (632 bytes, text/plain)
2009-01-29 17:18 UTC, David Smith
test programs makefile (161 bytes, text/plain)
2009-01-29 17:18 UTC, David Smith

Description Eugeniy Meshcheryakov 2009-01-25 02:44:26 UTC
There is a strange error in the 20090124 snapshot:

% uname -a
Linux loki 2.6.29-rc2 #1 SMP PREEMPT Sun Jan 18 18:40:46 CET 2009 x86_64 GNU/Linux
% id
uid=1000(eugen) gid=1000(eugen) groups=0(root),20(dialout),24(cdrom),25(floppy),29(audio),30(dip),44(video),46(plugdev),116(stapdev),1000(eugen)
% ls -l /usr/bin/staprun
-rwsr-xr-x 1 root root 31752 січ 25 03:10 /usr/bin/staprun
% ./helloworld.stp -v
Pass 1: parsed user script and 47 library script(s) in 280usr/0sys/304real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) in 10usr/0sys/4real ms.
Pass 3: using cached /home/eugen/.systemtap/cache/11/stap_11c0f8dddd8437f12d5b2ecdd542a4fd_325.c
Pass 4: using cached /home/eugen/.systemtap/cache/11/stap_11c0f8dddd8437f12d5b2ecdd542a4fd_325.ko
Pass 5: starting run.
Error inserting module '/tmp/stapJa73Uz/stap_11c0f8dddd8437f12d5b2ecdd542a4fd_325.ko': File exists
Retrying, after attempted removal of module stap_11c0f8dddd8437f12d5b2ecdd542a4fd_325 (rc 0)
hello world
ERROR: The effective user ID of staprun must be set to the root user.
  Check permissions on staprun and ensure it is a setuid root program.
Pass 5: run completed in 0usr/10sys/122real ms.
Pass 5: run failed.  Try again with another '--vp 00001' option.

Here staprun is setuid root. stap is able to run staprun, which removes the old
module and loads the new one (the script prints "hello world"). After that,
however, staprun complains that it is not setuid root and cannot remove the
module.

Everything works fine when run under root.

I do not remember seeing anything similar with the 20090117 snapshot and the 2.6.28 kernel.
Comment 1 Eugeniy Meshcheryakov 2009-01-25 03:05:26 UTC
After adding some debug prints to staprun, I get:

uid = 1000, euid = 0, pid = 20241
hello world
uid = 1000, euid = 1000, pid = 20241
ERROR: The effective user ID of staprun must be set to the root user.
  Check permissions on staprun and ensure it is a setuid root program.
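
For anyone wanting to reproduce the output above, here is a minimal sketch of such a debug print. It is not the actual change made to staprun; the names format_ids and print_ids are hypothetical, and only the output format is taken from the lines above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the staprun sources): format the real UID,
 * effective UID, and PID in the same shape as the debug output above.
 * Returns the snprintf result, i.e. the length that was (or would be) written. */
int format_ids(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "uid = %ld, euid = %ld, pid = %ld",
                    (long)getuid(), (long)geteuid(), (long)getpid());
}

/* Print the IDs to stderr. Calling this before and after the suspect step
 * shows whether the effective UID changed while the PID stayed the same. */
void print_ids(void)
{
    char buf[96];

    format_ids(buf, sizeof buf);
    fprintf(stderr, "%s\n", buf);
}
```

Calling print_ids() once at startup and once just before the module removal would be enough to show the effective UID dropping from 0 to 1000 within the same PID.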
Comment 2 Frank Ch. Eigler 2009-01-25 20:00:26 UTC
We believe this is a recent regression in the kernel, possibly
related to the user-credential patches to task_struct.
Comment 3 David Smith 2009-01-27 14:35:46 UTC
According to a git bisect of the kernel that I just finished, the regression is
caused by the following kernel change:

commit d84f4f992cbd76e8f39c488cf0c5d123843923b1
Author: David Howells <dhowells@redhat.com>
Date:   Fri Nov 14 10:39:23 2008 +1100

    CRED: Inaugurate COW credentials
    

Comment 4 David Smith 2009-01-28 18:12:48 UTC
I've added a workaround for this bug in commit 69aa1bd.  Originally, staprun
exec'ed stapio, which in turn exec'ed staprun when it was time to remove the
module.  Now staprun execs stapio, which forks when it is time to remove the
module.  The new child execs staprun.  The parent (stapio) waits for the child
to finish, then exits.
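
The fork/exec/wait pattern described above can be sketched roughly as follows. This is an illustration only, not the code from commit 69aa1bd; run_child_and_wait is a hypothetical name, and the arguments stapio actually passes to staprun are not shown in this report.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `file` in the child, and wait for it in
 * the parent. Returns the child's exit status, 127 if the exec failed,
 * or -1 if fork/waitpid failed or the child was killed by a signal. */
int run_child_and_wait(const char *file, char *const argv[])
{
    int status;
    pid_t pid;

    assert(file != NULL && argv != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the target program. */
        execvp(file, argv);
        _exit(127);             /* only reached if the exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the workaround, the parent would presumably call something like this with staprun as the program and then exit with the returned status. Because the exec happens in a freshly forked child, the setuid program is started from a new process rather than from the one that had created a second thread.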
Comment 5 David Smith 2009-01-29 17:17:25 UTC
Created attachment 3697
test program 1
Comment 6 David Smith 2009-01-29 17:18:01 UTC
Created attachment 3698
test program 2 source
Comment 7 David Smith 2009-01-29 17:18:30 UTC
Created attachment 3699
test programs makefile
Comment 8 David Smith 2009-01-29 17:21:11 UTC
I've attached 2 small test programs and a Makefile that demonstrate this
problem.  While developing these test programs, I discovered that the setuid
bit fails to take effect only when the 2nd program (stapio for systemtap,
test2.c for the small test programs) creates a second thread.
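
The attachments themselves are not reproduced in this report. As a rough sketch of what the second program's trigger might look like, assuming it simply starts one extra thread and then execs back into the setuid first program (the function names below are hypothetical, and joining the thread before the exec is also an assumption):

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Thread body: does nothing. Per the comment above, merely having created
 * a second thread is what triggers the problem. */
static void *idle_thread(void *arg)
{
    (void)arg;
    return NULL;
}

/* Create and join one extra thread.
 * Returns 0 on success, or the pthread error number on failure. */
int spawn_second_thread(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, idle_thread, NULL);

    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}

/* Hypothetical stand-in for the second test program: start a thread, then
 * exec `file` (the setuid first program). Returns only on failure. */
int thread_then_exec(const char *file, char *const argv[])
{
    int rc;

    assert(file != NULL && argv != NULL);

    rc = spawn_second_thread();
    if (rc != 0)
        return rc;

    execvp(file, argv);
    perror("execvp");           /* only reached if the exec failed */
    return -1;
}
```

To observe the bug, the exec'ed program would need to be installed setuid root and would need to print its effective UID on startup; on an affected kernel it would then report the caller's UID rather than 0. Older toolchains may need -pthread when building.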
Comment 9 Ananth Mavinakayanahalli 2009-01-30 02:16:15 UTC
It's time the problem was brought to lkml's notice rather than worked around in
SystemTap -- this clearly looks like a regression, unless SystemTap was
depending on the feature's buggy behaviour earlier.
Comment 10 David Smith 2009-01-30 14:39:45 UTC
(In reply to comment #9)
> It's time the problem was brought to lkml's notice rather than worked around in
> SystemTap -- this clearly looks like a regression, unless SystemTap was
> depending on the feature's buggy behaviour earlier.

I agree - I've filed Red Hat Bugzilla #481783 against this and sent a message to
lkml (<http://lkml.indiana.edu/hypermail/linux/kernel/0901.3/02268.html>) with
the test programs included here.
Comment 11 David Smith 2009-02-04 18:52:14 UTC
This is also being tracked as Linux kernel bug 12602
<http://bugzilla.kernel.org/show_bug.cgi?id=12602>
Comment 12 David Smith 2009-02-10 19:57:59 UTC
David Howells has posted a patch upstream that fixes this problem.

I've verified that this works correctly on 2.6.29-0.99.rc4.git1.fc11.x86_64.

I'll wait a week or so and remove the workaround.
Comment 13 Frank Ch. Eigler 2009-02-10 20:01:33 UTC
I suggest waiting till the next major release of systemtap.
The workaround present is costing us nothing.
Comment 14 Frank Ch. Eigler 2009-03-04 15:39:56 UTC
The kernel bug has been fixed, and our backup workaround is in place.
Comment 15 Frank Ch. Eigler 2016-03-15 18:53:51 UTC
from comment #13:
> The workaround present is costing us nothing.

... it turns out the workaround permits a race condition between stapio exiting (thus releasing the .cmd fd) and "staprun -d" starting (and trying to open the .cmd fd).