This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/15392] New: Linux, setns, PID namespaces, fork: assert() about pid inequality hit sporadically
- From: "christian at iwakd dot de" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Tue, 23 Apr 2013 18:16:16 +0000
- Subject: [Bug nptl/15392] New: Linux, setns, PID namespaces, fork: assert() about pid inequality hit sporadically
- Auto-submitted: auto-generated
http://sourceware.org/bugzilla/show_bug.cgi?id=15392
Bug #: 15392
Summary: Linux, setns, PID namespaces, fork: assert() about pid
inequality hit sporadically
Product: glibc
Version: 2.18
Status: NEW
Severity: minor
Priority: P2
Component: nptl
AssignedTo: unassigned@sourceware.org
ReportedBy: christian@iwakd.de
CC: drepper.fsp@gmail.com
Classification: Unclassified
In nptl/sysdeps/unix/sysv/linux/fork.c, as soon as the kernel syscall works,
the following assertion is made: (line 141 in master branch)
assert (THREAD_GETMEM (self, tid) != ppid);
There are two issues with that assertion:
1. Since ppid is only defined if NDEBUG is not defined, from reading the
source it appears to be that compiling with -DNDEBUG will never work.
I didn't test that, just wanted to note that since I'm at it anyway.
(This also applies to the other assertion in the sibling conditional
branch)
2. More importantly, the assertion is not always true. It is nearly always
true, but there is one specific exception to this, namely in conjunction
with PID namespaces:
If one wants to enter a pid namespace (Linux kernel 3.8+ or previous kernels
with backported patches), one can use the setns(open("/proc/12345/ns/pid",
O_RDONLY), CLONE_NEWPID) syscall to attach the current process to that pid
namesapce. But the kernel doesn't really move the current process into that pid
namespace, setns() will only cause the child processes to actually be in the
selected pid namespace.
The common way of handling that is to immediately fork() after entering a pid
namespace. But that allows for the following situation (number are examples):
Process PID outside (i.e. in root pid ns) PID in attached to pid ns
------------------------------------------------------------------------------
parent 42 -
child 108 42
In that case, the comparison pid before fork != pid after fork holds true if
one compares pids within the same namespace - but the ppid variable is gathered
from the outside namespace while the current pid is gathered from the inside
namespace, so in the above example they are equal, even though they refer to
different processes.
For the parent process, kernel calls to fork()/clone() will return the pid
outside of the namespace (108 in this example), so waitpid() etc. work without
a problem.
Since PIDs are assigned semi-randomly, this situation is hard to reproduce,
attaching to namespaces will work the vast majority of times, but it might fail
if the above coincidence happens.
This affects the nsenter utility from util-linux and also the lxc-attach
utility from the lxc package - and possibly more. They will work most of the
time but in some rare cases they will fail needlessly, stumbling over the
assertion in fork().
The obvious solution is just to use clone() after setns() and never use fork()
- and one can certainly patch both programs to do so. Nevertheless it would be
nice to see if fork() also worked after setns(), especially since there is no
inherent reason for it not to.
I see four possible ways to proceed:
a) Remove the assertion altogether
b) Also provide a wrapper for setns() that sets a global flag if a PID
namespace was entered. If so, skip the assertion, otherwise keep it.
c) assert that
EITHER old pid and new pid are unequal (current assertion)
OR getppid() returns 0 (that is the case when setns was called
in the parent process before fork())
d) Say that this is expected behavior and document that after setns()
one should only do clone() and never fork().
--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.