This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug nptl/15392] New: Linux, setns, PID namespaces, fork: assert() about pid inequality hit sporadically


http://sourceware.org/bugzilla/show_bug.cgi?id=15392

             Bug #: 15392
           Summary: Linux, setns, PID namespaces, fork: assert() about pid
                    inequality hit sporadically
           Product: glibc
           Version: 2.18
            Status: NEW
          Severity: minor
          Priority: P2
         Component: nptl
        AssignedTo: unassigned@sourceware.org
        ReportedBy: christian@iwakd.de
                CC: drepper.fsp@gmail.com
    Classification: Unclassified


In nptl/sysdeps/unix/sysv/linux/fork.c, once the kernel syscall has succeeded,
the following assertion is made (line 141 in the master branch):

      assert (THREAD_GETMEM (self, tid) != ppid);

There are two issues with that assertion:

 1. ppid is only defined when NDEBUG is not defined, so from reading the
    source it appears that compiling with -DNDEBUG cannot work. I have not
    tested this; I am only noting it while I am at it. (The same applies
    to the other assertion in the sibling conditional branch.)

 2. More importantly, the assertion is not always true. It is nearly always
    true, but there is one specific exception to this, namely in conjunction
    with PID namespaces:

If one wants to enter a PID namespace (Linux kernel 3.8+, or earlier kernels
with backported patches), one can call setns(open("/proc/12345/ns/pid",
O_RDONLY), CLONE_NEWPID) to attach the current process to that PID namespace.
However, the kernel does not actually move the current process into that
namespace; setns() only causes child processes created afterwards to be in the
selected PID namespace.
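
A minimal sketch of that call sequence follows. The helper name
enter_pid_namespace_of is hypothetical, the caller is assumed to have the
privileges setns() requires, and error handling is reduced to a return code.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: attach the calling process to the PID namespace of
   the process `target`.  Returns 0 on success and -1 on failure, with errno
   set by open() or setns(). */
int enter_pid_namespace_of(pid_t target)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/ns/pid", (long)target);

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    /* The caller keeps its own PID; only children created after this call
       become members of the selected namespace. */
    int rc = setns(fd, CLONE_NEWPID);

    /* Close the descriptor without losing the errno from setns(). */
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return rc;
}
```

A caller would typically create a child right after a successful return,
since only that child actually runs inside the namespace.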

The common way of handling that is to fork() immediately after entering a PID
namespace. But that allows for the following situation (the numbers are
examples):

  Process    PID outside (root PID ns)    PID inside attached PID ns
 ---------------------------------------------------------------------
  parent     42                           -
  child      108                          42

In that case, the comparison pid before fork != pid after fork holds true if
one compares pids within the same namespace - but the ppid variable is gathered
from the outside namespace while the current pid is gathered from the inside
namespace, so in the above example they are equal, even though they refer to
different processes.
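
The mismatch can be modelled in a few lines of plain C. This is only an
illustration of the comparison, not the glibc code: the struct, the function
names and the example constants are made up here, with the numbers taken from
the example above.

```c
#include <stdbool.h>
#include <sys/types.h>

/* One process as seen from two PID namespaces (illustrative model). */
struct pid_view {
    pid_t outer;  /* PID in the root PID namespace */
    pid_t inner;  /* PID in the attached namespace, 0 if not a member */
};

/* The example values used in this report. */
const struct pid_view example_parent = { 42, 0 };
const struct pid_view example_child  = { 108, 42 };

/* Comparing PIDs taken from the same namespace: always unequal for two
   distinct processes. */
bool same_namespace_check(struct pid_view parent, struct pid_view child)
{
    return child.outer != parent.outer;
}

/* The shape of the comparison described here: the parent's PID was saved
   before fork() (outer view), while the child reads its own PID from inside
   the new namespace (inner view). */
bool cross_namespace_check(struct pid_view parent, struct pid_view child)
{
    return child.inner != parent.outer;
}
```

With the example values, same_namespace_check() is true while
cross_namespace_check() is false, which is the case in which the assertion
fires.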

For the parent process, kernel calls to fork()/clone() will return the pid
outside of the namespace (108 in this example), so waitpid() etc. work without
a problem.

Since PIDs are assigned semi-randomly, this situation is hard to reproduce.
Attaching to namespaces works the vast majority of the time, but it fails
whenever the above coincidence happens.

This affects the nsenter utility from util-linux and also the lxc-attach
utility from the lxc package - and possibly more. They will work most of the
time but in some rare cases they will fail needlessly, stumbling over the
assertion in fork().

The obvious solution is simply to use clone() after setns() and never fork(),
and one can certainly patch both programs to do so. Nevertheless, it would be
nice if fork() also worked after setns(), especially since there is no
inherent reason for it not to.
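
For reference, a rough sketch of such a clone()-based replacement for fork()
is shown below. The helper names and the stack size are assumptions made for
this example, and it assumes an architecture where the stack grows downwards.
It is not taken from nsenter or lxc-attach.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (1024 * 1024)  /* arbitrary size for this sketch */

/* Example child function: the value returned becomes the exit status. */
static int child_exit_seven(void *arg)
{
    (void)arg;
    return 7;
}

/* Run fn(arg) in a child created with clone() instead of fork(), wait for
   it, and return its exit status, or -1 on any error. */
int run_in_cloned_child(int (*fn)(void *), void *arg)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Pass the top of the buffer because the stack grows downwards.
       SIGCHLD makes the child behave like a fork() child for waitpid(). */
    pid_t child = clone(fn, stack + CHILD_STACK_SIZE, SIGCHLD, arg);

    /* Without CLONE_VM the child has its own copy of the address space,
       so the parent can release its copy of the stack right away. */
    free(stack);
    if (child < 0)
        return -1;

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called as run_in_cloned_child(child_exit_seven, NULL) right after setns(),
this creates the child without going through the fork() wrapper and its
assertion.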

I see four possible ways to proceed:

 a) Remove the assertion altogether

 b) Also provide a wrapper for setns() that sets a global flag if a PID
    namespace was entered. If so, skip the assertion, otherwise keep it.

 c) assert that
       EITHER old pid and new pid are unequal (current assertion)
       OR getppid() returns 0 (that is the case when setns was called
                               in the parent process before fork())

 d) Say that this is expected behavior and document that after setns()
    one should only do clone() and never fork().
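
To make option c) concrete, the relaxed condition could look roughly like the
following. This is a sketch of the logic only; the function name and
parameters are hypothetical and do not correspond to existing glibc
internals.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Sketch of the condition from option c), evaluated in the child:
     child_tid        - the child's own PID after fork()
     saved_parent_tid - the parent's PID saved before fork()
     child_getppid    - what getppid() returns in the child
   The check passes if the PIDs differ (the current assertion) or if
   getppid() is 0, which is the case when the parent called setns() before
   fork() and is therefore not visible in the child's namespace. */
bool relaxed_fork_check(pid_t child_tid, pid_t saved_parent_tid,
                        pid_t child_getppid)
{
    return child_tid != saved_parent_tid || child_getppid == 0;
}
```

With the example numbers, relaxed_fork_check(42, 42, 0) passes, while equal
PIDs together with a non-zero getppid() would still be rejected.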

