In nptl/sysdeps/unix/sysv/linux/fork.c, as soon as the kernel syscall has succeeded, the following assertion is made (line 141 in the master branch):

    assert (THREAD_GETMEM (self, tid) != ppid);

There are two issues with that assertion:

1. Since ppid is only defined if NDEBUG is not defined, from reading the source it appears that compiling with -DNDEBUG will never work. I didn't test that; I just wanted to note it while I'm at it. (This also applies to the other assertion in the sibling conditional branch.)

2. More importantly, the assertion is not always true. It is nearly always true, but there is one specific exception, namely in conjunction with PID namespaces.

If one wants to enter a pid namespace (Linux kernel 3.8+ or previous kernels with backported patches), one can use the

    setns(open("/proc/12345/ns/pid", O_RDONLY), CLONE_NEWPID)

syscall to attach the current process to that pid namespace. But the kernel doesn't really move the current process into that pid namespace; setns() only causes the child processes to actually be in the selected pid namespace. The common way of handling that is to fork() immediately after entering a pid namespace. But that allows for the following situation (the numbers are examples):

Process    PID outside (i.e. in root pid ns)    PID in attached-to pid ns
------------------------------------------------------------------------------
parent     42                                   -
child      108                                  42

In that case, the comparison "pid before fork != pid after fork" holds true if one compares pids within the same namespace - but the ppid variable is gathered from the outside namespace while the current pid is gathered from the inside namespace, so in the above example they are equal, even though they refer to different processes.

For the parent process, kernel calls to fork()/clone() will return the pid outside of the namespace (108 in this example), so waitpid() etc. work without a problem.
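For illustration, here is a minimal sketch of the setns()-then-fork() pattern described above. The helper name run_in_pid_namespace and its return convention are made up for this example; it is not code taken from nsenter or lxc-attach, and it needs sufficient privileges to attach to a real namespace.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attach to the pid namespace referred to by ns_path,
   then fork so that the child actually lives inside it.
   Returns the child's exit status, or -1 on any failure (including the
   child being killed by a signal, e.g. SIGABRT from a failed assertion).  */
int
run_in_pid_namespace (const char *ns_path)
{
  int fd = open (ns_path, O_RDONLY);
  if (fd < 0)
    return -1;

  /* Only future children are placed in the namespace; the caller stays
     where it is.  */
  int rc = setns (fd, CLONE_NEWPID);
  close (fd);
  if (rc != 0)
    return -1;

  pid_t child = fork ();
  if (child < 0)
    return -1;
  if (child == 0)
    {
      /* getpid () here reports the pid as seen inside the new namespace,
         which may coincide with the parent's pid in the outer one.  */
      _exit (0);
    }

  /* The value returned by fork () is the child's pid in the parent's
     namespace, so waitpid () works as usual.  */
  int status;
  if (waitpid (child, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller would pass something like "/proc/12345/ns/pid"; on an affected glibc the rare pid coincidence would show up here as a -1 return because the child aborts.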
Since PIDs are assigned semi-randomly, this situation is hard to reproduce. Attaching to namespaces will work the vast majority of times, but it might fail if the above coincidence happens.

This affects the nsenter utility from util-linux and also the lxc-attach utility from the lxc package - and possibly more. They will work most of the time, but in some rare cases they will fail needlessly, stumbling over the assertion in fork().

The obvious solution is just to use clone() after setns() and never use fork() - and one can certainly patch both programs to do so. Nevertheless, it would be nice if fork() also worked after setns(), especially since there is no inherent reason for it not to.

I see four possible ways to proceed:

a) Remove the assertion altogether.

b) Also provide a wrapper for setns() that sets a global flag if a PID namespace was entered. If so, skip the assertion, otherwise keep it.

c) Assert that EITHER old pid and new pid are unequal (current assertion) OR getppid() returns 0 (that is the case when setns() was called in the parent process before fork()).

d) Say that this is expected behavior and document that after setns() one should only do clone() and never fork().
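To make option (c) concrete, the relaxed condition could look roughly like the sketch below. The function name fork_pids_consistent and its parameters are hypothetical; this is not the change that was eventually made in glibc, just the predicate as described in the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical predicate for option (c).
   parent_tid:  tid recorded in the parent before the fork
   child_tid:   tid observed in the child after the fork
   child_ppid:  result of getppid () in the child; 0 means the parent
                lives in a different pid namespace.  */
bool
fork_pids_consistent (pid_t parent_tid, pid_t child_tid, pid_t child_ppid)
{
  return child_tid != parent_tid || child_ppid == 0;
}
```

In the child branch this would be used along the lines of assert (fork_pids_consistent (ppid, getpid (), getppid ())). With the example numbers from the table, the child sees tid 42 and getppid () == 0, so the check passes even though the tids are equal.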
If NDEBUG is defined then assert expands to nothing.
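A small stand-alone sketch of why that makes the -DNDEBUG build compile; debug_only_check is a made-up function that only mimics the shape of the code in fork.c (a variable declared under #ifndef NDEBUG and referenced solely inside assert).

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical stand-in for the fork.c pattern: ppid exists only in
   debug builds and is referenced only inside assert ().  */
pid_t
debug_only_check (pid_t tid_before_fork)
{
#ifndef NDEBUG
  pid_t ppid = tid_before_fork;
#endif
  /* Pretend the child received the next tid.  */
  pid_t tid_after_fork = tid_before_fork + 1;

  /* With -DNDEBUG the macro expands to ((void) 0), so the compiler
     never sees the reference to the undeclared ppid.  */
  assert (tid_after_fork != ppid);
  return tid_after_fork;
}
```

The same file should build both with and without -DNDEBUG, which is the behavior the comment above relies on.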
Created attachment 7938
Double unshare CLONE_NEWPID example

I've run into this assertion as well. I've attached a small test program that trips it using unshare(CLONE_NEWPID). In this case, the issue is that the parent process has pid 1 inside its pid namespace, but since the child is created in a new pid namespace, it also has pid 1.

I will try to send a patch to get rid of the assert.
Synchronization facilities such as some of the PThreads mutexes rely on thread IDs being unique. I believe we should deal with this first, and drop the assertion afterwards.
Fixed in glibc 2.25.
*** Bug 21386 has been marked as a duplicate of this bug. ***