This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] [BZ #15392] Remove fork child pid assertion
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Ricky Zhou <rickyz at google dot com>
- Cc: Torvald Riegel <triegel at redhat dot com>, libc-alpha at sourceware dot org
- Date: Tue, 18 Nov 2014 22:57:36 -0500
- Subject: Re: [PATCH] [BZ #15392] Remove fork child pid assertion
- References: <1416014955-5408-1-git-send-email-rickyz at chromium dot org> <1416220691 dot 4535 dot 385 dot camel at triegel dot csb> <CAJmxDmQDP8N7cLE9MZwB0jFSES=eodx8cAjGwNPZNQT_8mbrFA at mail dot gmail dot com> <1416321134 dot 4535 dot 515 dot camel at triegel dot csb> <546B6DAD dot 8020906 at redhat dot com> <CAJmxDmQd3KdYf2rd_=Tv-_gCBkbmeifBABCGTrz=NJb1hvr6=A at mail dot gmail dot com>
On 11/18/2014 08:51 PM, Ricky Zhou wrote:
> Ah, I see, I didn't know about process-shared mutexes - those
> definitely won't work across PID namespaces (and this should
> definitely be documented). Do you know of any other places in glibc
> where we may be assuming that PIDs are unique between processes that
> may be in different PID namespaces? I think it'd be useful to have a
> list of these.
POSIX requires PIDs to be unique, so anything in the implementation may
rely on that. Documenting the specific functions that depend on it would
limit the implementation's flexibility.
PIDs remain unique *before* you enter the namespace and again *after*
you have entered it. The only problem is the transition itself, as seen
by the child.
We should clearly define *how* you transition and limit that transition
to a simple set of APIs.
> I'm a little bit confused about why we'd want to disallow fork after
> setns/unshare (since this is likely to be a common pattern that's
> already in use). I'm not super familiar with glibc internals/APIs, but
> are there many other instances where we rely on the uniqueness of PIDs
> between two different processes apart from process-shared
> mutexes/condvars/barriers? Would it be possible to document the
> specific APIs that won't work across PID namespaces instead of
> forbidding fork after setns/unshare with CLONE_NEWPID?
You are asking to support a POSIX API with a set of semantics that
violate the contract of that API. I would like to avoid this.
Using clone is fine, since clone has none of the POSIX requirements
that fork or vfork have.
I would like to see one well-documented way to transition to the child
safely, after which you should be able to continue calling libc functions,
including fork and vfork.
Today, the easiest solution is:
"After setns/unshare you must call clone, followed immediately by exec."
The exec starts the process over again in the new namespace and everything
works fine. The problem is that this is resource intensive and not what
users want.
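A minimal sketch of that rule, assuming Linux and glibc's clone wrapper.
The helper names (clone_then_exec, spawn_in_new_pid_namespace) and the
64 KiB child stack are illustrative assumptions, not existing glibc
interfaces. The child does nothing but exec, so it never runs libc code
that might depend on PID state inherited from the parent.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

/* Child entry point: exec immediately and do nothing else. */
static int exec_child(void *arg)
{
    char *const *argv = arg;
    execv(argv[0], argv);
    _exit(127);  /* reached only if execv failed */
}

/* Hypothetical helper: clone a child that immediately execs argv[0],
   then wait for it. Returns the child's exit status, or -1 on error. */
int clone_then_exec(char *const argv[])
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* CLONE_VM is not set, so the child gets its own copy of the address
       space. The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(exec_child, stack + CHILD_STACK_SIZE, SIGCHLD,
                      (void *)argv);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    free(stack);

    if (r == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical helper: unshare the PID namespace, then clone + exec.
   unshare(CLONE_NEWPID) normally requires CAP_SYS_ADMIN. The caller stays
   in its old namespace; the first child created becomes PID 1 in the new one. */
int spawn_in_new_pid_namespace(char *const argv[])
{
    if (unshare(CLONE_NEWPID) == -1)
        return -1;
    return clone_then_exec(argv);
}
```

With sufficient privileges, calling spawn_in_new_pid_namespace with an
argv such as { "/bin/sh", "-c", "echo $$", NULL } should run the shell as
PID 1 of the new namespace; without them, unshare fails and the helper
returns -1.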
> By the way, I am also interested in improving or extending the clone
> API as well (let me know if you'd rather split this into a separate
> thread). Two ideas that would solve our issues with the current clone
> wrapper:
Then let us go in this direction, make clone better, and recommend
it as the only way to create a new process after setns/unshare.
> 1) Provide an interface to reset the PID cache (this would allow us to
> use the syscall directly).
Why do you need this if you have a version of clone that resets it for you?
> 2) Provide an alternate fork-like version of the clone wrapper. This
> version would not take a child_stack, and would enforce that CLONE_VM
> is not set. Aside from this, it would invalidate the PID cache and
> perform the syscall.
I think a new fork-like clone wrapper should be acceptable. We'll have
to iterate on the design a bit, but I don't object to the idea.
We could document it as being the one safe way to create a new process
in the new namespace.
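To make the proposal concrete, here is a rough sketch of what a fork-like
wrapper might look like from user space. The names forklike_clone and
uncached_getpid are hypothetical, and this is not a glibc design. It
assumes an architecture such as x86-64 where the raw clone syscall takes
the flags as its first argument and the stack as its second. It cannot
reset glibc's internal PID cache from outside the library, so it reads the
PID with a direct syscall instead, and it skips everything glibc's fork
does around the syscall (atfork handlers, lock resets).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Flags this sketch refuses: CLONE_VM as proposed in the thread, plus
   flags that need pointer arguments the sketch does not pass
   (an extra restriction assumed here, not taken from the thread). */
#define FORKLIKE_REJECTED_FLAGS \
    (CLONE_VM | CLONE_SETTLS | CLONE_PARENT_SETTID | \
     CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID)

/* Hypothetical fork-like clone: takes no child stack and no entry function.
   Returns the child's PID in the parent, 0 in the child, or -1 with errno
   set. Assumes the (flags, stack, ...) raw syscall argument order. */
pid_t forklike_clone(unsigned long flags)
{
    if (flags & FORKLIKE_REJECTED_FLAGS) {
        errno = EINVAL;
        return -1;
    }
    /* A null stack makes the child continue on a copy of the parent's
       stack, as fork does. */
    return (pid_t)syscall(SYS_clone, flags, 0UL, 0UL, 0UL, 0UL);
}

/* Ask the kernel for the PID directly, bypassing any value cached
   by the C library. */
pid_t uncached_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

A caller would pass the exit signal along with any namespace flags, for
example forklike_clone(SIGCHLD) for plain fork-like behaviour, and then
branch on the return value the same way as after fork.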
Cheers,
Carlos.