This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] [BZ #15392] Remove fork child pid assertion
- From: Ricky Zhou <rickyz at google dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: Torvald Riegel <triegel at redhat dot com>, libc-alpha at sourceware dot org
- Date: Tue, 18 Nov 2014 17:51:31 -0800
- Subject: Re: [PATCH] [BZ #15392] Remove fork child pid assertion
- Authentication-results: sourceware.org; auth=none
- References: <1416014955-5408-1-git-send-email-rickyz at chromium dot org> <1416220691 dot 4535 dot 385 dot camel at triegel dot csb> <CAJmxDmQDP8N7cLE9MZwB0jFSES=eodx8cAjGwNPZNQT_8mbrFA at mail dot gmail dot com> <1416321134 dot 4535 dot 515 dot camel at triegel dot csb> <546B6DAD dot 8020906 at redhat dot com>
Ah, I see, I didn't know about process-shared mutexes - those
definitely won't work across PID namespaces (and this should
definitely be documented). Do you know of any other places in glibc
where we may be assuming that PIDs are unique between processes that
may be in different PID namespaces? I think it'd be useful to have a
list of these.
I'm a little bit confused about why we'd want to disallow fork after
setns/unshare (since this is likely to be a common pattern that's
already in use). I'm not super familiar with glibc internals/APIs, but
are there many other instances where we rely on the uniqueness of PIDs
between two different processes apart from process-shared
mutexes/condvars/barriers? Would it be possible to document the
specific APIs that won't work across PID namespaces instead of
forbidding fork after setns/unshare with CLONE_NEWPID?
By the way, I am also interested in improving or extending the clone
API as well (let me know if you'd rather split this into a separate
thread). Two ideas that would solve our issues with the current clone
wrapper:
1) Provide an interface to reset the PID cache (this would allow us to
use the syscall directly).
2) Provide an alternate fork-like version of the clone wrapper. This
version would not take a child_stack, and would enforce that CLONE_VM
is not set. Aside from this, it would invalidate the PID cache and
perform the syscall.
Thanks,
Ricky
On Tue, Nov 18, 2014 at 8:02 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 11/18/2014 09:32 AM, Torvald Riegel wrote:
>> On Mon, 2014-11-17 at 14:20 -0800, Ricky Zhou wrote:
>>> On Mon, Nov 17, 2014 at 2:38 AM, Torvald Riegel <triegel@redhat.com> wrote:
>>>> On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote:
>>>>> This assertion is no longer always true, since a forked child may be in
>>>>> a different PID namespace than its parent, and the two namespace may
>>>>> have PID collisions.
>>>>
>>>> If that's true, process-shared synchronization facilities based on the
>>>> thread ID (e.g., mutexes) might not work anymore.
>>>>
>>>> Do we at least need a note in the documentation that this is the case?
>>>> Or do we still need to implement it?
>>>>
>>>> Is there any glibc-internal synchronization that this could break?
>>>>
>>>> I believe we deal with these issues before dropping the assert.
>>> Within a PID namespace, all thread IDs should remain unique, and a
>>> process cannot have threads in different PID namespaces
>>> (http://lxr.free-electrons.com/source/kernel/fork.c#L1217), so I'm
>>> having trouble coming up with a situation where removing this assert
>>> would allow badness to happen down the line.
>>
>> Okay, so private (intra-process) mutexes and such are safe. However,
>> what about process-shared mutexes?
>
> I agree with Torvald, and I have further comments.
>
> All of pthreads expects POSIX requirements, which are that the PIDs
> are stable for the life of the process. If setns is called, then the
> fork'd child violates that invariant and enters undefined behaviour.
>
> The only tenable solution I see is documenting that you must use clone
> to enter the new namespace. The documentation should note that fork
> is forbidden after setns and unshare.
>
> The patch I would like to see is:
>
> * Add minimal stub entry for `setns` and `unshare` to
> manual/threads.texi and mention that after calling this
> function that fork must not be used, and clone is the
> only safe function to call.
>
> If you have problems using clone, then we need to talk about that
> interface. The feedback from lxc has been that they would like not
> to have to worry about the stack requirement in clone(), but we
> haven't done anything to make that easier.
>
> Cheers,
> Carlos.
>
>