This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 3/3] posix: New Linux posix_spawn{p} implementation
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Szabolcs Nagy <szabolcs dot nagy at arm dot com>, libc-alpha at sourceware dot org
- Cc: nd at arm dot com
- Date: Tue, 12 Apr 2016 17:35:19 -0300
- Subject: Re: [PATCH 3/3] posix: New Linux posix_spawn{p} implementation
- Authentication-results: sourceware.org; auth=none
- References: <1456770820-21341-1-git-send-email-adhemerval dot zanella at linaro dot org> <1456770820-21341-4-git-send-email-adhemerval dot zanella at linaro dot org> <570D390D dot 1050708 at arm dot com> <570D3B2D dot 8050505 at linaro dot org> <570D58A8 dot 1080406 at linaro dot org>
On 12-04-2016 17:20, Adhemerval Zanella wrote:
>
>
> On 12-04-2016 15:15, Adhemerval Zanella wrote:
>>
>>
>> On 12-04-2016 15:06, Szabolcs Nagy wrote:
>>> On 29/02/16 18:33, Adhemerval Zanella wrote:
>>>> This patch implements a new posix_spawn{p} implementation for Linux. The main
>>>> difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK
>>>> flags and a direct allocated stack. The new stack and start function solves
>>>> most the vfork limitation (possible parent clobber due stack spilling). The
>>>> remaning issue are related to signal handling:
>>>>
>>>> 1. That no signal handlers must run in child context, to avoid corrupt
>>>> parent's state.
>>>> 2. Child must synchronize with parent to enforce stack deallocation and
>>>> to possible return execv issues.
>>>>
>>>> The first one is solved by blocking all signals in child, even NPTL-internal
>>>> ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation
>>>> in parent and a synchronization with using a pipe or waitpid (in case or error).
>>>> The pipe has the advantage of allowing the child signal an exec error (checked
>>>> with new tst-spawn2 test).
>>>>
>>>> There is an inherent race condition in pipe2 usage for architectures that do not
>>>> support the syscall directly. In such cases the a pipe plus fctnl is used
>>>> instead and it may lead to file descriptor leak in parent (as decribed by fcntl
>>>> documentation).
>>>>
>>>> The child process stack is allocate with a mmap with MAP_STACK flag using
>>>> default architecture stack size. Although it is slower than use a stack buffer
>>>> from parent, it allows some slack for the compatibility code to run scripts
>>>> with no shebang (which may use a buffer with size depending of argument list
>>>> count).
>>>>
>>>> Performance should be similar to the vfork default posix implementation and
>>>> way faster than fork path (vfork on mostly linux ports are basically
>>>> clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls
>>>> required for the stack allocation/deallocation.
>>>>
>>>> It fixes BZ#10354, BZ#14750, and BZ#18433.
>>>>
>>>> Tested on i386, x86_64, powerpc64le, and aarch64.
>>>>
>>>
>>> on aarch64 this caused
>>>
>>> FAIL: nptl/tst-exec1
>>>
>>> with error msg "join in thread in parent returned!?"
>>>
>>> i think the cause is that glibc clone sets
>>> tcb.tid = tcb.pid = -1 if CLONE_VM|CLONE_VFORK
>>> is used, which means the vfork child clobbers
>>> the parent tcb.tid. (and pthread_join tests
>>> tcb.tid and thus fails in tst-exec1.)
>>> i believe all targets are affected.
>>
>> I think the problem is glibc's vfork only changes the TCB's pid fields
>> while fork changes both pid and tid in case on CLONE_VM. Now I am trying
>> to found out why exactly this difference handling of internal fields...
>
> In fact I think clone.S should not reset the tid member in TCB for
> fork(CLONE_VM|...) and it is what vfork does internally. It is
> used on INVALID_TD_P and INVALID_NOT_TERMINATED_TD_P macros used on
> pthread_join and pthread_cancel. A value equal to -1 will issue
> an error for such case, which mean that if a multithread program
> issues a clone(CLONE_VM|...) from a posix_spawn, a subsequent
> pthread_join might fail due invalid tid value.
>
> And I think it is what is happening in your system on tst-exec1:
> although pthread_create is issue before posix_spawn, I think
> posix_spawn clone might be executing before the pthread_join
> in some cases leading to this issue.
>
> I will propose a patch to modify clone to not reset the tid value.
>
But I am not sure if we just should avoid touch any internal TCB
headers if CLONE_VFORK is set, thoughts?