This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 3/3] posix: New Linux posix_spawn{p} implementation

On 12-04-2016 17:20, Adhemerval Zanella wrote:
> On 12-04-2016 15:15, Adhemerval Zanella wrote:
>> On 12-04-2016 15:06, Szabolcs Nagy wrote:
>>> On 29/02/16 18:33, Adhemerval Zanella wrote:
>>>> This patch implements a new posix_spawn{p} implementation for Linux.  The main
>>>> difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK
>>>> flags and a direct allocated stack.  The new stack and start function solves
>>>> most the vfork limitation (possible parent clobber due stack spilling).  The
>>>> remaning issue are related to signal handling:
>>>>   1. That no signal handlers must run in child context, to avoid corrupt
>>>>      parent's state.
>>>>   2. Child must synchronize with parent to enforce stack deallocation and
>>>>      to possible return execv issues.
>>>> The first one is solved by blocking all signals in child, even NPTL-internal
>>>> ones (SIGCANCEL and SIGSETXID).  The second issue is done by a stack allocation
>>>> in parent and a synchronization with using a pipe or waitpid (in case or error).
>>>> The pipe has the advantage of allowing the child signal an exec error (checked
>>>> with new tst-spawn2 test).
>>>> There is an inherent race condition in pipe2 usage for architectures that do not
>>>> support the syscall directly.  In such cases the a pipe plus fctnl is used
>>>> instead and it may lead to file descriptor leak in parent (as decribed by fcntl
>>>> documentation).
>>>> The child process stack is allocate with a mmap with MAP_STACK flag using
>>>> default architecture stack size.  Although it is slower than use a stack buffer
>>>> from parent, it allows some slack for the compatibility code to run scripts
>>>> with no shebang (which may use a buffer with size depending of argument list
>>>> count).
>>>> Performance should be similar to the vfork default posix implementation and
>>>> way faster than fork path (vfork on mostly linux ports are basically
>>>> clone with CLONE_VM plus CLONE_VFORK).  The only difference is the syscalls
>>>> required for the stack allocation/deallocation.
>>>> It fixes BZ#10354, BZ#14750, and BZ#18433.
>>>> Tested on i386, x86_64, powerpc64le, and aarch64.
>>> on aarch64 this caused
>>> FAIL: nptl/tst-exec1
>>> with error msg "join in thread in parent returned!?"
>>> i think the cause is that glibc clone sets
>>> tcb.tid = = -1 if CLONE_VM|CLONE_VFORK
>>> is used, which means the vfork child clobbers
>>> the parent tcb.tid. (and pthread_join tests
>>> tcb.tid and thus fails in tst-exec1.)
>>> i believe all targets are affected.
>> I think the problem is glibc's vfork only changes the TCB's pid fields
>> while fork changes both pid and tid in case on CLONE_VM.  Now I am trying
>> to found out why exactly this difference handling of internal fields...
> In fact I think clone.S should not reset the tid member in TCB for
> fork(CLONE_VM|...) and it is what vfork does internally. It is 
> used on INVALID_TD_P and INVALID_NOT_TERMINATED_TD_P macros used on
> pthread_join and pthread_cancel. A value equal to -1 will issue
> an error for such case, which mean that if a multithread program
> issues a clone(CLONE_VM|...) from a posix_spawn, a subsequent
> pthread_join might fail due invalid tid value.
> And I think it is what is happening in your system on tst-exec1:
> although pthread_create is issue before posix_spawn, I think 
> posix_spawn clone might be executing before the pthread_join
> in some cases leading to this issue.
> I will propose a patch to modify clone to not reset the tid value.

But I am not sure if we just should avoid touch any internal TCB
headers if CLONE_VFORK is set, thoughts?

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]