This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 3/3] posix: New Linux posix_spawn{p} implementation



On 12-04-2016 15:15, Adhemerval Zanella wrote:
> 
> 
> On 12-04-2016 15:06, Szabolcs Nagy wrote:
>> On 29/02/16 18:33, Adhemerval Zanella wrote:
>>> This patch implements a new posix_spawn{p} implementation for Linux.  The main
>>> difference is that it uses the clone syscall directly with the CLONE_VM and
>>> CLONE_VFORK flags and a directly allocated stack.  The new stack and start
>>> function solve most of the vfork limitations (possible parent clobbering due
>>> to stack spilling).  The remaining issues are related to signal handling:
>>>
>>>   1. No signal handlers must run in the child context, to avoid corrupting
>>>      the parent's state.
>>>   2. The child must synchronize with the parent to enforce stack deallocation
>>>      and to return possible execv errors.
>>>
>>> The first one is solved by blocking all signals in the child, even NPTL-internal
>>> ones (SIGCANCEL and SIGSETXID).  The second is solved by allocating the stack
>>> in the parent and synchronizing through a pipe or waitpid (in case of error).
>>> The pipe has the advantage of allowing the child to signal an exec error
>>> (checked with the new tst-spawn2 test).
>>>
>>> There is an inherent race condition in pipe2 usage for architectures that do not
>>> support the syscall directly.  In such cases a pipe plus fcntl is used
>>> instead, and it may lead to a file descriptor leak in the parent (as described
>>> by the fcntl documentation).
>>>
>>> The child process stack is allocated with mmap with the MAP_STACK flag, using
>>> the default architecture stack size.  Although this is slower than using a
>>> stack buffer from the parent, it allows some slack for the compatibility code
>>> that runs scripts with no shebang (which may use a buffer whose size depends
>>> on the argument list count).
>>>
>>> Performance should be similar to the default vfork-based POSIX implementation
>>> and much faster than the fork path (vfork on most Linux ports is basically
>>> clone with CLONE_VM plus CLONE_VFORK).  The only difference is the syscalls
>>> required for the stack allocation/deallocation.
>>>
>>> It fixes BZ#10354, BZ#14750, and BZ#18433.
>>>
>>> Tested on i386, x86_64, powerpc64le, and aarch64.
>>>
>>
>> on aarch64 this caused
>>
>> FAIL: nptl/tst-exec1
>>
>> with error msg "join in thread in parent returned!?"
>>
>> i think the cause is that glibc clone sets
>> tcb.tid = tcb.pid = -1 if CLONE_VM|CLONE_VFORK
>> is used, which means the vfork child clobbers
>> the parent tcb.tid. (and pthread_join tests
>> tcb.tid and thus fails in tst-exec1.)
>> i believe all targets are affected.
> 
> I think the problem is that glibc's vfork only changes the TCB's pid field,
> while clone changes both pid and tid in the case of CLONE_VM.  Now I am trying
> to find out exactly why these internal fields are handled differently...

In fact I think clone.S should not reset the tid member in the TCB for
clone(CLONE_VM|...), which is how vfork behaves internally.  The tid
member is used by the INVALID_TD_P and INVALID_NOT_TERMINATED_TD_P macros
in pthread_join and pthread_cancel.  A value equal to -1 makes those
checks report an error, which means that if a multithreaded program
issues a clone(CLONE_VM|...) from posix_spawn, a subsequent
pthread_join might fail due to the invalid tid value.

And I think that is what is happening on your system in tst-exec1:
although pthread_create is issued before posix_spawn, I think the
posix_spawn clone might be executing before the pthread_join
in some cases, leading to this issue.

I will propose a patch to modify clone to not reset the tid value.
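
To make the failure mode above concrete, here is a minimal sketch of it as a
standalone model.  It is not glibc code: struct model_tcb and the model_*
functions are hypothetical names, the "tid <= 0" test is my assumption of
what the INVALID_TD_P check amounts to, and the -1 written by the vfork
model is only illustrative.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-in for the thread descriptor (TCB); only the two
   fields discussed in this thread are modeled.  */
struct model_tcb
{
  pid_t tid;
  pid_t pid;
};

/* Models the validity check done by pthread_join.  Assumption: a
   descriptor whose tid is not positive is rejected.  */
static bool
model_invalid_td_p (const struct model_tcb *pd)
{
  return pd->tid <= 0;
}

/* Models the clone behavior described above for CLONE_VM: the child
   shares the parent's memory, so the store lands in the parent's TCB
   and clobbers both fields.  */
static void
model_clone_vm_reset (struct model_tcb *shared)
{
  shared->tid = -1;
  shared->pid = -1;
}

/* Models the vfork behavior described above: only pid is changed, tid
   is left alone.  The value stored here is illustrative.  */
static void
model_vfork_reset (struct model_tcb *shared)
{
  shared->pid = -1;
}
```

With a descriptor that starts out with a positive tid, the vfork-style
reset leaves it valid for the join check, while the clone-style reset
makes the check fail, which matches the tst-exec1 symptom.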

