This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] posix: Optimize Linux posix_spawn


* Adhemerval Zanella:

> On 07/05/2019 05:33, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> The current internal posix_spawn symbol for Linux (__spawni) requires
>>> to allocate a dynamic stack based on input arguments to handle the
>>> SPAWN_XFLAGS_USE_PATH internal flag, which re-issue the input binary
>>> as a shell script if execve call return ENOEXEC (to execute shell
>>> scripts with an initial shebang).
>>>
>>> This is done only for compatibility mode and the generic case does not
>>> require the extra calculation plus the potential large mmap/munmap
>>> call.  For default case, a pre-defined buffer is sufficed to use on the
>>> clone call instead.
>>>
>>> This patch optimizes Linux spawni by allocating a dynamic stack only
>>> for compatibility symbol (SPAWN_XFLAGS_USE_PATH).  For generic case,
>>> a stack allocated buffer is used instead.
>> 
>> I'm still not convinced that stack switching is required here.
>
> - Performance-wise it would be just a stack increment (which tends to
>   better than a mmap/unmap), 
>
> - It also should work with any stack hardening (--enable-stack-protector=all
>   shows no issue) and the stack uses uses the calling process stack as well
>   (which should trigger overflow due guard pages).

No, since there is no guard page as part of the on-stack array, we do
not detect any overflow into other parts of the stack frame if it
happens to be stored below the frame.  So this still disables some of
the hardening checks.

>   It also should issue the stackoverflow earlier since it allocated the
>   stack prior helper process usage.

That is true, but we won't know if the region is large enough.

> - We do not enable cancellation, neither its uses PLT resolution, so stack
>   usage should be limited to around the functions usage (such as the execvpe
>   provision).

Hmm.  Right, we should not trigger lazy binding if we applied PLT
avoidance successfully.  So this is not a problem here.

> - It does not require a arch-specific way to query the stack pointer 
>   (__builtin_frame_address (0) should work, but it would require some testing
>   to see it is safe on all supported architectures and at least for ia64 we
>   would need to set up a artificial maximum size which might overflow depending of
>   the stack usage on the call).

It's also not clear to me how we could compute a safe stack adjustment
from C code.

> I am more inclined to last point, CLONE would be a function call and it will
> itself require some stack allocation in most ABIs. To add a real-safe clone
> call, we will need to add a variant of clone which uses the current stack
> pointer on clone call itself, not as an argument. And I really think this
> is overkill for this change, taking on consideration the little gain for this
> specific usage and required work.

I agree that this work should be undertaken as part of a clone_samestack
or similar implementation.

Thanks,
Florian


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]