This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] posix: Optimize Linux posix_spawn
On 08/05/2019 15:37, Adhemerval Zanella wrote:
>
>
> On 08/05/2019 15:03, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> On 08/05/2019 12:26, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>>> With an on-stack array for the stack, it looks like this:
>>>>>>
>>>>>> +-----------------------------
>>>>>> | Activation record for f1
>>>>>> |
>>>>>> | [ Array
>>>>>> | [ +-----------------------------
>>>>>> | [ | Activation record for f2
>>>>>> | [ |
>>>>>> | [ |
>>>>>> |
>>>>>> | (*)
>>>>>> |
>>>>>> :
>>>>>> +=== Guard ===================
>>>>>>
>>>>>>
>>>>>> The problem is that technically, the activation record of f1 continues
>>>>>> after the array, indicated by (*), and we won't know what the compiler
>>>>>> will store there. This means that even if f2 is built with
>>>>>> -fstack-clash-protection, stack overflow detection is unreliable because
>>>>>> the overflow into f1's activation record does not hit a guard page.
>>>>>
>>>>> Thanks for explanation, but I still failing to see exactly why even with
>>>>> -fstack-clash-protection f2 probing will fail to hit the guard page.
>>>>>
>>>>> The f2 probing will query page size based on stack address based on f1 stack
>>>>> address, so the action record (which I assume you meant to be the prologue
>>>>> probe based on stack allocation added by compiler) a at f2 will essentially
>>>>> probe f1 stack memory. Following your diagram, a potentially stack overflow
>>>>> on f2 would do:
>>>>>
>>>>> +-----------------------------
>>>>> | Activation record for f1
>>>>> |
>>>>> | [ Array
>>>>> | [ +-----------------------------
>>>>> | [ | Activation record for f2
>>>>> | [ |
>>>>> | [ |
>>>>> |
>>>>> | (f1act)
>>>>> |
>>>>> :
>>>>> +=== Guard ===================
>>>>> : (f2act)
>>>>>
>>>>> My understanding is even -fstack-protector would work as well, since a
>>>>> buffer overrun on f2 will still overwrite f1 canary. I am not seeing why
>>>>> this specific *clone* usage is different than a function call and if it
>>>>> is indeed it might be troublesome since it s a potentially vector of
>>>>> attack.
>>>>
>>>> The problem is not the case where f2 actually hits the guard page. You
>>>> are correct that this is detected as before.
>>>>
>>>> What is not detected is a partial overflow that hits the activation
>>>> record of f1 *outside the array*, but stops short of extending to the
>>>> guard page. It's impossible to tell what will happen in this case
>>>> because we do not know the frame layout that GCC has produced for f1.
>>>
>>> But would in this case protected as well with -fstack-protector, where
>>> f2 would overwrite f1 canary?
>>
>> No, the canary is at the top of the activation record, and it would not
>> be checked upon return from f2 anyway.
>
> But f2 canary would be checked anyway (if it is also built with
> stack hardening). I give you that it still might be an issue for
> a vfork interface which might issue user provided callbacks, but I
> am not sure the really net gain for this specific scenario.
>
>>
>>>> The issue can only be avoided reliably by writing f1 in assembler, I
>>>> think. INTERNAL_SYSCALL is probably not safe for vfork.
>>>
>>> In any case, I still think the specific scenario is quite an overkill
>>> one to block this optimization. Specially because the possible adjustments
>>> is somewhat infeasible (rewrite the helper spawn function in assembly,
>>> or the impossibility to use INTERNAL_SYSCALL which may itself cause a
>>> stack allocation) or add a arch-specific vfork variant to handle this
>>> specific case.
>>
>> I do not have a strong opinion regarding this matter. I'd still prefer
>> using vfork here. I really don't understand your reluctance in this
>> matter.
>
> Besides the optimization, I am not really convinced that the security gain
> is really expressive in this case, the current scenario that invokes
> mmap is overkill for specific usage, since there no guard page current
> scenario is not really better security-wise, and the other scenarios
> you are suggestion adds even more complexity.
Also, for current case on posix_spawn it is a fixed code path with fixed
stack usage in help process, so assuming a large enough stack buffer
(which I will double check with -fstack-usage) it would never overflow.
It also does not prevent to adapt to the vfork-like interface you are
suggesting once it is implemented (which does seems appealing after
this discussion).