use of %fs segment register in x86_64 with -fstack-check

Ruslan Kabatsayev b7.10110111@gmail.com
Tue Mar 3 20:03:00 GMT 2020


On Tue, 3 Mar 2020 at 21:37, Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> Hi Ruslan, thank you for your explanations. Unfortunately, I still
> can't see the whole picture.
>
> On Tue, 3 Mar 2020 at 16:51, Ruslan Kabatsayev <b7.10110111@gmail.com> wrote:
> > Not quite. As noted at [1] this OR is to ensure that stack hasn't
> > overflowed. This is the part added by -fstack-check (you can see it go
> > away when you remove this option). See [2] for documentation.
>
> I don't understand how the OR insns check that the stack hasn't overflowed.
>
> From [1], the author writes "it just inserts a NULL byte". What is
> *it* in this context? I don't see anyone writing anything to the stack
> in the assembly. Does linux do it on our behalf, and then the OR insns
> check that those bytes are indeed NULL?
>
> Furthermore, I can't see who uses the result of the OR operation. I'm
> under the impression that there is some page fault magic happening
> under the hood, but what is that magic? No insns after the ORs perform
> any conditional jumps based on the ORs results that I can see
> (although I am not very knowledgeable about x86_64 asm.) So I am still
> confused.
>
> I did read [2] before posting, but unfortunately I didn't find it very helpful.
>
> I tried to step through each insn in my head to demonstrate where I don't get it:
>
> 0x555555554560 <main>           sub    $0x2f78,%rsp
> Ok, whatever %rsp was, it's now %rsp - 12152. That's a lot more than
> 8000, but fine.
> Let's call %rsp before we subtracted it "%original".
>
> 0x555555554567 <main+7>         orq    $0x0,0xf58(%rsp)
> Ok, we OR with memory location %rsp + 3928. Taking into account the
> previous offset, we're accessing %original + (3928 - 12152) which is
> %original - 8224. So this is about 200 bytes after the stack array
> ends. The instruction doesn't change the value at 0xf58(%rsp). My
> understanding is that this instruction will fetch the quadword at
> 0xf58(%rsp), OR it with $0x0, and then store the result of that
> computation back to the same address. How does this check that no
> stack overflow has occurred?
>
> 0x555555554570 <main+16>        orq    $0x0,(%rsp)
> We do it again, this time at %original - 12152 (the bottom of the
> stack). Is this because we might span over two pages?

Not merely "might": we _do_ span more than one page. Pages are 4096
bytes in size, and the frame here is 12152 bytes.
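
The offsets in the quoted walkthrough can be checked with a few lines
of C. This is only a sketch of the arithmetic: the constants come from
the disassembly, while the function names and the convention of
measuring everything relative to the value of %rsp on entry
("%original") are made up for illustration.

```c
#include <assert.h>

enum { PAGE_BYTES = 4096 };  /* page size stated above */

/* Offset of disp(%rsp), relative to the value of %rsp on entry
 * ("%original"), once `sub $frame,%rsp` has executed. */
long probe_offset(long frame, long disp)
{
    assert(frame >= 0 && disp >= 0);
    return disp - frame;
}

/* Number of pages needed to hold `bytes` bytes. */
long pages_needed(long bytes)
{
    assert(bytes >= 0);
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}
```

With the constants from the disassembly, probe_offset(0x2f78, 0xf58)
is -8224 and probe_offset(0x2f78, 0) is -12152, matching the quoted
numbers. The two probes are 3928 bytes apart, which is less than one
page, and pages_needed(0x2f78) is 3.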

>
> 0x555555554575 <main+21>        add    $0x1020,%rsp
> Now we set %rsp to be %original - 8024. So now we are actually
> pointing to the stack byte just after the large array.
>
> 0x55555555457c <main+28>        mov    %rsp,%rdi
> Now we save %rsp to %rdi, despite %rdi not being used anywhere... not
> sure about this one.

Actually it _is_ used, namely in the callee. That's how the first
integer argument is passed; see the System V x86-64 psABI for more
details. So RSP (and RDI) now contain the address of the first byte of
the array.

>
> 0x55555555457f <main+31>        mov    %fs:0x28,%rax
> Load the magic sentinel pattern, OK.
>
> 0x555555554588 <main+40>        mov    %rax,0x1f48(%rsp)
> 0x1f48 corresponds to %original - 16. So we are writing a sentinel
> value to almost the start of the stack for this func.
>
> 0x555555554590 <main+48>        xor    %eax,%eax
> 0x555555554592 <main+50>        callq  0x5555555546d0 <foo>
>
> Clear %eax for foo's return value and call foo.

No, it's not clearing it for the return value. The return type of foo
is void, so this must be something else. I'd guess it's clearing the
sentinel value so that foo doesn't have easy access to it. Otherwise
the value could somehow (e.g. due to an uninitialized variable) be
written by foo into the area being protected, which would defeat the
protector's efforts, since stack smashing would then go undetected.

>
> 0x555555554597 <main+55>        mov    0x1f48(%rsp),%rdx
> 0x55555555459f <main+63>        xor    %fs:0x28,%rdx
> 0x5555555545a8 <main+72>        jne    0x5555555545b4 <main+84>
>
> Now we double-check that the sentinel value at %original - 16 is
> exactly the same as it was before we called foo, and if it isn't, we
> go to __stack_chk_fail. So, this protects us against the case where
> foo trashed the start of our stack?

Yes, this protects us against the case where a buffer overrun
overwrites the return address and thus possibly lands us in malicious
code on return (if the overrun is being exploited).
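
The store/reload/compare pattern around the call can be imitated in
portable C. The following is a toy model, not what the compiler emits:
the struct layout, the function names and the REFERENCE_GUARD constant
are all hypothetical (the real reference value is read from %fs:0x28),
and the "frame" is an ordinary struct so that the overrun stays
well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference value; the real one lives at %fs:0x28. */
#define REFERENCE_GUARD UINT64_C(0x5bd1e995a7c3f00d)

/* Toy stand-in for a stack frame: the guard sits right above the
 * buffer, where an overrun of buf reaches it first. */
struct toy_frame {
    char buf[16];
    uint64_t guard;
};

/* Prologue: copy the reference value into the frame. */
void frame_enter(struct toy_frame *f)
{
    memset(f->buf, 0, sizeof f->buf);
    f->guard = REFERENCE_GUARD;
}

/* Simulated callee bug: fills n bytes from the start of the frame
 * without checking that n fits into buf. */
void careless_fill(struct toy_frame *f, size_t n)
{
    assert(n <= sizeof *f);
    memset((unsigned char *)f, 'A', n);
}

/* Epilogue: nonzero if the guard no longer matches the reference,
 * mirroring the xor/jne pair before the return. */
int frame_smashed(const struct toy_frame *f)
{
    return (f->guard ^ REFERENCE_GUARD) != 0;
}
```

Filling exactly sizeof buf bytes leaves the guard intact, while a
single extra byte changes it, so frame_smashed() reports the overrun
before anything above the guard would be trusted.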

>
> 0x5555555545aa <main+74>        xor    %eax,%eax
> 0x5555555545ac <main+76>        add    $0x1f58,%rsp
> 0x5555555545b3 <main+83>        retq
>
> Clear our own return value, clean up the stack, and exit.
>
> I just don't understand how the ORs are ensuring the stack hasn't overflowed.

I think this is supposed to ensure that, as you've grown the stack by
some large amount (by subtracting from RSP), the whole allocated space
actually belongs to the stack. Otherwise, you could e.g. grow it by
2 GiB, write to the newly allocated space, and clobber the heap, not
noticing the gap under the lowest stack location. These ORs ensure
that this gap is noticed (and gets you a SIGSEGV).
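
To illustrate the idea, here is a toy model in C. It is a sketch under
loud assumptions: the address space is just an array of page kinds,
the names are invented, and the probing interval of one page is a
simplification rather than the exact scheme -fstack-check uses. A
"probe" that hits a non-stack page stands in for the SIGSEGV.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_PAGE = 4096 };

enum page_kind { PAGE_UNMAPPED, PAGE_STACK, PAGE_HEAP };

/* Kind of the page containing addr in a toy address space that is
 * described by map[0..npages-1]. */
enum page_kind page_at(const enum page_kind *map, size_t npages,
                       size_t addr)
{
    size_t idx = addr / MODEL_PAGE;
    return idx < npages ? map[idx] : PAGE_UNMAPPED;
}

/* Grow the stack downward from sp by size bytes, touching one address
 * every MODEL_PAGE bytes and finally the new stack pointer itself.
 * Returns 1 if every touched page is stack, 0 if a probe would fault. */
int grow_with_probes(const enum page_kind *map, size_t npages,
                     size_t sp, size_t size)
{
    assert(size <= sp);
    for (size_t off = MODEL_PAGE; off < size; off += MODEL_PAGE) {
        if (page_at(map, npages, sp - off) != PAGE_STACK)
            return 0;
    }
    return page_at(map, npages, sp - size) == PAGE_STACK;
}
```

With a map of two heap pages, two unmapped pages and four stack pages,
growing by six pages without probing would put the new stack pointer
straight into the heap, whereas grow_with_probes() stops at the
unmapped gap.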

>
> > Right. But note that this is enabled not by -fstack-check, but rather
> > by some of the -fstack-protector* options that are on by default on
> > modern Linux distributions. You can confirm this by explicitly passing
> > -fno-stack-protector and seeing this sentinel checking gone.
>
> Ok, I see.
>
> > The FS segment base points to the TLS. See [3] and links therein.
> ...
> > It's the offset of stack_guard member of tcbhead_t. See the
> > corresponding glibc source [4].
>
> Got it, thank you.
>
> > [1]: https://stackoverflow.com/a/44670648/673852
> > [2]: https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html
> > [3]: https://chao-tic.github.io/blog/2018/12/25/tls
> > [4]: https://code.woboq.org/userspace/glibc/sysdeps/x86_64/nptl/tls.h.html#42
> >
> > Regards,
> > Ruslan


