Bug 25680 - ifuncmain9picstatic and ifuncmain9picstatic crash in IFUNC resolver due to stack canary (--enable-stack-protector=all)
Summary: ifuncmain9picstatic and ifuncmain9picstatic crash in IFUNC resolver due to st...
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link (show other bugs)
Version: 2.31
: P3 minor
Target Milestone: 2.34
Assignee: Nick Alcock
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-15 10:15 UTC by Sergei Trofimovich
Modified: 2021-03-15 16:37 UTC (History)
5 users (show)

See Also:
Host: i686-pc-linux-gnu
Target:
Build: x86_64-pc-linux-gnu
Last reconfirmed: 2020-04-05 00:00:00
fw: security-


Attachments
inhibit_stack_protector_patch (389 bytes, patch)
2020-06-25 22:37 UTC, Dave Hughes
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sergei Trofimovich 2020-03-15 10:15:01 UTC
Bug is originally reported as a https://bugs.gentoo.org/712356.

In this case the following tests fail:

    FAIL: elf/ifuncmain9picstatic
    FAIL: elf/ifuncmain9static

The crash seem to happen at a point when we access TLS canary before TLS segment is initialized (yes?)

$ /tmp/portage/sys-libs/glibc-9999:gdb --quiet --args work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ifuncmain9static
Reading symbols from work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ifuncmain9static...
(gdb) run
Starting program: /tmp/portage/sys-libs/glibc-9999/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ifuncmain9static

Program received signal SIGSEGV, Segmentation fault.
0xf7f5a0cd in resolver () at ifuncmain9.c:47
47	{
(gdb) bt
#0  0xf7f5a0cd in resolver () at ifuncmain9.c:47
#1  0xf7f8ddc2 in elf_machine_rel (skip_ifunc=0, reloc_addr_arg=0xf7ffb098 <*ABS*@got.plt>, version=0x0, sym=0xf7f561ac, reloc=0xf7f58b80,
    map=0xf7ffba80 <_dl_main_map>) at ../sysdeps/i386/dl-machine.h:484
#2  elf_dynamic_do_Rel (skip_ifunc=0, lazy=0, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=<optimized out>)
    at do-rel.h:170
#3  _dl_relocate_static_pie () at dl-reloc-static-pie.c:49
#4  0xf7f5a4f8 in __libc_start_main (main=0xf7f59930 <main>, argc=1, argv=0xffffca14, init=0xf7f5af20 <__libc_csu_init>,
    fini=0xf7f5afc0 <__libc_csu_fini>, rtld_fini=0x0, stack_end=0xffffca0c) at ../csu/libc-start.c:144
#5  0xf7f59f62 in _start () at ../sysdeps/i386/start.S:113
(gdb) disassemble
Dump of assembler code for function resolver:
   0xf7f5a0c0 <+0>:	call   0xf7f5a105 <__x86.get_pc_thunk.ax>
   0xf7f5a0c5 <+5>:	add    $0xa0f3b,%eax
   0xf7f5a0ca <+10>:	sub    $0x1c,%esp
=> 0xf7f5a0cd <+13>:	mov    %gs:0x14,%ecx
   0xf7f5a0d4 <+20>:	mov    %ecx,0xc(%esp)
   0xf7f5a0d8 <+24>:	xor    %ecx,%ecx
   0xf7f5a0da <+26>:	mov    0x1304(%eax),%edx
   0xf7f5a0e0 <+32>:	add    $0x1,%edx
   0xf7f5a0e3 <+35>:	mov    %edx,0x1304(%eax)
   0xf7f5a0e9 <+41>:	mov    0xc(%esp),%ecx
   0xf7f5a0ed <+45>:	sub    %gs:0x14,%ecx
   0xf7f5a0f4 <+52>:	jne    0xf7f5a100 <resolver+64>
   0xf7f5a0f6 <+54>:	lea    -0xa0f90(%eax),%eax
   0xf7f5a0fc <+60>:	add    $0x1c,%esp
   0xf7f5a0ff <+63>:	ret
   0xf7f5a100 <+64>:	call   0xf7f8ab70 <__stack_chk_fail>
End of assembler dump.

glibc was build with the following configure options:

 *       Manual CC:   x86_64-pc-linux-gnu-gcc -m32
 * Running do_src_configure for ABI x86
 * Configuring glibc for nptl
 *             ABI:   x86
 *          CBUILD:   x86_64-pc-linux-gnu
 *           CHOST:   x86_64-pc-linux-gnu
 *         CTARGET:   x86_64-pc-linux-gnu
 *      CBUILD_OPT:   i686-pc-linux-gnu
 *     CTARGET_OPT:   i686-pc-linux-gnu
 *              CC:   x86_64-pc-linux-gnu-gcc -m32
 *             CXX:
 *              LD:
 *         ASFLAGS:
 *          CFLAGS:   -march=sandybridge -mtune=sandybridge -pipe -fdiagnostics-show-option -Wall -Wextra -Wstack-protector -g -O2
 *        CPPFLAGS:
 *        CXXFLAGS:   -march=sandybridge -mtune=sandybridge -pipe -fdiagnostics-show-option -Wall -Wextra -Wstack-protector -g -O2
 *         LDFLAGS:   -Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu
 *        MAKEINFO:   /dev/null
 *       Manual CC:   x86_64-pc-linux-gnu-gcc -m32 -march=sandybridge -mtune=sandybridge -pipe -fdiagnostics-show-option -Wall -Wextra -Wstack-protector -g -O2 -Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu
 *      Manual CXX:   x86_64-pc-linux-gnu-g++ -m32 -march=sandybridge -mtune=sandybridge -pipe -fdiagnostics-show-option -Wall -Wextra -Wstack-protector -g -O2

/tmp/portage/sys-libs/glibc-9999/work/glibc-9999/configure --enable-stack-protector=all --enable-stackguard-randomization --disable-cet --enable-kernel=3.2.0 --without-selinux --without-cvs --disable-werror --enable-bind-now --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --disable-profile --without-gd --with-headers=/usr/include --prefix=/usr --sysconfdir=/etc --localstatedir=/var --libdir=$(prefix)/lib --mandir=$(prefix)/share/man --infodir=$(prefix)/share/info --libexecdir=$(libdir)/misc/glibc --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion=Gentoo 9999 p16 --enable-crypt --enable-static-pie --disable-systemtap --disable-nscd --disable-timezone-tools
Comment 1 Sergei Trofimovich 2020-03-15 10:18:09 UTC
I'm not sure what requirements for IFUNC resolver() there are.

Should it always run after canary is initialized? Or should resolver function always be marked as exempt from stack protection with 'inhibit_stack_protector' or similar?
Comment 2 Florian Weimer 2020-03-15 10:39:17 UTC
You should not use --enable-stack-protector=all, but --enable-stack-protector=strong.  “all” is very unlikely to add additional protection. If you do that, the IFUNC resolver will not be instrumented (because it has no local addressable variables, so no stack buffer overflow is possible), so the bug goes away.

I'm not sure if we can or should order the startup sequence for statically linked binaries differently, so I'm not closing this bug.
Comment 3 Sergei Trofimovich 2020-03-15 12:18:35 UTC
(In reply to Florian Weimer from comment #2)
> You should not use --enable-stack-protector=all, but
> --enable-stack-protector=strong.  “all” is very unlikely to add additional
> protection. If you do that, the IFUNC resolver will not be instrumented
> (because it has no local addressable variables, so no stack buffer overflow
> is possible), so the bug goes away.
> 
> I'm not sure if we can or should order the startup sequence for statically
> linked binaries differently, so I'm not closing this bug.

Oh, that's surprising to hear. Gentoo will have to switch the default from =all to =strong then.

Does it mean --enable-stack-protector=all is not really a supported option for glibc? If' it's not I can hack a patch that complains loudly about the option being experimental.

My main worry is that resolver in ifuncmain9picstatic's source code only happens to be a part of glibc. It could have been an external program. Any external program might decide to enable -fstack-protector-all and will get non-working resolver as a result. Is there a doc that describes this and other limitations imposed on IFUNC resolves by glibc? I wonder what else we break by accident.
Comment 4 Florian Weimer 2020-03-15 12:25:58 UTC
(In reply to Sergei Trofimovich from comment #3)
> Does it mean --enable-stack-protector=all is not really a supported option
> for glibc? If' it's not I can hack a patch that complains loudly about the
> option being experimental.

It is supported in the sense that it is there and you can build glibc with it, but some edge cases in the test suite have not been covered. (At this time, I'm not sure if this is a test bug or a bug in the static initialization sequence, though.)

> My main worry is that resolver in ifuncmain9picstatic's source code only
> happens to be a part of glibc. It could have been an external program. Any
> external program might decide to enable -fstack-protector-all and will get
> non-working resolver as a result. Is there a doc that describes this and
> other limitations imposed on IFUNC resolves by glibc? I wonder what else we
> break by accident.

It's a valid concern. We probably have to reorder the initialization sequence because on some architecture (notably powerpc as of commit 67385a01d229751569b6aac067ffdcd813a15d7a ("powerpc: Add hwcap/hwcap2/platform data to TCB."), IFUNC resolvers are expected to access the TCB to choose the appropriate implementation.
Comment 5 Nick Alcock 2020-04-05 19:17:41 UTC
I can confirm this. IFUNC resolvers should be marked with inhibit_stack_protector: or, at least, when I added --enable-stack-protector=all, I marked all then present in glibc that way (and nobody said it was a bad idea). I don't see any other way to do it without radical reorganization of ld.so, for more or less no gain. IFUNC resolvers have lots of obscure requirements on them anyway: this is just another one.

I'll submit a patch fixing this in the next few days (probably next weekend, over Easter) -- though the patch is so trivial that anyone else who's already got a build-many-glibcs setup that doesn't need major surgery to revive it (unlike me) is welcome to come up with it earlier :)

(taken.)
Comment 6 Nick Alcock 2020-04-05 19:20:22 UTC
(btw, I note that --enable-stack-protector=all *has* spotted two bugs in glibc that no lower level has identified. Another way of putting this, of course, is that it's only spotted two -- and one of them would have been entirely harmless without the canary breaking things :) Personally, I run with =strong everywhere except for firewalls, which get =all.)
Comment 7 Dave Hughes 2020-06-25 22:37:24 UTC
Created attachment 12663 [details]
inhibit_stack_protector_patch

Patch posted to libc-alpha ML and also attached here. I went with Nick's advice with inhibit_stack_protector like he done previously. Tests no longer failing.
Comment 8 Sourceware Commits 2021-03-15 16:25:58 UTC
The master branch has been updated by Siddhesh Poyarekar <siddhesh@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=03f42a56eb4e88601ebb334787c8198156197b29

commit 03f42a56eb4e88601ebb334787c8198156197b29
Author: David Hughes <davidhughes205@gmail.com>
Date:   Mon Mar 15 20:23:39 2021 +0530

    Add inhibit_stack_protector to ifuncmain9 [BZ #25680]
    
    Enabling --enable-stack-protector=all causes the following tests to fail:
    
        FAIL: elf/ifuncmain9picstatic
        FAIL: elf/ifuncmain9static
    
    Nick Alcock (who committed the stack protector code) marked the IFUNC
    resolvers with inhibit_stack_protector when he done the original work and
    suggested doing so again @ BZ #25680. This patch adds
    inhibit_stack_protector to ifuncmain9.
    
    After patch is applied, --enable-stack-protector=all does not fail the
    above tests.
    
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Comment 9 Siddhesh Poyarekar 2021-03-15 16:36:33 UTC
Fix is in mainline, thanks for the patch Dave!

I've also created bug 27582 because it is possible (but just a bit more work to vwrite and verify than I can put in at the moment) to make this test non-flaky with stack-protector-all.