Bug 27004 - ld.so is miscompiled by GCC 11
Summary: ld.so is miscompiled by GCC 11
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link (show other bugs)
Version: 2.33
: P2 normal
Target Milestone: 2.33
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 27033 (view as bug list)
Depends on:
Blocks:
 
Reported: 2020-12-02 23:06 UTC by H.J. Lu
Modified: 2020-12-08 00:46 UTC (History)
5 users (show)

See Also:
Host:
Target: x86-64
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2020-12-02 23:06:25 UTC
GCC 11 with

commit d5ac0401eb128bf3dadec943741dfde7c499e49a
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Tue Nov 17 13:52:15 2020 -0600

    Relocatable read-only section support for absolute jump table

compiles _dl_lookup_symbol_x into

(gdb) r --direct
Starting program: /export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/sunrpc/tst-getmyaddr --direct

Program received signal SIGSEGV, Segmentation fault.
_dl_lookup_symbol_x (undef_name=0x7ffff7ff416a "__vdso_clock_gettime", 
    undef_map=0x7ffff7ffe7b0, ref=0x7fffffffda98, symbol_scope=0x7ffff7ffeb48, 
    version=0x7fffffffdac0, type_class=0, flags=0, skip_map=0x0)
    at dl-lookup.c:929
929	      && add_dependency (undef_map, current_value.m, flags) < 0)
(gdb) disass
Dump of assembler code for function _dl_lookup_symbol_x:
   0x00007ffff7fdb8c0 <+0>:	push   %r15
   0x00007ffff7fdb8c2 <+2>:	push   %r14
   0x00007ffff7fdb8c4 <+4>:	push   %r13
   0x00007ffff7fdb8c6 <+6>:	push   %r12
   0x00007ffff7fdb8c8 <+8>:	mov    %rdi,%r12
   0x00007ffff7fdb8cb <+11>:	push   %rbp
   0x00007ffff7fdb8cc <+12>:	mov    %rdx,%rbp
   0x00007ffff7fdb8cf <+15>:	push   %rbx
=> 0x00007ffff7fdb8d0 <+16>:	mov    %fs:0x10,%rax
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ %fs isn't initialized yet.
   0x00007ffff7fdb8d9 <+25>:	sub    $0xa8,%rsp
   0x00007ffff7fdb8e0 <+32>:	mov    %rsi,0x10(%rsp)
   0x00007ffff7fdb8e5 <+37>:	mov    %rcx,0x20(%rsp)
   0x00007ffff7fdb8ea <+42>:	mov    %r8,0x8(%rsp)
   0x00007ffff7fdb8ef <+47>:	mov    %r9d,0x1c(%rsp)
   0x00007ffff7fdb8f4 <+52>:	mov    %rax,0x30(%rsp)
   0x00007ffff7fdb8f9 <+57>:	movzbl (%r12),%edx
   0x00007ffff7fdb8fe <+62>:	test   %dl,%dl
   0x00007ffff7fdb900 <+64>:	je     0x7ffff7fdbb40 <_dl_lookup_symbol_x+640>
   0x00007ffff7fdb906 <+70>:	mov    %r12,%rcx
   0x00007ffff7fdb909 <+73>:	mov    $0x1505,%ebx
   0x00007ffff7fdb90e <+78>:	xchg   %ax,%ax
   0x00007ffff7fdb910 <+80>:	mov    %rbx,%rax
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) b main
Breakpoint 1 at 0x4022f0: file ../support/test-driver.c, line 110.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/sunrpc/tst-getmyaddr --direct

Program received signal SIGSEGV, Segmentation fault.
_dl_lookup_symbol_x (undef_name=0x7ffff7ff416a "__vdso_clock_gettime", 
    undef_map=0x7ffff7ffe7b0, ref=0x7fffffffda98, symbol_scope=0x7ffff7ffeb48, 
    version=0x7fffffffdac0, type_class=0, flags=0, skip_map=0x0)
    at dl-lookup.c:929
929	      && add_dependency (undef_map, current_value.m, flags) < 0)
(gdb) bt
#0  _dl_lookup_symbol_x (undef_name=0x7ffff7ff416a "__vdso_clock_gettime", 
    undef_map=0x7ffff7ffe7b0, ref=0x7fffffffda98, symbol_scope=0x7ffff7ffeb48, 
    version=0x7fffffffdac0, type_class=0, flags=0, skip_map=0x0)
    at dl-lookup.c:929
#1  0x00007ffff7fd400f in dl_vdso_vsym (
    name=0x7ffff7ff416a "__vdso_clock_gettime")
    at ../sysdeps/unix/sysv/linux/dl-vdso.h:52
#2  setup_vdso_pointers () at ../sysdeps/unix/sysv/linux/dl-vdso-setup.h:30
#3  dl_main (phdr=<optimized out>, phnum=13, user_entry=<optimized out>, 
    auxv=0x7fffffffdfe8) at rtld.c:1620
#4  0x00007ffff7feac47 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0x7fffffffddf0, 
    dl_main=dl_main@entry=0x7ffff7fd2eb0 <dl_main>) at ../elf/dl-sysdep.c:252
#5  0x00007ffff7ff1fd5 in _dl_start_final (arg=0x7fffffffddf0) at rtld.c:485
#6  _dl_start (arg=0x7fffffffddf0) at rtld.c:578
#7  0x00007ffff7fd2058 in _start () at rtld.c:12
#8  0x0000000000000002 in ?? ()
#9  0x00007fffffffe145 in ?? ()
#10 0x00007fffffffe198 in ?? ()
#11 0x0000000000000000 in ?? ()
(gdb)
Comment 1 H.J. Lu 2020-12-02 23:51:33 UTC
commit 520d5ad337eaa15860a5a964daf7ca46cf31c029
Author: Jan Hubicka <jh@suse.cz>
Date:   Sat Nov 14 13:52:36 2020 +0100

    Detect EAF flags in ipa-modref
    
    A minimal patch for the EAF flags discovery.  It works only in local ipa-modref
    and gives up on cyclic SSA graphs.  It improves pt_solution_includes
    disambiguations twice.
    
is the first bad commit.
Comment 2 Jakub Jelinek 2020-12-03 11:24:03 UTC
Only lightly tested fix, i?86 will need similar change (with __seg_gs):

diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index a08bf972de..ccb5f24d92 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -180,11 +180,16 @@ _Static_assert (offsetof (tcbhead_t, __glibc_unused2) == 0x80,
    assignments like
        pthread_descr self = thread_self();
    do not get optimized away.  */
-# define THREAD_SELF \
+# if __GNUC_PREREQ (6, 0)
+#  define THREAD_SELF \
+  (*(struct pthread *__seg_fs *) offsetof (struct pthread, header.self))
+# else
+#  define THREAD_SELF \
   ({ struct pthread *__self;                                                 \
      asm ("mov %%fs:%c1,%0" : "=r" (__self)                                  \
          : "i" (offsetof (struct pthread, header.self)));                    \
      __self;})
+# endif
 
 /* Magic for libthread_db to know how to do THREAD_SELF.  */
 # define DB_THREAD_SELF_INCLUDE  <sys/reg.h> /* For the FS constant.  */
Comment 3 H.J. Lu 2020-12-03 12:37:15 UTC
How about this

diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index a08bf972de..4eeab0e7a9 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -180,11 +180,19 @@ _Static_assert (offsetof (tcbhead_t, __glibc_unused2) == 0x80,
    assignments like
 	pthread_descr self = thread_self();
    do not get optimized away.  */
-# define THREAD_SELF \
+# if __GNUC_PREREQ (11, 0)
+#  define THREAD_SELF \
+  ({ struct pthread *__self;						      \
+     __self = (struct pthread *) (__builtin_thread_pointer ()		      \
+				  + offsetof (struct pthread, header.self));  \
+     __self;})
+# else
+#  define THREAD_SELF \
   ({ struct pthread *__self;						      \
      asm ("mov %%fs:%c1,%0" : "=r" (__self)				      \
 	  : "i" (offsetof (struct pthread, header.self)));	 	      \
      __self;})
+# endif
 
 /* Magic for libthread_db to know how to do THREAD_SELF.  */
 # define DB_THREAD_SELF_INCLUDE  <sys/reg.h> /* For the FS constant.  */
Comment 4 Florian Weimer 2020-12-03 12:43:16 UTC
__builtin_thread_pointer is potentially trapping (which is the root of the problem). Does GCC know about that? It also results in a load followed by an add, instead of a single load, I think. Using the __segfs or __seggs namespaces looks preferable to me.
Comment 5 Jakub Jelinek 2020-12-03 12:51:00 UTC
(In reply to Florian Weimer from comment #4)
> __builtin_thread_pointer is potentially trapping (which is the root of the
> problem). Does GCC know about that? It also results in a load followed by an
> add, instead of a single load, I think. Using the __segfs or __seggs
> namespaces looks preferable to me.

GCC makes it const __attribute__((nothrow)), and that is I think ok except in the dynamic linker.

I don't think
+     __self = (struct pthread *) (__builtin_thread_pointer ()		      \
+				  + offsetof (struct pthread, header.self));  \
does what the old code did, which was movq %fs:16, __self
but your version is movq %fs:0, %reg; leaq 16(%reg), __self
That would be *(struct pthread **) (__builtin_thread_pointer () + offsetof (struct pthread, header.self))
if we optimize that back to just movq %fs:16, __self
But, as it is for x86 GCC 11+ only, isn't it better to use __seg_{f,g}s that should work already since GCC 6?
Comment 6 H.J. Lu 2020-12-03 12:51:36 UTC
(In reply to Florian Weimer from comment #4)
> __builtin_thread_pointer is potentially trapping (which is the root of the
> problem). Does GCC know about that? It also results in a load followed by an
> add, instead of a single load, I think. Using the __segfs or __seggs
> namespaces looks preferable to me.

OK. Shouldn't all %fs references be replaced by __seg_fs?
Comment 7 Florian Weimer 2020-12-03 12:52:32 UTC
Fixed for glibc 2.33 by:

commit 1d9cbb96082e646de7515a1667efa041ffb79958
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Dec 3 13:33:44 2020 +0100

    x86: Fix THREAD_SELF definition to avoid ld.so crash (bug 27004)
    
    The previous definition of THREAD_SELF did not tell the compiler
    that %fs (or %gs) usage is invalid for the !DL_LOOKUP_GSCOPE_LOCK
    case in _dl_lookup_symbol_x.  As a result, ld.so could try to use the
    TCB before it was initialized.
    
    As the comment in tls.h explains, asm volatile is undesirable here.
    Using the __seg_fs (or __seg_gs) namespace does not interfere with
    optimization, and expresses that THREAD_SELF is potentially trapping.
Comment 8 Florian Weimer 2020-12-03 12:53:44 UTC
(In reply to H.J. Lu from comment #6)
> OK. Shouldn't all %fs references be replaced by __seg_fs?

Yes, I see we require GCC 6.2 as a minimum. So the __GLIBC_PREREQ is actually unnecessary.
Comment 9 H.J. Lu 2020-12-08 00:46:03 UTC
*** Bug 27033 has been marked as a duplicate of this bug. ***