Bug 14370 - ld.so crashes on mismatched TLS/non-TLS symbols
Summary: ld.so crashes on mismatched TLS/non-TLS symbols
Status: REOPENED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: 2.16
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-18 10:05 UTC by Pawel Sikora
Modified: 2023-07-30 16:53 UTC
CC List: 9 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
testcase (193.83 KB, application/octet-stream)
2012-07-18 10:05 UTC, Pawel Sikora
A patch (959 bytes, patch)
2012-09-04 21:32 UTC, H.J. Lu

Description Pawel Sikora 2012-07-18 10:05:00 UTC
Created attachment 6536
testcase

$ ldd -r SceMiDpiBridge.so
        linux-gate.so.1 (0xf76eb000)
        libsds_server.so => not found
        libxtor_threads_boost.so => not found
        libaldecpli.so => not found
        libsvdpi_exp.so => not found
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0xf755e000)
        libsce_mi.so => not found
        libScemiDpiBridgeApi.so => not found
        libm.so.6 => /lib/libm.so.6 (0xf7520000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7505000)
        librt.so.1 => /lib/librt.so.1 (0xf74fb000)
        libc.so.6 => /lib/libc.so.6 (0xf735b000)
        /lib/ld-linux.so.2 (0xf76ec000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xf7340000)
Floating point exception
Comment 1 Andreas Jaeger 2012-07-18 10:10:48 UTC
Do you have a small test case (source code!) that shows the problem?
Comment 2 Pawel Sikora 2012-07-18 10:48:54 UTC
(In reply to comment #1)
> Do you have a small test case (source code!) that shows the problem?

this is a 3rd-party binary element w/o sources but in the dmesg
i see a trap error:

[61063.240729] ld-linux.so.2[7407] trap divide error ip:f7791c17 sp:ffcd33e0 error:0 in ld-2.16.so[f7787000+1f000]

which points to divide by zero in:

0000abb0 <_dl_try_allocate_static_tls>:
(...)
    ac17:       f7 f1                   div    %ecx
Comment 3 Jakub Jelinek 2012-07-18 10:54:33 UTC
Can you paste output of:
readelf -WSl SceMiDpiBridge.so 
?  Sounds like it must have broken PT_TLS alignment (which would not be a glibc bug).
Comment 4 Pawel Sikora 2012-07-18 10:59:55 UTC
(In reply to comment #3)
> Can you paste output of:
> readelf -WSl SceMiDpiBridge.so 
> ?  Sounds like it must have broken PT_TLS alignment (which would not be a
> glibc bug).

$ readelf -WSl SceMiDpiBridge.so
There are 34 section headers, starting at offset 0x98d88:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .hash             HASH            000000d4 0000d4 002104 04   A  2   0  4
  [ 2] .dynsym           DYNSYM          000021d8 0021d8 004380 10   A  3  18  4
  [ 3] .dynstr           STRTAB          00006558 006558 003dc7 00   A  0   0  1
  [ 4] .gnu.version      VERSYM          0000a320 00a320 000870 02   A  2   0  2
  [ 5] .gnu.version_r    VERNEED         0000ab90 00ab90 000060 00   A  3   2  4
  [ 6] .rel.dyn          REL             0000abf0 00abf0 012ee0 08   A  2   0  4
  [ 7] .rel.plt          REL             0001dad0 01dad0 000358 08   A  2   9  4
  [ 8] .init             PROGBITS        0001de28 01de28 000017 00  AX  0   0  4
  [ 9] .plt              PROGBITS        0001de40 01de40 0006c0 04  AX  0   0  4
  [10] .text             PROGBITS        0001e500 01e500 05aaf0 00  AX  0   0 16
  [11] __libc_freeres_fn PROGBITS        00078ff0 078ff0 0004e4 00  AX  0   0 16
  [12] .fini             PROGBITS        000794d4 0794d4 00001b 00  AX  0   0  4
  [13] .rodata           PROGBITS        00079500 079500 016b70 00   A  0   0 32
  [14] __libc_atexit     PROGBITS        00090070 090070 000004 00   A  0   0  4
  [15] __libc_subfreeres PROGBITS        00090074 090074 00002c 00   A  0   0  4
  [16] .eh_frame_hdr     PROGBITS        000900a0 0900a0 000dfc 00   A  0   0  4
  [17] .eh_frame         PROGBITS        00090e9c 090e9c 003a78 00   A  0   0  4
  [18] .gcc_except_table PROGBITS        00094914 094914 0004a4 00   A  0   0  4
  [19] .ctors            PROGBITS        00095000 095000 00002c 00  WA  0   0  4
  [20] .dtors            PROGBITS        0009502c 09502c 000018 00  WA  0   0  4
  [21] .jcr              PROGBITS        00095044 095044 000004 00  WA  0   0  4
  [22] .data.rel.ro      PROGBITS        00095060 095060 000380 00  WA  0   0 32
  [23] .dynamic          DYNAMIC         000953e0 0953e0 000108 08  WA  3   0  4
  [24] .got              PROGBITS        000954e8 0954e8 0000d8 04  WA  0   0  4
  [25] .got.plt          PROGBITS        000955c0 0955c0 0001b8 04  WA  0   0  4
  [26] .data             PROGBITS        00095780 095780 000f90 00  WA  0   0 32
  [27] .bss              NOBITS          00096720 096710 0014ac 00  WA  0   0 32
  [28] __libc_freeres_ptrs NOBITS          00097bcc 096710 000014 00  WA  0   0  4
  [29] .comment          PROGBITS        00000000 096710 002424 00      0   0  1
  [30] .gnu.warning.llseek PROGBITS        00000000 098b40 00003f 00      0   0 32
  [31] .gnu.warning.sys_errlist PROGBITS        00000000 098b80 000044 00      0   0 32
  [32] .gnu.warning.sys_nerr PROGBITS        00000000 098be0 000041 00      0   0 32
  [33] .shstrtab         STRTAB          00000000 098c21 000167 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Elf file type is DYN (Shared object file)
Entry point 0x1e500
There are 5 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x94db8 0x94db8 R E 0x1000
  LOAD           0x095000 0x00095000 0x00095000 0x01710 0x02be0 RW  0x1000
  DYNAMIC        0x0953e0 0x000953e0 0x000953e0 0x00108 0x00108 RW  0x4
  GNU_EH_FRAME   0x0900a0 0x000900a0 0x000900a0 0x00dfc 0x00dfc R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text __libc_freeres_fn .fini .rodata __libc_atexit __libc_subfreeres .eh_frame_hdr .eh_frame .gcc_except_table
   01     .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss __libc_freeres_ptrs
   02     .dynamic
   03     .eh_frame_hdr
   04
Comment 5 Pawel Sikora 2012-07-18 12:37:25 UTC
(gdb) r
Starting program: /lib/ld-linux.so.2 ./SceMiDpiBridge.so

Program received signal SIGFPE, Arithmetic exception.
0x5655fc17 in _dl_try_allocate_static_tls (map=0x56576918) at dl-reloc.c:69
69        size_t n = (freebytes - blsize) / map->l_tls_align;
(gdb) bt
#0  0x5655fc17 in _dl_try_allocate_static_tls (map=0x56576918) at dl-reloc.c:69
#1  0x5655fca7 in _dl_allocate_static_tls (map=0x56576918) at dl-reloc.c:118
#2  0x56561742 in elf_machine_rel (sym=0xf7f67ca8, skip_ifunc=0, reloc_addr_arg=0xf6f3ffb4, version=<optimized out>, map=0xf71028a0, 
    reloc=<optimized out>) at ../sysdeps/i386/dl-machine.h:436
#3  elf_dynamic_do_Rel (skip_ifunc=0, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, 
    map=0xf71028a0) at do-rel.h:145
#4  _dl_relocate_object (scope=0xf7102a58, reloc_mode=0, consider_profiling=0) at dl-reloc.c:265
#5  0x56557ff3 in dl_main (phdr=0xf7f65034, phnum=5, user_entry=0xffffd3bc, auxv=0xffffd510) at rtld.c:2299
#6  0x56569063 in _dl_sysdep_start (start_argptr=start_argptr@entry=0xffffd450, dl_main=dl_main@entry=0x565566e3 <dl_main>)
    at ../elf/dl-sysdep.c:242
#7  0x565599fd in _dl_start_final (arg=0xffffd450) at rtld.c:337
#8  _dl_start (arg=0xffffd450) at rtld.c:563
#9  0x565561d7 in _start () from /lib/ld-linux.so.2
(gdb) p *map
$1 = {
  l_addr = 4160114688, 
  l_name = 0x5656dd75 "", 
  l_ld = 0xf7ffa3e0, 
  l_next = 0x56576cb0, 
  l_prev = 0x0, 
  l_real = 0x56576918, 
  l_ns = 0, 
  l_libname = 0x56576bf4, 
  l_info =     {0x0,
    0xf7ffa420,
    0xf7ffa468,
    0xf7ffa460,
    0xf7ffa438,
    0xf7ffa440,
    0xf7ffa448,
    0x0,
    0x0,
    0x0,
    0xf7ffa450,
    0xf7ffa458,
    0xf7ffa428,
    0xf7ffa430,
    0x0,
    0x0,
    0x0,
    0xf7ffa480,
    0xf7ffa488,
    0xf7ffa490,
    0xf7ffa470,
    0x0,
    0xf7ffa498,
    0xf7ffa478,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0xf7ffa4a8,
    0xf7ffa4a0,
    0x0,
    0x0,
    0x0,
    0xf7ffa4b8,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0x0,
    0xf7ffa4b0,
    0x0 <repeats 26 times>}, 
  l_phdr = 0xf7f65034, 
  l_entry = 4160238848, 
  l_phnum = 5, 
  l_ldnum = 33, 
  l_searchlist = {
    r_list = 0xf6f1eb40, 
    r_nlist = 19
  }, 
  l_symbolic_searchlist = {
    r_list = 0x56576bf0, 
    r_nlist = 0
  }, 
  l_loader = 0x0, 
  l_versions = 0xf6f1eb90, 
  l_nversions = 6, 
  l_nbuckets = 1031, 
  l_gnu_bitmask_idxbits = 0, 
  l_gnu_shift = 0, 
  l_gnu_bitmask = 0x0, 
  {
    l_gnu_buckets = 0xf7f660f8, 
    l_chain = 0xf7f660f8
  }, 
  {
    l_gnu_chain_zero = 0xf7f650dc, 
    l_buckets = 0xf7f650dc
  }, 
  l_direct_opencount = 1, 
  l_type = lt_library, 
  l_relocated = 0, 
  l_init_called = 0, 
  l_global = 1, 
  l_reserved = 0, 
  l_phdr_allocated = 0, 
  l_soname_added = 0, 
  l_faked = 0, 
  l_need_tls_init = 0, 
  l_auditing = 0, 
  l_audit_any_plt = 0, 
  l_removed = 0, 
  l_contiguous = 1, 
  l_symbolic_in_local_scope = 0, 
  l_free_initfini = 1, 
  l_rpath_dirs = {
    dirs = 0xffffffff, 
    malloced = 0
  }, 
  l_reloc_result = 0x0, 
  l_versyms = 0xf7f6f320, 
  l_origin = 0x56576c18 "/home/users/pawels/ld_fpu_crash/.", 
  l_map_start = 4160114688, 
  l_map_end = 4160736224, 
  l_text_end = 4160724408, 
  l_scope_mem =     {0x56576a74,
    0x0,
    0x0,
    0x0}, 
  l_scope_max = 4, 
  l_scope = 0x56576ad0,
  l_local_scope =     {0x56576a74,
    0x0},
  l_dev = 2306,
  l_ino = 12067959,
  l_runpath_dirs = {
    dirs = 0xffffffff,
    malloced = 0
  },
  l_initfini = 0xf6f1eaf0,
  l_reldeps = 0x0,
  l_reldepsmax = 0,
  l_used = 1,
  l_feature_1 = 0,
  l_flags_1 = 0,
  l_flags = 0,
  l_idx = 0,
  l_mach = {
    plt = 0,
    gotplt = 0,
    tlsdesc_table = 0x0
  },
  l_lookup_cache = {
    sym = 0x0,
    type_class = 0,
    value = 0x0,
    ret = 0x0
  },
  l_tls_initimage = 0x0,
  l_tls_initimage_size = 0,
  l_tls_blocksize = 0,
  l_tls_align = 0,
  l_tls_firstbyte_offset = 0,
  l_tls_offset = 0,
  l_tls_modid = 0,
  l_relro_addr = 0,
  l_relro_size = 0,
  l_serial = 0,
  l_audit = 0x56576b70
}
Comment 6 Carlos O'Donell 2012-07-24 02:04:31 UTC
(In reply to comment #5)
> (gdb) r
> Starting program: /lib/ld-linux.so.2 ./SceMiDpiBridge.so
> 
> Program received signal SIGFPE, Arithmetic exception.
> 0x5655fc17 in _dl_try_allocate_static_tls (map=0x56576918) at dl-reloc.c:69
> 69        size_t n = (freebytes - blsize) / map->l_tls_align;

For this to happen, we must have seen a TLS relocation for a binary that apparently has no TLS, which still indicates that something is wrong with the binary.

Could you please provide the output of `readelf -aW SceMiDpiBridge.so'?
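
The failing division quoted above can be looked at in isolation. What follows is only a simplified sketch, not the glibc code and not a proposed fix: `struct tls_fields` and `count_tls_units` are hypothetical names, and only two members of the real `struct link_map` are modelled. It assumes that an object without a PT_TLS segment leaves `l_tls_align` at 0, which matches the `p *map` dump in comment 5.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the TLS fields of glibc's
   struct link_map; only the two members used below are modelled.  */
struct tls_fields
{
  size_t l_tls_blocksize;
  size_t l_tls_align;
};

/* Compute how many alignment units fit in the free static TLS space.
   Returns 0 and stores the result in *n on success, or -1 when the
   module has no usable TLS segment or not enough room remains.  */
int
count_tls_units (const struct tls_fields *map, size_t freebytes, size_t *n)
{
  assert (map != NULL && n != NULL);

  /* An object without PT_TLS leaves l_tls_align at 0, so dividing by
     it unconditionally would trap; that is the SIGFPE in the report.  */
  if (map->l_tls_align == 0)
    return -1;

  if (freebytes < map->l_tls_blocksize)
    return -1;

  *n = (freebytes - map->l_tls_blocksize) / map->l_tls_align;
  return 0;
}
```

With a zero alignment the function reports failure instead of dividing, which is the behaviour one would want from any guard placed in front of that line.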
Comment 7 Pawel Sikora 2012-07-24 06:14:33 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (gdb) r
> > Starting program: /lib/ld-linux.so.2 ./SceMiDpiBridge.so
> > 
> > Program received signal SIGFPE, Arithmetic exception.
> > 0x5655fc17 in _dl_try_allocate_static_tls (map=0x56576918) at dl-reloc.c:69
> > 69        size_t n = (freebytes - blsize) / map->l_tls_align;
> 
> For this to happen, we must have seen a TLS relocation for a binary that
> apparently has no TLS, which still indicates that something is wrong with the binary.
> 
> Could you please provide the output of `readelf -aW SceMiDpiBridge.so'?

this .so is attached to this PR.
Comment 8 H.J. Lu 2012-09-02 20:00:15 UTC
(In reply to comment #0)
> Created attachment 6536
> testcase
> 
> $ ldd -r SceMiDpiBridge.so
>         linux-gate.so.1 (0xf76eb000)
>         libsds_server.so => not found
>         libxtor_threads_boost.so => not found
>         libaldecpli.so => not found
>         libsvdpi_exp.so => not found
>         libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0xf755e000)
>         libsce_mi.so => not found
>         libScemiDpiBridgeApi.so => not found
>         libm.so.6 => /lib/libm.so.6 (0xf7520000)
>         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7505000)
>         librt.so.1 => /lib/librt.so.1 (0xf74fb000)
>         libc.so.6 => /lib/libc.so.6 (0xf735b000)
>         /lib/ld-linux.so.2 (0xf76ec000)
>         libpthread.so.0 => /lib/libpthread.so.0 (0xf7340000)
> Floating point exception

This DSO is bad in several ways:

[hjl@gnu-6 pr14370]$ readelf -d pr14370.so 

Dynamic section at offset 0x953e0 contains 29 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libsds_server.so]
 0x00000001 (NEEDED)                     Shared library: [libxtor_threads_boost.so]
 0x00000001 (NEEDED)                     Shared library: [libaldecpli.so]
 0x00000001 (NEEDED)                     Shared library: [libsvdpi_exp.so]
 0x00000001 (NEEDED)                     Shared library: [libstdc++.so.6]
 0x00000001 (NEEDED)                     Shared library: [libsce_mi.so]
 0x00000001 (NEEDED)                     Shared library: [libScemiDpiBridgeApi.so]
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libgcc_s.so.1]
 0x0000000c (INIT)                       0x1de28
 0x0000000d (FINI)                       0x794d4
 0x00000004 (HASH)                       0xd4
 0x00000005 (STRTAB)                     0x6558
 0x00000006 (SYMTAB)                     0x21d8
 0x0000000a (STRSZ)                      15815 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000003 (PLTGOT)                     0x955c0
 0x00000002 (PLTRELSZ)                   856 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x1dad0
 0x00000011 (REL)                        0xabf0
 0x00000012 (RELSZ)                      77536 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x00000016 (TEXTREL)                    0x0
 0x6ffffffe (VERNEED)                    0xab90
 0x6fffffff (VERNEEDNUM)                 2
 0x6ffffff0 (VERSYM)                     0xa320
 0x6ffffffa (RELCOUNT)                   3970
 0x00000000 (NULL)                       0x0
[hjl@gnu-6 pr14370]$ readelf -sWr pr14370.so | grep errno | tail -5
00078dcc  0003a502 R_386_PC32             00033110   __errno_location
00033120  0000ad01 R_386_32               00097254   errno
   173: 00097254     4 OBJECT  GLOBAL DEFAULT   27 errno
   933: 00033110    37 FUNC    WEAK   DEFAULT   10 __errno_location
   959: 00097254     4 OBJECT  GLOBAL DEFAULT   27 _errno
[hjl@gnu-6 pr14370]$ 

1. It wasn't compiled with -fPIC.
2. It wasn't linked against libc.so.
3. It defines errno as int.

A testcase:

[hjl@gnu-6 pr14370]$ cat x.c 
int errno = 3;

extern void foo (void);

int
bar (void)
{
  foo ();
  return errno;
}
[hjl@gnu-6 pr14370]$ cat foo.c
void
foo (void)
{
}
[hjl@gnu-6 pr14370]$  gcc -m32 -shared -o libm.so.6 foo.c -fPIC
[hjl@gnu-6 pr14370]$ gcc -m32 x.c -c       
[hjl@gnu-6 pr14370]$ ld -m elf_i386  -shared x.o libm.so.6   
[hjl@gnu-6 pr14370]$ ldd -r a.out  
	linux-gate.so.1 =>  (0xf7ffd000)
	libm.so.6 => /lib/libm.so.6 (0xf7fad000)
	/lib/ld-linux.so.2 (0x56555000)
	libc.so.6 => /lib/libc.so.6 (0xf7dfb000)
/usr/local/bin/ldd: line 118: 17528 Floating point exception LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$@"
[hjl@gnu-6 pr14370]$
Comment 9 H.J. Lu 2012-09-02 22:57:15 UTC
We can add a check like:

diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 33847f0..e09b413 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -388,6 +388,11 @@ elf_machine_rel (struct link_map *map, const Elf32_Rel *reloc,
 	      {
 # ifndef RTLD_BOOTSTRAP
 #  ifndef SHARED
+		if (__builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_NOTYPE,
+				      1)
+		    && __builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_TLS,
+					 1))
+		  goto non_tls;
 		CHECK_STATIC_TLS (map, sym_map);
 #  else
 		if (!TRY_STATIC_TLS (map, sym_map))
@@ -418,6 +423,11 @@ elf_machine_rel (struct link_map *map, const Elf32_Rel *reloc,
 	     block we subtract the offset from that of the TLS block.  */
 	  if (sym != NULL)
 	    {
+	      if (__builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_NOTYPE,
+				    1)
+		  && __builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_TLS,
+				       1))
+		goto non_tls;
 	      CHECK_STATIC_TLS (map, sym_map);
 	      *reloc_addr += sym_map->l_tls_offset - sym->st_value;
 	    }
@@ -433,6 +443,11 @@ elf_machine_rel (struct link_map *map, const Elf32_Rel *reloc,
 	     thread pointer.  */
 	  if (sym != NULL)
 	    {
+	      if (__builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_NOTYPE,
+				    1)
+		  && __builtin_expect (ELFW(ST_TYPE) (sym->st_info) != STT_TLS,
+				    1))
+		goto non_tls;
 	      CHECK_STATIC_TLS (map, sym_map);
 	      *reloc_addr += sym->st_value - sym_map->l_tls_offset;
 	    }
@@ -476,6 +491,20 @@ elf_machine_rel (struct link_map *map, const Elf32_Rel *reloc,
 	  break;
 # endif	/* !RTLD_BOOTSTRAP */
 	}
+
+# ifndef RTLD_BOOTSTRAP
+      return;
+
+non_tls:
+	{
+	  const char *strtab;
+	  strtab = (const char *) D_PTR (map, l_info[DT_STRTAB]);
+	  _dl_error_printf ("\
+			    reloc type 0x%x against non-TLS symbol `%s' in %s\n",
+			    r_type, strtab + refsym->st_name,
+			    rtld_progname ?: "<program name unknown>");
+	}
+# endif
     }
 }
Comment 10 Rich Felker 2012-09-03 02:54:32 UTC
Does this issue really need to be addressed at all? It seems like the crash has been shown to be caused by malformed shared libraries.
Comment 11 Carlos O'Donell 2012-09-03 13:38:44 UTC
(In reply to comment #10)
> Does this issue really need to be addressed at all? It seems like the crash has
> been shown to be caused by malformed shared libraries.

Rich,

I agree. I'd rather not slow down the dynamic linker. I feel like this kind of check needs to be in the static linker where it can issue a warning or throw an error that the shared library you are building is broken.

H.J.,

Thank you very much for the triage! Shall we mark this RESOLVED INVALID?
Comment 12 Pawel Sikora 2012-09-03 15:28:52 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Does this issue really need to be addressed at all? It seems like the crash has
> > been shown to be caused by malformed shared libraries.
> 
> Rich,
> 
> I agree. I'd rather not slow down the dynamic linker. I feel like this kind of
> check needs to be in the static linker where it can issue a warning or throw an
> error that the shared library you are building is broken.
> 
> H.J.,
> 
> Thank you very much for the triage! Shall we mark this RESOLVED INVALID?

please no, i'd like to see a binutils/ld runtime error at least for such cases.
Comment 13 H.J. Lu 2012-09-03 18:59:16 UTC
(In reply to comment #12)
> > 
> > Thank you very much for the triage! Shall we mark this RESOLVED INVALID?
> 
> please no, i'd like to see a binutils/ld runtime error at least for such cases.

ld does generate a link-time error if libc.so is used for
linking:

[hjl@gnu-6 pr14370]$ cat x.c
int errno = 3;

int
bar (void)
{
  return errno;
}
[hjl@gnu-6 pr14370]$ make libfoo.so
gcc -m32    -c -o x.o x.c
./ld -m elf_i386 -shared -o libfoo.so x.o -L/usr/lib -lm
[hjl@gnu-6 pr14370]$ make libbar.so
./ld -m elf_i386 -shared -o libbar.so x.o -L/usr/lib -lc
./ld: errno: TLS definition in /lib/libc.so.6 section .tbss mismatches non-TLS definition in x.o section .data
/lib/libc.so.6: could not read symbols: Bad value
make: *** [libbar.so] Error 1
[hjl@gnu-6 pr14370]$
Comment 14 Andreas Jaeger 2012-09-03 20:07:03 UTC
Since there are proper warnings by the tools, let's close this as invalid.
Comment 15 H.J. Lu 2012-09-04 19:09:38 UTC
I don't think ld.so should crash on a bad DSO built with an
old/bad glibc/binutils:

[hjl@gnu-6 pr14370]$ cat x.c
#if 0
#include <errno.h>
#else
int errno = 3;
#endif

int
bar (void)
{
  errno = 4;
  return errno;
}
[hjl@gnu-6 pr14370]$ cat main.c 
#include <stdio.h>
#include <dlfcn.h>

int
main ()
{
  void *handle;
  int (*func)();

  handle = dlopen ("./libfoo.so", RTLD_LAZY);

  if (!handle)
    {
      fprintf (stderr, "%s\n", dlerror());
      return 1;
    }

  func = dlsym (handle, "bar");
  if (func == NULL)
    {
      fprintf (stderr, "%s\n", dlerror());
      return 1;
    }

  printf ("errno: %d\n", func ());

  dlclose (handle);

  return 0;
}
[hjl@gnu-6 pr14370]$ make GLIBC-DIR= run.dynamic
gcc -m32    -c -o main.o main.c
gcc -m32 -o dynamic main.o -ldl
gcc -m32    -c -o x.o x.c
./ld -m elf_i386 -shared -o libfoo.so x.o -lm
./dynamic
make: *** [run.dynamic] Segmentation fault
[hjl@gnu-6 pr14370]$
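
For contrast, here is a sketch of the well-formed variant of x.c, i.e. the `#if 0` branch taken instead of the private definition. Nothing in it is new beyond that switch. With `<errno.h>`, glibc provides `errno` as a macro that resolves to per-thread storage, so the object carries no conflicting non-TLS `errno` definition.

```c
#include <errno.h>

/* Same function as in x.c above, but using the errno that the C
   library declares instead of defining a global int named errno.  */
int
bar (void)
{
  errno = 4;
  return errno;
}
```

Built this way, the shared object references the library's errno machinery rather than competing with it, so the mismatch that triggers the crash should not arise.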
Comment 16 Carlos O'Donell 2012-09-04 19:40:46 UTC
(In reply to comment #15)
> I don't think ld.so should crash on a bad DSO built with an
> old/bad glibc/binutils:

What is the performance impact of adding the checks?
Comment 17 H.J. Lu 2012-09-04 21:32:31 UTC
Created attachment 6624
A patch

With the patch, I got

[hjl@gnu-6 pr14370]$ LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes LD_LIBRARY_VERSION=6   ./ld.so ./pr14370.so 
	linux-gate.so.1 (0xf7ffd000)
	libsds_server.so => not found
	libxtor_threads_boost.so => not found
	libaldecpli.so => not found
	libsvdpi_exp.so => not found
	libstdc++.so.6 => /lib/libstdc++.so.6 (0xf7e57000)
	libsce_mi.so => not found
	libScemiDpiBridgeApi.so => not found
	libm.so.6 => /lib/libm.so.6 (0xf7e2c000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7e0f000)
	libc.so.6 => /lib/libc.so.6 (0xf7c5c000)
	./ld.so (0x56555000)
TLS definition `errno' mismatches non-TLS reference in ./pr14370.so	(/lib/libm.so.6)
undefined symbol: svSetScope	(./pr14370.so)
undefined symbol: svGetScope	(./pr14370.so)
undefined symbol: svGetCallerInfo	(./pr14370.so)
undefined symbol: svGetNameFromScope	(./pr14370.so)
undefined symbol: svGetScopeFromName	(./pr14370.so)
undefined symbol: DpiApiSetCallerInfo	(./pr14370.so)
undefined symbol: DpiApiSvSetScope	(./pr14370.so)
undefined symbol: CloseServerSession	(./pr14370.so)
undefined symbol: DpiApiAddImportXtor	(./pr14370.so)
undefined symbol: DpiApiRegisterSynchronizationEventXtor	(./pr14370.so)
undefined symbol: DpiApiSvGetUserData	(./pr14370.so)
undefined symbol: _ZN11xtorThreads9PostEventEPv	(./pr14370.so)
undefined symbol: DpiApiRegisterGroupingImportXtor	(./pr14370.so)
undefined symbol: _ZN11xtorThreads13InitThreadLibENS_8thread_tE	(./pr14370.so)
undefined symbol: DpiApiSvGetScopeFromName	(./pr14370.so)
undefined symbol: DpiApiSetSvScalarVal	(./pr14370.so)
undefined symbol: DpiApiExitHardwareSide	(./pr14370.so)
undefined symbol: DpiApiGetSvBitVecVal	(./pr14370.so)
undefined symbol: DpiApiGetSvScalarVal	(./pr14370.so)
undefined symbol: DpiApiSvGetCallerInfo	(./pr14370.so)
undefined symbol: DpiApiEnterHardwareSide	(./pr14370.so)
undefined symbol: DpiApiInitialize	(./pr14370.so)
undefined symbol: InitialiseDebugAndTrace	(./pr14370.so)
undefined symbol: StopConnectionManager	(./pr14370.so)
undefined symbol: _ZN11xtorThreads11CreateEventEPPvNS_8thread_tE	(./pr14370.so)
undefined symbol: StartConnectionManager	(./pr14370.so)
undefined symbol: DpiApiSvPutUserData	(./pr14370.so)
undefined symbol: DpiApiRunExportXtor	(./pr14370.so)
undefined symbol: DpiApiAddScopePath	(./pr14370.so)
undefined symbol: DpiApiRegisterExportXtor	(./pr14370.so)
undefined symbol: DpiApiSvGetNameFromScope	(./pr14370.so)
undefined symbol: DpiApiSvGetScope	(./pr14370.so)
undefined symbol: OpenServerSession	(./pr14370.so)
undefined symbol: DpiApiRegisterSynchronizationXtor	(./pr14370.so)
undefined symbol: _ZN11xtorThreads9WaitEventEPv	(./pr14370.so)
undefined symbol: DpiApiSetSvBitVecVal	(./pr14370.so)
[hjl@gnu-6 pr14370]$ cat x.c
#if 0
#include <errno.h>
#else
int errno = 3;
#endif

int
bar (void)
{
  errno = 4;
  return errno;
}
[hjl@gnu-6 pr14370]$ cat main.c 
#include <stdio.h>
#include <dlfcn.h>

int
main ()
{
  void *handle;
  int (*func)();

  handle = dlopen ("./libfoo.so", RTLD_LAZY);

  if (!handle)
    {
      fprintf (stderr, "%s\n", dlerror());
      return 1;
    }

  func = dlsym (handle, "bar");
  if (func == NULL)
    {
      fprintf (stderr, "%s\n", dlerror());
      return 1;
    }

  printf ("errno: %d\n", func ());

  dlclose (handle);

  return 0;
}
[hjl@gnu-6 pr14370]$ make run.dynamic
gcc -m32    -c -o main.o main.c
gcc -m32 -L. -nostdlib -nostartfiles -o dynamic \
-Wl,-dynamic-linker=/export/build/gnu/glibc-32bit/build-i686-linux/elf/ld-linux.so.2 \
-Wl,-z,nocombreloc \
/export/build/gnu/glibc-32bit/build-i686-linux/csu/crt1.o /export/build/gnu/glibc-32bit/build-i686-linux/csu/crti.o \
`gcc -m32 --print-file-name=crtbegin.o` \
main.o /export/build/gnu/glibc-32bit/build-i686-linux/dlfcn/libdl.so -Wl,-rpath,. \
-Wl,-rpath=/export/build/gnu/glibc-32bit/build-i686-linux:/export/build/gnu/glibc-32bit/build-i686-linux/dlfcn:/export/build/gnu/glibc-32bit/build-i686-linux/rt:/export/build/gnu/glibc-32bit/build-i686-linux/nptl \
/export/build/gnu/glibc-32bit/build-i686-linux/elf/ld-linux.so.2 \
/export/build/gnu/glibc-32bit/build-i686-linux/libc.so.6 /export/build/gnu/glibc-32bit/build-i686-linux/libc_nonshared.a \
-lgcc -lgcc_eh `gcc -m32 --print-file-name=crtend.o` \
/export/build/gnu/glibc-32bit/build-i686-linux/csu/crtn.o
gcc -m32    -c -o x.o x.c
./ld -m elf_i386 -shared -o libfoo.so x.o
./dynamic
./libfoo.so: non-TLS definition `errno' mismatches TLS reference in /export/build/gnu/glibc-32bit/build-i686-linux/libc.so.6
make: *** [run.dynamic] Error 1
[hjl@gnu-6 pr14370]$ 

Here are before and after timings of "make all" and "make check".

1. On x32:

Before:

598.89user 101.05system 2:26.71elapsed 477%CPU (0avgtext+0avgdata 113240maxresident)k
986.60user 556.24system 22:57.75elapsed 111%CPU (0avgtext+0avgdata 1048804maxresident)k

After:

608.99user 99.81system 2:17.10elapsed 516%CPU (0avgtext+0avgdata 113240maxresident)k
988.58user 553.93system 22:56.45elapsed 112%CPU (0avgtext+0avgdata 1048832maxresident)k

2. On x86-64:

Before:

526.71user 93.03system 2:15.57elapsed 457%CPU (0avgtext+0avgdata 116028maxresident)k
974.58user 544.92system 23:13.73elapsed 109%CPU (0avgtext+0avgdata 1048760maxresident)k

After:

533.58user 91.94system 2:05.10elapsed 500%CPU (0avgtext+0avgdata 116028maxresident)k
977.58user 547.24system 23:05.13elapsed 110%CPU (0avgtext+0avgdata 1048920maxresident)k

3. On ia32:

Before:

458.18user 84.22system 1:59.10elapsed 455%CPU (0avgtext+0avgdata 119016maxresident)k
921.42user 522.70system 21:57.64elapsed 109%CPU (0avgtext+0avgdata 1048888maxresident)k

After:

465.45user 85.70system 1:53.79elapsed 484%CPU (0avgtext+0avgdata 119016maxresident)k
920.35user 525.40system 21:57.62elapsed 109%CPU (0avgtext+0avgdata 1048904maxresident)k
Comment 18 Carlos O'Donell 2012-09-04 22:03:15 UTC
(In reply to comment #17)
> Here are before and after timings of "make all" and "make check".

H.J., Thanks for working on this...

This looks like your patch *improved* performance, of which I'm skeptical because it adds more instructions into the _dl_symbol_lookup_x hot path.

Are the build times from a cold reboot or after a "flush" to ensure the previous build didn't change the numbers?

I don't like that ld.so crashes, and I'm not sold that this is the best solution, but it certainly makes the dynamic linker more rugged and perhaps avoids what could be used as an attack vector for a security breach.

Please post this to libc-alpha for wide review.
Comment 19 Rich Felker 2012-09-04 22:15:31 UTC
There is an infinite family of ways a bad/invalid/malicious/corrupt DSO can cause the dynamic linker to crash. Any bug of the form "dynamic linker crashes when fed an invalid DSO" should be considered invalid unless there's a strong argument to the contrary. The DSO in question was not built with old binutils or glibc; it was built with nonsensical, nondefault options and symbols that conflict with symbols in the standard library, which is a very bad form of undefined behavior.
Comment 20 H.J. Lu 2012-09-04 22:22:26 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > Here are before and after timings of "make all" and "make check".
> 
> H.J., Thanks for working on this...
> 
> This looks like your patch *improved* performance, of which I'm skeptical
> because it adds more instructions into the _dl_symbol_lookup_x hot path.

The performance impact is too small to be visible unless most
of time is spent in _dl_symbol_lookup_x.


> Are the build times from a cold reboot or after a "flush" to ensure the
> previous build didn't change the numbers?

No, I just built them twice without reboot.

> I don't like that ld.so crashes, and I'm not sold that this is the best
> solution, but it certainly makes the dynamic linker more rugged and perhaps
> avoids what could be used as an attack vector for a security breach.

Same here.

> Please post this to libc-alpha for wide review.

Done.
Comment 21 Rich Felker 2012-09-04 23:26:07 UTC
Malformed DSOs are not a potential attack vector for security breach because you must already 100% trust a DSO before you can load it. A malicious DSO can run arbitrary code in its constructors before dlopen returns; there is no way around this. There should be a number of other attack vectors related to invalid pointers in the various tables used for relocations and symbol resolution.
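
The constructor point can be illustrated with a minimal sketch. It assumes GCC or Clang, since `__attribute__ ((constructor))` is a compiler extension, and the names `on_load`, `constructor_ran` and `constructor_has_run` are hypothetical. The same function body placed in a shared object would run inside dlopen, before the caller gets a handle back.

```c
#include <stdio.h>

/* Hypothetical flag, used here only to make the effect observable.  */
static int constructor_ran = 0;

/* GCC/Clang extension: the function runs when the object is loaded,
   before main() for an executable, before dlopen() returns for a DSO.  */
__attribute__ ((constructor))
static void
on_load (void)
{
  constructor_ran = 1;
  fputs ("constructor ran at load time\n", stderr);
}

/* Lets callers observe that on_load has already executed.  */
int
constructor_has_run (void)
{
  return constructor_ran;
}
```

Because this code executes without any symbol being looked up by the caller, validating relocations alone cannot make loading an untrusted DSO safe.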
Comment 22 Paul Pluzhnikov 2012-09-05 15:24:05 UTC
(In reply to comment #20)

> The performance impact is too small to be visible unless most
> of time is spent in _dl_symbol_lookup_x.

For a large application with 1000+ shared libraries (which is how we run unit tests here), time-to-main often *is* dominated by _dl_symbol_lookup_x.

"make && make check" are about the worst possible way to measure the impact -- they invoke small programs with (relatively) very few dynamic symbol resolutions.

Measuring startup time for Firefox or OpenOffice would (I expect) be much more meaningful.
Comment 23 H.J. Lu 2012-09-05 15:29:56 UTC
(In reply to comment #22)
> (In reply to comment #20)
> 
> > The performance impact is too small to be visible unless most
> > of time is spent in _dl_symbol_lookup_x.
> 
> For a large application with 1000+ shared libraries (which is how we run unit
> tests here), time-to-main often *is* dominated by _dl_symbol_lookup_x.

Do you have a setup to simulate this with the newly built glibc
without installing it?

> "make && make check" are about the worst possible way to measure the impact --
> they invoke small programs with (relatively) very few dynamic symbol
> resolutions.

Understood.  But it is the easiest way to test the newly built glibc.

> Measuring startup time for Firefox or OpenOffice would (I expect) be much more
> meaningful.

This isn't practical for me.
Comment 24 Ondrej Bilka 2013-10-02 22:44:01 UTC
What about a test like this one, which generates 1000 trivial dependencies?

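# Generate 1000 trivial shared libraries, each exporting one function.
echo 'int TEST(){return 42;}' > test.c
echo 'int main(){return 42;}' > main.c
DL=
for I in `seq 1 1000`; do
  gcc test.c -DTEST=test$I -fpic -shared -o libtest$I.so
  DL="$DL -ltest$I"
done
# Link main against all of them; run with LD_LIBRARY_PATH=. so they are found.
gcc main.c -L. $DL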