Bug 19826 - invalid pointer returned from __tls_get_addr with static linking
Summary: invalid pointer returned from __tls_get_addr with static linking
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: 2.23
: P2 normal
Target Milestone: 2.25
Assignee: Alexandre Oliva
URL:
Keywords:
: 19303 (view as bug list)
Depends on:
Blocks: 17090 17620 17621 17628
  Show dependency treegraph
 
Reported: 2016-03-15 13:18 UTC by cbaylis
Modified: 2017-07-27 12:06 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description cbaylis 2016-03-15 13:18:36 UTC
[ created from http://gcc.gnu.org/PR68802 ]

The following program fails on ARM when statically linked:

#include <cstdio>
#include <thread>

__thread int __attribute__ ((tls_model ("global-dynamic"))) x = 10;

class Thread {
  public:
    void operator()(){
      fprintf(stderr, "testing (%i) ...\n", x);
    }
};

int main(void){
  Thread t;
  std::thread th(std::ref(t));
  th.join();
  return 0;
}

Compiling:
$ arm-unknown-linux-gnueabihf-g++ -o test_static -Wall -Wl,--whole-archive -lpthread -Wl,--no-whole-archive -static tsimple.c

when run on a raspberry pi:
$ ./test_static
Segmentation fault
Comment 1 Andreas Schwab 2016-09-15 14:50:42 UTC
f8aeae347377f3dfa8cbadde057adf1827fb1d44 is the first bad commit
commit f8aeae347377f3dfa8cbadde057adf1827fb1d44
Author: Alexandre Oliva <aoliva@redhat.com>
Date:   Tue Mar 17 01:14:11 2015 -0300

    Fix DTV race, assert, DTV_SURPLUS Static TLS limit, and nptl_db garbage
    
    for  ChangeLog
    
            [BZ #17090]
            [BZ #17620]
            [BZ #17621]
            [BZ #17628]
Comment 2 Andreas Schwab 2016-09-15 14:52:18 UTC
Plain C testcase:

#include <stdio.h>
#include <pthread.h>

__thread int __attribute__ ((tls_model ("global-dynamic"))) x = 10;

void *
th (void *p)
{
  printf ("x = %d\n", x);
  return 0;
}

int
main (void)
{
  printf ("x = %d\n", x);
  pthread_t t;
  pthread_create (&t, 0, th, 0);
  pthread_join (t, 0);
}
Comment 3 Alexandre Oliva 2016-09-16 16:22:43 UTC
Thanks for the report, I'm looking into it.

The relevant question to ask IMHO is why linker relaxations are not transforming the GD accesses into LE as expected.  AFAIK GD is not expected to be functional in a fully-static environment; because of relaxations, it is only functional after dlopen and, even then, maybe only from the loaded modules themselves, because all TLS accesses in the executable are supposed to be relaxed to LE.

Anyway, I'll see whether we can remove this assumption and make it work as it used to.  Unfortunately, my access to the ARM platform is somewhat limited; can you confirm that adding an explicit successful dlopen call to the program before the GD access fixes it?  Perhaps any internal dlopening (nss, locale) would suffice.
Comment 4 Alexandre Oliva 2016-09-16 17:13:43 UTC
How do you even get references to __tls_get_addr in a static program?  I don't see that the symbol is defined in any of the static libraries!  I wonder if it's ARM-specific, or whether the call is relaxed, after all, just not correctly.
Comment 5 jsm-csl@polyomino.org.uk 2016-09-16 18:01:13 UTC
On Fri, 16 Sep 2016, aoliva at sourceware dot org wrote:

> How do you even get references to __tls_get_addr in a static program?  I don't
> see that the symbol is defined in any of the static libraries!  I wonder if
> it's ARM-specific, or whether the call is relaxed, after all, just not
> correctly.

See sysdeps/arm/libc-tls.c.

I don't know if this bug is the same problem as nptl/tst-cancel24-static 
failing, but that was traced back to the same commit some time ago 
<https://sourceware.org/ml/libc-alpha/2015-11/msg00529.html>, and appears 
on lots of architectures as can be seen in the release test results on the 
wiki.
Comment 6 Alexandre Oliva 2016-09-17 00:39:21 UTC
Thanks, Joseph, I see that a number of architectures that use this such a custom tls_get_addr that doesn't update the DTV.  I can see how this would not work well after I removed the DTV initialization code from both elf/dl-reloc.c and nptl/allocatestack.c, that didn't seem to be needed any more, and that in some cases could cause data races.  Oops.

Fortunately, the potential race seems to no longer be present.  At least I don't immediately see how the early initialization of the static entries of the DTV would cause a race, with the new structure of the DTV and the way it is updated, but I haven't thought it through yet.

This patch restores the early initialization, which should make static TLS work again on the affected platforms.  The alternative would be to bring DTV-updating abilities to every one of the custom versions of tls_get_addr, which is IMHO undesirable.  Could anyone with easy access to ARM or other affected platforms please give this a spin and let me know how it goes?  I've only regression-tested it on x86_64-linux-gnu so far, and at least there this logical reversal (because the data structure changed, it's not literal) doesn't cause any further regressions.  Obviously, it doesn't prove the problem is fixed either, since x86_64 didn't trigger it to begin with.

Thanks in advance,

diff --git a/elf/dl-reloc.c b/elf/dl-reloc.c
index 42bddc1..dcab666 100644
--- a/elf/dl-reloc.c
+++ b/elf/dl-reloc.c
@@ -137,6 +137,12 @@ _dl_nothread_init_static_tls (struct link_map *map)
 # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 #endif
 
+  /* Fill in the DTV slot so that a later LD/GD access will find it.  */
+  dtv_t *dtv = THREAD_DTV ();
+  assert (map->l_tls_modid <= dtv[-1].counter);
+  dtv[map->l_tls_modid].pointer.to_free = NULL;
+  dtv[map->l_tls_modid].pointer.val = dest;
+
   /* Initialize the memory.  */
   memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size),
          '\0', map->l_tls_blocksize - map->l_tls_initimage_size);
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 60b34dc..e49005c 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -1199,6 +1199,7 @@ __nptl_setxid (struct xid_command *cmdp)
 static inline void __attribute__((always_inline))
 init_one_static_tls (struct pthread *curp, struct link_map *map)
 {
+  dtv_t *dtv = GET_DTV (TLS_TPADJ (curp));
 # if TLS_TCB_AT_TP
   void *dest = (char *) curp - map->l_tls_offset;
 # elif TLS_DTV_AT_TP
@@ -1207,9 +1208,14 @@ init_one_static_tls (struct pthread *curp, struct link_map *map)
 #  error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 # endif
 
-  /* We cannot delay the initialization of the Static TLS area, since
-     it can be accessed with LE or IE, but since the DTV is only used
-     by GD and LD, we can delay its update to avoid a race.  */
+  /* Fill in the DTV slot so that a later LD/GD access will find it.
+     A number of platforms require at least the main executable's DTV
+     to be set up for static TLS to work, because they don't require
+     linker relaxations.  */
+  dtv[map->l_tls_modid].pointer.to_free = NULL;
+  dtv[map->l_tls_modid].pointer.val = dest;
+
+  /* Initialize the memory.  */
   memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size),
          '\0', map->l_tls_blocksize - map->l_tls_initimage_size);
 }
Comment 7 Andreas Schwab 2016-09-17 13:38:47 UTC
That patch doesn't help.
Comment 8 Andreas Schwab 2016-09-17 16:32:10 UTC
Neither of the two functions is actually called in the test case.  Any fixes need to be made to _dl_allocate_tls_init.
Comment 9 Alexandre Oliva 2016-09-20 06:37:27 UTC
Err...  I don't see where _dl_allocate_tls_init is called either.  AFAICT it is __libc_setup_tls (csu/libc-tls.c) that sets up the DTV in static programs.  The weirdness I see is that it appears to set up DTV[2] with the main program's TLS static block pointer, but then __tls_get_addr appears to access DTV[1].
I'm still trying to create a static binary linked with current glibc that will run on either my phone (Neo FreeRunner) or my BBB, but I'm not having much luck so far.  Maybe I'll resort to the Yeeloong: it seems like the same problem occurs on MIPS.
Comment 10 Andreas Schwab 2016-09-20 07:56:11 UTC
You can also run it on a aarch64 system, there are a few on the GCC compile farm.

#0  _dl_allocate_tls_init (result=0xf77ff7c0) at dl-tls.c:476
#1  0x000129fc in allocate_stack (stack=<synthetic pointer>, 
    pdp=<synthetic pointer>, attr=<optimized out>) at allocatestack.c:584
#2  __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, 
    start_routine=<optimized out>, arg=<optimized out>) at pthread_create.c:539
#3  0x00010518 in main () at tls.c:18

sysdeps/arm/nptl/tls.h:
/* Install the dtv pointer.  The pointer passed is to the element with
   index -1 which contain the length.  */
# define INSTALL_DTV(tcbp, dtvp) \
  (((tcbhead_t *) (tcbp))->dtv = (dtvp) + 1)
Comment 11 Alexandre Oliva 2016-09-20 16:18:59 UTC
aarch64 was the first thing I tried, but the compile farm hosts won't let me in.  It seems like the old key types with which I registered are no longer acceptable to either my local ssh client or their ssh server, and my newer keys are not there, so I can't use them yet.  I sent an update request last week, but I haven't got a response yet.

As for _dl_tls_allocate_init, thanks for the backtrace.  I was expecting it a lot earlier than that.  The report didn't make it clear the problem was in the separate thread, rather than in the main one.
Comment 12 Alexandre Oliva 2016-09-20 16:48:57 UTC
Given all the information I got so far, it looks like this will fix it:

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 17567ad..60f4c1d 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -538,6 +538,10 @@ _dl_allocate_tls_init (void *result)
 # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 #endif
 
+         /* Set up the DTV entry.  The simplified __tls_get_addr that
+            some platforms use in static programs requires it.  */
+         dtv[map->l_tls_modid].pointer.val = dest;
+
          /* Copy the initialization image and clear the BSS part.  */
          memset (__mempcpy (dest, map->l_tls_initimage,
                             map->l_tls_initimage_size), '\0',
Comment 13 Andreas Schwab 2016-09-20 20:54:56 UTC
That fixes the issue.
Comment 14 Sourceware Commits 2016-09-22 01:04:08 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  17af5da98cd2c9ec958421ae2108f877e0945451 (commit)
      from  444eacba82f675d4657ad55da67b355536be90ab (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=17af5da98cd2c9ec958421ae2108f877e0945451

commit 17af5da98cd2c9ec958421ae2108f877e0945451
Author: Alexandre Oliva <aoliva@redhat.com>
Date:   Wed Sep 21 22:01:16 2016 -0300

    [PR19826] fix non-LE TLS in static programs
    
    An earlier fix for TLS dropped early initialization of DTV entries for
    modules using static TLS, leaving it for __tls_get_addr to set them
    up.  That worked on platforms that require the GD access model to be
    relaxed to LE in the main executable, but it caused a regression on
    platforms that allow GD in the main executable, particularly in
    statically-linked programs: they use a custom __tls_get_addr that does
    not update the DTV, which fails when the DTV early initialization is
    not performed.
    
    In static programs, __libc_setup_tls performs the DTV initialization
    for the main thread, but the DTV of other threads is set up in
    _dl_allocate_tls_init, so that's the fix that matters.
    
    Restoring the initialization in the remaining functions modified by
    this patch was just for uniformity.  It's not clear that it is ever
    needed: even on platforms that allow GD in the main executable, the
    dynamically-linked version of __tls_get_addr would set up the DTV
    entries, even for static TLS modules, while updating the DTV counter.
    
    for  ChangeLog
    
    	[BZ #19826]
    	* elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early
    	initialization of static TLS entries.
    	* elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise.
    	* nptl/allocatestack.c (init_one_static_tls): Likewise.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog            |    8 ++++++++
 elf/dl-reloc.c       |    6 ++++++
 elf/dl-tls.c         |    4 ++++
 nptl/allocatestack.c |    9 ++++++---
 4 files changed, 24 insertions(+), 3 deletions(-)
Comment 15 Alexandre Oliva 2016-09-22 01:05:09 UTC
Fixed
Comment 16 jsm-csl@polyomino.org.uk 2016-09-22 01:38:33 UTC
Please set the target milestone (to the first mainline version with the 
fix) when marking a bug as fixed.
Comment 17 Alexandre Oliva 2016-09-23 17:08:02 UTC
Thanks for the reminder.  It would be nice of bugzilla helped enforce this requirement somehow; it seems like the sort of thing that it would be desirable to automate in the interest of reducing mistakes.
Comment 18 Florian Weimer 2017-07-27 12:06:46 UTC
*** Bug 19303 has been marked as a duplicate of this bug. ***