This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PR19826] fix non-LE TLS in static programs


On Sep 21, 2016, Andreas Schwab <schwab@suse.de> wrote:

> On Sep 21 2016, Alexandre Oliva <aoliva@redhat.com> wrote:
>> [BZ #19826]
>> * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early
>> initialization of static TLS entries.
>> * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise.
>> * nptl/allocatestack.c (init_one_static_tls): Likewise.

> Ok.

No, it's not!  :-)

:-(


We'll need something along these lines, to fix the above, and to avoid
making the same mistake again.

I'm yet to test it myself, but I understand you'd already tested on ARM
the combination of the previous patch and this reversal, minus the
comments added below, i.e., the elf/dl-tls.c changes above.  Is that so?

Once I'm done with the testing, is this ok to install on top of the
previous patch?


[PR19826] revert needlessly reintroduced race, explain it

Along with the proper fix for the problem, a couple of unnecessary DTV
initializations were reintroduced, but they weren't just unnecessary:
one of them introduced a race condition that was alluded to in the
comments.

In this patch, I add a long comment explaining why we can't modify
another threads' DTV while initializing its static TLS block during
dlopen.

I also revert the unnecessary parts of the previous change, removing
the DTV-initializing code from the above, to fix the race, and, for
uniformity, also from the corresponding code that runs in
single-threaded programs, that was harmless as far as I can tell.


for  ChangeLog

	[BZ #19826]
	* nptl/allocatestack.c (init_one_static_tls): Revert the
	previous change and explain the race condition it would
	introduce.
	* elf/dl-reloc.c (_dl_nothread_init_static_tls): Revert
	previous change for uniformity.
---
 elf/dl-reloc.c       |    6 ------
 nptl/allocatestack.c |   36 ++++++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/elf/dl-reloc.c b/elf/dl-reloc.c
index dcab666..42bddc1 100644
--- a/elf/dl-reloc.c
+++ b/elf/dl-reloc.c
@@ -137,12 +137,6 @@ _dl_nothread_init_static_tls (struct link_map *map)
 # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 #endif
 
-  /* Fill in the DTV slot so that a later LD/GD access will find it.  */
-  dtv_t *dtv = THREAD_DTV ();
-  assert (map->l_tls_modid <= dtv[-1].counter);
-  dtv[map->l_tls_modid].pointer.to_free = NULL;
-  dtv[map->l_tls_modid].pointer.val = dest;
-
   /* Initialize the memory.  */
   memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size),
 	  '\0', map->l_tls_blocksize - map->l_tls_initimage_size);
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 3016a2e..38b6530 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -1207,12 +1207,36 @@ init_one_static_tls (struct pthread *curp, struct link_map *map)
 #  error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 # endif
 
-  /* Fill in the DTV slot so that a later LD/GD access will find it.  */
-  dtv_t *dtv = GET_DTV (TLS_TPADJ (curp));
-  dtv[map->l_tls_modid].pointer.to_free = NULL;
-  dtv[map->l_tls_modid].pointer.val = dest;
-
-  /* Initialize the memory.  */
+  /* We cannot delay the initialization of the Static TLS area, since
+     it can be accessed with LE or IE, but since the DTV is only used
+     by GD and LD (*) we can delay its update to avoid a race (**).
+
+     (*) The main program's DTV of statically-linked programs may be
+     used by a custom __tls_get_addr used on platforms that don't
+     require TLS relaxations, but that one is never initialized by us,
+     since the main program is never dlopened; the main thread's DTV
+     entry for the main program gets initialized in __libc_setup_tls,
+     whereas other threads' corresponding DTV entries are initialized
+     by _dl_allocate_tls_init.
+
+     (**) If we were to write to other threads' DTV entries here, we
+     might race with their DTV resizing.  We might then write to
+     stale, possibly already free()d and reallocated memory.  Or we
+     could write past the end of the still-active DTV.  So, don't do
+     that!
+
+     Another race, not to be mistaken for the above, has to do with
+     the TLS block initialized below.  Although we perform this
+     initialization while holding the rtld lock, once dlopen
+     completes, other threads might run code that accesses the
+     variables in these TLS blocks, without taking the rtld lock or
+     any other explicit memory acquire.
+
+     Our implementation appears to assume that, in order for the other
+     threads to access those variables, there will be some form of
+     synchronization between the end of dlopen() in one thread and the
+     use of the just-loaded modules in others, otherwise the very use
+     of the loaded code could be undefined.  */
   memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size),
 	  '\0', map->l_tls_blocksize - map->l_tls_initimage_size);
 }


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]