This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [BZ #19329] Fix race during concurrent dlopen and pthread_create


* Ilya Palachev <i.palachev@samsung.com> [2015-12-29 13:38:53 +0300]:
> This patch fixes the bug
> https://sourceware.org/bugzilla/show_bug.cgi?id=19329
> that happens when pthread_create and dlopen are run concurrently in two
> parallel threads. In this case the race between two threads can happen as
> follows:
> 

thanks

i have a different fix for this, but even that's not complete.

there are several globals (related to tls) accessed in dlopen and
pthread_create without synchronization, those should be fixed
even if they haven't caused observable problems yet.

>    thread #1                                     thread #2
>       ||                                            ||
>       ..                                            ..
>       ||                                            ||
>       vv                                            vv
>     dlopen                                     pthread_create
>       ||                                            ||
>       ..                                            ..
>       ||                                            ||
>       vv                                            vv
>  dl_open_worker----------------------+  _dl_allocate_tls_init--------+
>  |                                   |  |                            |
>  |  ...                              |  |  ...                       |
>  |  _dl_add_to_slotinfo----------+   |  |                            |
>  |  |                            |   |  |                            |
>  |  |  ...                       |   |  |                            |
>  |  |                            |   |  |                            |
>  |  |  /* Statement #1.  */      |   |  |                            |
>  |  |  listp->slotinfo[idx].gen =|   |  |                            |
>  |  |  GL(dl_tls_generation) + 1;|   |  |                            |
>  |  |                            |   |  |                            |
>  |  |                            |   |  |                            |
>  |  +----------------------------+   |  |                            |
>  |                                   |  |                            |
>  |                                   |  | /* Statement #3.  */       |
>  |  /* The race window is here.  */  |  | assert (                   |
>  |                                   |  | listp->slotinfo[cnt].gen   |
>  |                                   |  | <= GL(dl_tls_generation)); |
>  |                                   |  |                            |
>  |  /* Statement #2.  */             |  |                            |
>  |  ++GL(dl_tls_generation);         |  |                            |
>  |                                   |  |                            |
>  |                                   |  |                            |
>  +-----------------------------------+  +----------------------------+
> 
> The patch changes the code so that to unify statements #1 and #2 in one
> atomic block so that to prevent the race window in which the
> statement #3 can happen.
> 
> The changes have no side effects because the changed check
>    if (any_tls && __builtin_expect (GL(dl_tls_generation) == 0, 0))
> succeeds iff any_tls is true and hence iff _dl_add_to_slotinfo was
> called previously. So it can't happen at startup when
>    GL(dl_tls_generation) == 0.
> The function _dl_add_to_slotinfo is called only from dl_open_worker
> and dl_main in elf/rtld.c where the dependencies of the currently
> executed program are loaded.
> 

this approach changes behaviour:

previously when a new dso was loaded all of its dependencies got
the same generation number: GL(dl_tls_generation)+1

with the patch each loaded dso increments the generation counter.
(so at least the overflow detection needs to be different)

i don't know if this is a problem, but my approach is conservative
and tries to fix the issue in _dl_allocate_tls_init by only
considering tls for which the generation counter is already updated.

(slotinfo etc also has to use atomics to make the changes visible
to other threads)

i can only post my fix after my holiday (sometime on the first week
of jan), but there still seem to be some fundamental issues with the
design (i think these races can be fixed, but the async-signal safe
allocation of tls will require significant changes here and without
that tls access from signal handlers is broken)

> Built and regtested for {x86_64,aarh64}-linux-gnu.
> 
> Signed-off-by: Ilya Palachev <i.palachev@samsung.com>
> 
> 	[BZ #19329]
> 	* elf/dl-open.c (dl_open_worker): Remove the incrementation of global
> 	generation counter from here, since it should happen immediately after
> 	the addition of new TLS module to the slot info list.
> 	* elf/dl-tls.c (_dl_add_to_slotinfo): Add the atomic incrementation of
> 	global generation counter just when the new TLS module is added. This
> 	prevents the race that existed before.
> ---
>  ChangeLog     | 10 ++++++++++
>  elf/dl-open.c | 14 +++++++++++---
>  elf/dl-tls.c  |  7 ++++++-
>  3 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/ChangeLog b/ChangeLog
> index 16e605a..3d7b007 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,13 @@
> +2015-12-29  Ilya Palachev  <i.palachev@samsung.com>
> +
> +	[BZ #19329]
> +	* elf/dl-open.c (dl_open_worker): Remove the incrementation of global
> +	generation counter from here, since it should happen immediately after
> +	the addition of new TLS module to the slot info list.
> +	* elf/dl-tls.c (_dl_add_to_slotinfo): Add the atomic incrementation of
> +	global generation counter just when the new TLS module is added. This
> +	prevents the race that existed before.
> +
>  2015-12-21  Florian Weimer  <fweimer@redhat.com>
>  
>  	[BZ #19182]
> diff --git a/elf/dl-open.c b/elf/dl-open.c
> index 5429d18..a0eff3b 100644
> --- a/elf/dl-open.c
> +++ b/elf/dl-open.c
> @@ -514,7 +514,9 @@ dl_open_worker (void *a)
>  	      && first_static_tls == new->l_searchlist.r_nlist)
>  	    first_static_tls = i;
>  
> -	  /* We have to bump the generation counter.  */
> +	  /* At least one new module with TLS has been loaded. This is the
> +	     only place where the value of any_tls becomes true, and it
> +	     happen when and only when the _dl_add_to_slotinfo was called.  */
>  	  any_tls = true;
>  	}
>  
> @@ -523,8 +525,14 @@ dl_open_worker (void *a)
>  	_dl_show_scope (imap, from_scope);
>      }
>  
> -  /* Bump the generation number if necessary.  */
> -  if (any_tls && __builtin_expect (++GL(dl_tls_generation) == 0, 0))
> +  /* Check the TLS generation counter integer overflow if at least one new
> +     module with TLS has been loaded. The condition holds iff any_tls is
> +     true. It can happen only iff _dl_add_to_slotinfo was called (see above
> +     code). And _dl_add_to_slotinfo always increments the value of
> +     GL(dl_tls_generation) atomically (it's done to prevent the race).
> +     That's why the fatal error does not happen in case when
> +     GL(dl_tls_generation) == 0 before any TLS module load.  */
> +  if (any_tls && __builtin_expect (GL(dl_tls_generation) == 0, 0))
>      _dl_fatal_printf (N_("\
>  TLS generation counter wrapped!  Please report this."));
>  
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 20c7e33..3cde943 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -943,5 +943,10 @@ cannot create TLS data structures"));
>  
>    /* Add the information into the slotinfo data structure.  */
>    listp->slotinfo[idx].map = l;
> -  listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
> +  /* Atomically increment the global TLS generation counter and set the
> +     generation counter of new TLS module to this updated value. This
> +     helps to prevent the race that happens if somebody tries to read
> +     the dl_tls_generation when listp->slotinfo[idx].gen is already
> +     incremented while GL(dl_tls_generation) was not.  */
> +  listp->slotinfo[idx].gen = atomic_increment_val(&GL(dl_tls_generation));
>  }
> -- 
> 2.1.3


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]