
Re: [PATCH v2] Fix double-checked locking in __gconv_get_path and __gconv_read_conf [BZ #22062]


On 10/02/2017 07:31 AM, Arjun Shankar wrote:
> These two functions running in different threads could encounter a data race
> if the CPU or compiler reordered the store of __gconv_path_elem so as to be
> earlier than writes to the content that it points to. This has now been
> corrected by using atomics every time the variable is accessed.

My reviews are structured into three parts: high level, design, and
implementation.

(a) At a high level I see a problem with this change.

All of the callers of __gconv_read_conf do so using __libc_once.

Why can't we just delete all the double checked locking assuming that
__libc_once works correctly?

The use of __libc_once ensures that __gconv_read_conf is called only
once, and that takes into account all threads.
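
For reference, the call sites follow the usual once-init pattern.
A minimal sketch (the once-control name here is illustrative):

  #include <libc-lock.h>

  /* Define the once-control object (a pthread_once_t under NPTL).  */
  __libc_once_define (static, once);

  ...
  /* Runs __gconv_read_conf exactly once across all threads; every
     caller returning from here sees its effects.  */
  __libc_once (once, __gconv_read_conf);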

The only scenario it doesn't take into account is an in-progress
call to __libc_once being interrupted by a dlopen of libpthread that
then starts threads. Those threads might attempt a parallel
initialization. That said, this problem existed *before* the changes
you are making.

Either way, because of the use of __libc_once, the function need not
do any locking or atomics.

Was the __libc_once usage by the callers considered when the patch
was created?

(b) At the design level, the changes look good.

What you've done is a correct fix of DCLP (though at a high level
it might not be needed; we might just delete the code and update
the comment to say it must be called via __libc_once).

I see perhaps one case where an atomic operation is not needed, and
I've commented on that below.
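
For the archives, here is the general DCLP shape the patch implements,
as a standalone C11 sketch. It uses <stdatomic.h> and a pthread mutex
in place of our internal <atomic.h> macros and __libc_lock_*; the
config/make_config names are placeholders and error handling is
omitted:

  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdlib.h>

  struct config { int value; };

  static _Atomic (struct config *) config_ptr;  /* NULL until set.  */
  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

  struct config *
  get_config (void)
  {
    /* Fast path: the acquire load pairs with the release store below,
       so a non-NULL result guarantees the pointed-to data is visible.  */
    struct config *c
      = atomic_load_explicit (&config_ptr, memory_order_acquire);
    if (c == NULL)
      {
        pthread_mutex_lock (&lock);
        /* Re-check under the lock; relaxed is enough because the lock
           orders us after any earlier critical section.  */
        c = atomic_load_explicit (&config_ptr, memory_order_relaxed);
        if (c == NULL)
          {
            c = malloc (sizeof *c);
            c->value = 42;  /* Fully construct the object...  */
            /* ...and only then publish it.  */
            atomic_store_explicit (&config_ptr, c, memory_order_release);
          }
        pthread_mutex_unlock (&lock);
      }
    return c;
  }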

(c) The details of the patch are great, and the comments are good.

> ChangeLog:
> 
> 2017-10-02  Arjun Shankar  <arjun@redhat.com>
> 
> 	[BZ #22062]
> 	* iconv/gconv_conf.c: Include <atomic.h>.
> 	(__gconv_get_path): Access __gconv_path_elem using atomics.
> 	(__gconv_read_conf): Likewise.
> 	(libc_freeres_fn): Likewise.
> ---
> Discussion on PATCH v1:
> https://sourceware.org/ml/libc-alpha/2017-09/msg00779.html
> 
>  iconv/gconv_conf.c | 42 ++++++++++++++++++++++++++++++++----------
>  1 file changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/iconv/gconv_conf.c b/iconv/gconv_conf.c
> index f1c28ce..58fd28e 100644
> --- a/iconv/gconv_conf.c
> +++ b/iconv/gconv_conf.c
> @@ -30,6 +30,7 @@
>  #include <string.h>
>  #include <unistd.h>
>  #include <sys/param.h>
> +#include <atomic.h>

OK.

>  
>  #include <libc-lock.h>
>  #include <gconv_int.h>
> @@ -428,8 +429,13 @@ __gconv_get_path (void)
>  
>    __libc_lock_lock (lock);
>  

All of the callers of __gconv_read_conf invoke it via __libc_once,
which already does the locking (via pthread_once) to ensure
__gconv_read_conf is never run concurrently by another thread.

e.g.
iconv/gconv_db.c:  __libc_once (once, __gconv_read_conf);
iconv/gconv_db.c:  __libc_once (once, __gconv_read_conf);

See my comments above.

I will continue reviewing the DCLP usage below because it has value.

> -  /* Make sure there wasn't a second thread doing it already.  */
> -  result = (struct path_elem *) __gconv_path_elem;
> +  /* __gconv_read_conf uses double-checked locking and therefore can make
> +     a redundant call to this function while another thread is already
> +     executing it.  So first we make sure another thread has not already done
> +     the work we want to do.
> +
> +     A relaxed MO load is sufficient since we already have the lock.  */
> +  result = atomic_load_relaxed (&__gconv_path_elem);

OK. Good, this removes the data race between the unordered read by a thread
in __gconv_read_conf and a write by another thread in __gconv_get_path.

>    if (result == NULL)
>      {
>        /* Determine the complete path first.  */
> @@ -519,7 +525,10 @@ __gconv_get_path (void)
>  	  result[n].len = 0;
>  	}
>  
> -      __gconv_path_elem = result ?: (struct path_elem *) &empty_path_elem;
> +      /* This store synchronizes with the acquire MO load in
> +         __gconv_read_conf.  */
> +      atomic_store_release (&__gconv_path_elem,
> +                            result ?: (struct path_elem *) &empty_path_elem);

OK. Good comment.

>  
>        free (cwd);
>      }
> @@ -538,6 +547,7 @@ __gconv_read_conf (void)
>    size_t nmodules = 0;
>    int save_errno = errno;
>    size_t cnt;
> +  struct path_elem *gconv_path_elem_local;

OK.

>  
>    /* First see whether we should use the cache.  */
>    if (__gconv_load_cache () == 0)
> @@ -549,13 +559,20 @@ __gconv_read_conf (void)
>  
>  #ifndef STATIC_GCONV
>    /* Find out where we have to look.  */
> -  if (__gconv_path_elem == NULL)
> -    __gconv_get_path ();
>  
> -  for (cnt = 0; __gconv_path_elem[cnt].name != NULL; ++cnt)
> +  /* This load is synchronized with the release MO store done during the
> +     initialization of __gconv_path_elem in __gconv_get_path.  */

OK. Perfect. If the acquire load sees NULL, it knows the other thread might
not have completed the work, and so it proceeds to take the lock and check
again (double-checked locking). If it sees non-NULL, it knows the
initializing thread completed all of the work and that the pointed-to
element data is visible.
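
To spell out the happens-before edge (illustrative interleaving; the
array writes are simplified):

  Thread A (__gconv_get_path)         Thread B (__gconv_read_conf)
  ---------------------------         ----------------------------
  result[n].name = ...;  /* fill */
  atomic_store_release
    (&__gconv_path_elem, result);
                                      p = atomic_load_acquire
                                            (&__gconv_path_elem);
                                      /* p != NULL implies all of A's
                                         earlier writes to result[]
                                         are visible to B.  */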

> +  gconv_path_elem_local = atomic_load_acquire (&__gconv_path_elem);
> +  if (gconv_path_elem_local == NULL)
> +    {
> +      __gconv_get_path ();

Multiple concurrent reads are safe. We can have multiple threads here,
but by this point all writes to __gconv_path_elem are complete. All we
are doing is having multiple readers, which is not a data race.

Is there any reason we need this atomic load?

> +      gconv_path_elem_local = atomic_load_relaxed (&__gconv_path_elem);
> +    }
> +
> +  for (cnt = 0; gconv_path_elem_local[cnt].name != NULL; ++cnt)
>      {
> -      const char *elem = __gconv_path_elem[cnt].name;
> -      size_t elem_len = __gconv_path_elem[cnt].len;
> +      const char *elem = gconv_path_elem_local[cnt].name;
> +      size_t elem_len = gconv_path_elem_local[cnt].len;

OK.

>        char *filename;
>  
>        /* No slash needs to be inserted between elem and gconv_conf_filename;
> @@ -606,6 +623,11 @@ __gconv_read_conf (void)
>  /* Free all resources if necessary.  */
>  libc_freeres_fn (free_mem)
>  {
> -  if (__gconv_path_elem != NULL && __gconv_path_elem != &empty_path_elem)
> -    free ((void *) __gconv_path_elem);
> +  /* __gconv_path_elem is always accessed atomically because it is used in
> +     double-checked locking.  Since this function is called when the process
> +     has become single-threaded, it is enough to use a relaxed MO load.  */

OK. We use an atomic load to see the update from a thread that initialized
__gconv_path_elem.

> +  struct path_elem *gconv_path_elem_local
> +    = atomic_load_relaxed (&__gconv_path_elem);
> +  if (gconv_path_elem_local != NULL && gconv_path_elem_local != &empty_path_elem)
> +    free ((void *) gconv_path_elem_local);
>  }
> 


-- 
Cheers,
Carlos.

