This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] aio: fix newp->running data race


On Fri, 2016-07-08 at 16:28 +0200, Samuel Thibault wrote:
> Ping?

I've just looked quickly at the patch.  Fixes for concurrency issues
should be accompanied with detailed comments, so that other developers
can easily grasp how synchronization is supposed to work.  In this case,
for example, it should be documented why runp->running can't be accessed
concurrently from several threads.
Doing this properly may add a lot of comments to the code if there are
missing elsewhere, so we might need to find some middle ground if this
patch is meant to become part of 2.24.  But it will need something.  See
the concurrency notes in ./stdlib/cxa_thread_atexit_impl.c for an
example, or the code comments in nptl/sem_waitcommon.c.

> Samuel Thibault, on Wed 04 May 2016 16:02:17 +0200, wrote:
> > Hello,
> > 
> > As reported by helgrind:
> > 
> > ==9363== Possible data race during write of size 4 at 0x6B45D18 by thread #5
> > ==9363== Locks held: 2, at addresses 0x557A1E0 0x6A149E0
> > ==9363==    at 0x5375E0D: __aio_enqueue_request (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > ==9363==    by 0x53764FD: aio_write (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > 
> > which is the write in
> > 
> >   if (result == 0)
> >     newp->running = running;
> > 
> > ==9363== This conflicts with a previous read of size 4 by thread #6
> > ==9363== Locks held: none
> > ==9363==    at 0x53758FE: handle_fildes_io (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > 
> > which is the read in 
> > 
> > 	  /* Hopefully this request is marked as running.  */
> > 	  assert (runp->running == allocated);
> > 
> > So what is happening is that __aio_enqueue_request writes newp->running
> > after having possibly started a thread worker to handle the request,
> > thus in concurrence with the thread worker, which not only reads
> > runp->running as shown here, but also writes to it later (and does not
> > take __aio_requests_mutex since it's already given a request).
> > 
> > But actually when we do start a new thread worker, there is no need
> > to set newp->running, since it was already set by the "running =
> > newp->running = allocated;" line along the thread creation.  So to avoid
> > the race we can just set newp->running in the cases where we do not
> > start the thread.
> > 
> > Samuel
> > 
> > 
> >     	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
> >     	`running` field of `newp` when a thread was started to process it,
> >     	since that thread will not take `__aio_requests_mutex`, and the field
> >     	already has the proper value actually.
> > 
> > diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
> > index f55570d..faf139d 100644
> > --- a/sysdeps/pthread/aio_misc.c
> > +++ b/sysdeps/pthread/aio_misc.c
> > @@ -453,7 +453,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
> >  		result = 0;
> >  	    }
> >  	}
> > +      else
> > +	newp->running = running;
> >      }
> > +  else
> > +    newp->running = running;
> >  
> >    /* Enqueue the request in the run queue if it is not yet running.  */
> >    if (running == yes && result == 0)
> > @@ -466,9 +470,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
> >  	pthread_cond_signal (&__aio_new_request_notification);
> >      }
> >  
> > -  if (result == 0)
> > -    newp->running = running;
> > -  else
> > +  if (result != 0)
> >      {
> >        /* Something went wrong.  */
> >        __aio_free_request (newp);




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]