This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.

[Bug nptl/13065] Race condition in pthread barriers

From: "bugdal at aerifal dot cx" <sourceware-bugzilla at sourceware dot org>
To: glibc-bugs at sourceware dot org
Date: Fri, 20 Dec 2013 18:39:05 +0000
Subject: [Bug nptl/13065] Race condition in pthread barriers
Auto-submitted: auto-generated
References: <bug-13065-131 at http dot sourceware dot org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=13065

--- Comment #3 from Rich Felker <bugdal at aerifal dot cx> ---
I don't think signals make it any more complicated. Implementation-wise, there
are two possibilities:

1. A waiter stuck in a signal handler blocks other waiters until it returns. In
this case, no waiter returns while the one waiter is still in the signal
handler, and there's no special issue to deal with.

2. A waiter stuck in a signal handler allows other waiters to proceed once the
last waiter has arrived. The ONLY way to implement this is to have some
resource identifying the barrier instance (keep in mind: as soon as any waiter
returns from the wait, the barrier is ready for reuse as a new instance) whose
lifetime persists until the signal handler returns. In order to avoid requiring
dynamic resource allocation for each barrier instance (which could fail,
rendering barriers unsafe for any actual synchronization usage) the resource
must essentially have its storage associated with the threads involved in the
barrier instance (e.g. on their stacks, TLS, kernel task structures, etc.). In
such an implementation, the thread stuck in the signal handler must already be
finished with the barrier object itself (since it could be reused for a new
barrier instance, or destroyed) and must perform its waiting based on the
instance resource associated with the waiting threads.
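
To make option 2 concrete, here is a minimal sketch of a barrier whose
per-instance state lives on the stack of the first thread to arrive. It is not
glibc's implementation; every sketch_* name is hypothetical, and it uses a
mutex and condition variables where a real implementation would more likely
use atomics and futexes. Once a waiter has registered with the instance it
never touches the barrier object again, so the object can be reused or
destroyed as soon as any waiter returns. One simplification to keep in mind:
the thread that owns the instance storage does not return until every other
waiter has left the instance.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-instance state. It lives on the stack of the first
   thread to arrive, not inside the barrier object. */
struct sketch_instance {
    pthread_mutex_t lock;
    pthread_cond_t released_cv;   /* last arriver wakes the waiters */
    pthread_cond_t done_cv;       /* guests tell the owner they are done */
    int released;
    unsigned guests_done;         /* non-owner threads finished with it */
};

struct sketch_barrier {
    pthread_mutex_t lock;
    unsigned limit, arrived;
    struct sketch_instance *inst; /* current instance, or NULL */
};

int sketch_barrier_init(struct sketch_barrier *b, unsigned limit)
{
    assert(limit > 0);
    b->limit = limit;
    b->arrived = 0;
    b->inst = NULL;
    return pthread_mutex_init(&b->lock, NULL);
}

/* Returns 1 in the last arriver of each instance, 0 in every other waiter. */
int sketch_barrier_wait(struct sketch_barrier *b)
{
    struct sketch_instance mine, *inst;
    int owner = 0, last;
    unsigned limit;

    pthread_mutex_lock(&b->lock);
    if (b->inst == NULL) {        /* first arriver provides the storage */
        pthread_mutex_init(&mine.lock, NULL);
        pthread_cond_init(&mine.released_cv, NULL);
        pthread_cond_init(&mine.done_cv, NULL);
        mine.released = 0;
        mine.guests_done = 0;
        b->inst = &mine;
        owner = 1;
    }
    inst = b->inst;
    limit = b->limit;
    last = (++b->arrived == limit);
    if (last) {                   /* barrier is ready for a new instance */
        b->arrived = 0;
        b->inst = NULL;
    }
    pthread_mutex_unlock(&b->lock);
    /* From here on, only the instance is touched, never *b. */

    pthread_mutex_lock(&inst->lock);
    if (last) {
        inst->released = 1;
        pthread_cond_broadcast(&inst->released_cv);
    } else {
        while (!inst->released)
            pthread_cond_wait(&inst->released_cv, &inst->lock);
    }
    if (owner) {
        /* The storage is on this stack frame: wait for all guests to leave. */
        while (inst->guests_done != limit - 1)
            pthread_cond_wait(&inst->done_cv, &inst->lock);
        pthread_mutex_unlock(&inst->lock);
        pthread_cond_destroy(&inst->released_cv);
        pthread_cond_destroy(&inst->done_cv);
        pthread_mutex_destroy(&inst->lock);
    } else {
        if (++inst->guests_done == limit - 1)
            pthread_cond_signal(&inst->done_cv);
        pthread_mutex_unlock(&inst->lock);
    }
    return last;
}

struct demo_arg { struct sketch_barrier *b; unsigned rounds, serial; };

static void *demo_worker(void *p)
{
    struct demo_arg *a = p;
    for (unsigned i = 0; i < a->rounds; i++)
        a->serial += (unsigned)sketch_barrier_wait(a->b);
    return NULL;
}

/* Reuses one barrier for `rounds` instances; returns the number of times
   any thread saw the "last arriver" return value. */
unsigned sketch_demo(unsigned nthreads, unsigned rounds)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    struct demo_arg arg[MAX_THREADS];
    struct sketch_barrier b;
    unsigned total = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    sketch_barrier_init(&b, nthreads);
    for (unsigned i = 0; i < nthreads; i++) {
        arg[i] = (struct demo_arg){ &b, rounds, 0 };
        int rc = pthread_create(&tid[i], NULL, demo_worker, &arg[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (unsigned i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);
        total += arg[i].serial;
    }
    pthread_mutex_destroy(&b.lock);
    return total;
}
```

The sketch leans on the fact that a mutex may be destroyed right after it is
unlocked, which is the same self-synchronized-destruction property the
barrier itself is being asked to provide.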

In case it's not clear, what I'm arguing is not about what the standard
says about barriers and self-synchronized destruction. My argument is that, in
either case, there's no additional barrier-specific difficulty in supporting
self-synchronized destruction. The other requirements of making the barrier
implementation correct already put you in a good position to support
self-synchronized destruction, so it's no more difficult than for mutexes or
semaphores.
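
For reference, the usage pattern in question looks roughly like the sketch
below: the thread that gets PTHREAD_BARRIER_SERIAL_THREAD destroys the barrier
and frees its memory while the other waiters may still be on their way out of
pthread_barrier_wait. Whether the standard guarantees that this works is the
open question in this bug, so take it as an illustration of the pattern and
not as code known to be safe on every implementation. The names worker,
self_destroy_demo and NTHREADS are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

enum { NTHREADS = 4 };

static int serial_marker;  /* address used as the "I destroyed it" result */

static void *worker(void *arg)
{
    pthread_barrier_t *bar = arg;
    int r = pthread_barrier_wait(bar);

    if (r == PTHREAD_BARRIER_SERIAL_THREAD) {
        /* Other waiters may not have fully left the wait yet; a barrier
           that supports self-synchronized destruction must cope. */
        int rc = pthread_barrier_destroy(bar);
        free(bar);
        return rc == 0 ? &serial_marker : NULL;
    }
    assert(r == 0);
    return NULL;
}

/* Returns how many threads successfully destroyed the barrier. */
int self_destroy_demo(void)
{
    pthread_t tid[NTHREADS];
    int destroyers = 0;
    pthread_barrier_t *bar = malloc(sizeof *bar);

    assert(bar != NULL);
    int rc = pthread_barrier_init(bar, NULL, NTHREADS);
    assert(rc == 0);
    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&tid[i], NULL, worker, bar);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++) {
        void *res;
        pthread_join(tid[i], &res);
        if (res == &serial_marker)
            destroyers++;
    }
    (void)rc;
    return destroyers;
}
```

The same shape with a mutex or semaphore in place of the barrier is the case
the comparison above refers to.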

BTW, since a thread's status as being a waiter on a barrier is not a testable
condition (i.e. there's no way to measure whether it's waiting on the barrier
versus suspended awaiting scheduling just prior to waiting on the barrier) the
standard has no choice but to allow the option where signal handlers block
forward progress of other waiters. Allowing the other option, however, does
create the possibility of observable behavior; if an implementation takes this
option, you may observe other threads exiting the barrier wait while the signal
handler is still running.
