This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Make pthread_getspecific async-signal-safe
- From: Torvald Riegel <triegel at redhat dot com>
- To: Andrew Hunter <ahh at google dot com>
- Cc: Roland McGrath <roland at hack dot frob dot com>, GNU C Library <libc-alpha at sourceware dot org>, Paul Pluzhnikov <ppluzhnikov at google dot com>
- Date: Fri, 19 Dec 2014 18:18:02 +0100
- Subject: Re: [PATCH] Make pthread_getspecific async-signal-safe
- Authentication-results: sourceware.org; auth=none
- References: <1418756227-17380-1-git-send-email-ahh at google dot com> <CADroS=4w2E5u5iCmoANFy=QeHMy8wP2VxfCh9QustDTdisw2Jw at mail dot gmail dot com> <20141216203128 dot D5E0E2C2448 at topped-with-meat dot com> <1418763926 dot 7165 dot 33 dot camel at triegel dot csb> <CADroS=6qkjyNBvZ2EhAeJZhFQhgw8v2cAjhAoXRDwyTYV=GDrQ at mail dot gmail dot com>
On Tue, 2014-12-16 at 13:49 -0800, Andrew Hunter wrote:
> On Tue, Dec 16, 2014 at 1:05 PM, Torvald Riegel <email@example.com> wrote:
> > I agree.
> > Also, you are introducing concurrency, and we require new glibc code to
> > be data-race-free as defined by C11:
> > https://sourceware.org/glibc/wiki/Concurrency
> > Therefore, please use atomic operations where necessary, and document
> > how and why that works.
> > I'm aware that this will be a little challenging given that we don't
> > have existing support for sigatomic_t in glibc, and that the C and C++
> > standards could specify what signal handlers are allowed to do more
> > clearly.
> > However, we do need to request the compiler to not reorder memory
> > accesses
> No, we don't. No ordering constraints are required anywhere in this
> patch  as I understand it.
The getspecific call executed by the signal handler is concurrent with a
setspecific executed by the interrupted thread, right? If this was
concurrency between threads, you couldn't implement this correctly with
just relaxed memory order atomic accesses, right?
The signal handler concurrency is similar, just that the CPU won't
introduce reordering, but the compiler can still do that. It's not
aware that this is actually concurrent code, unless you tell it.
> > (ISTM you are relying on an ordering of accesses to the seq and
> > data fields in this patch).
> No; the current code at HEAD relies on this (at least if getspecific
> is used from a signal handler); things work fine if we end up writing
> seq then data, but writing data then seq leads to loss of values. My
> patch eliminates that dependence. (Racy get during a set might see
> the old value or NULL, but that's going to be a possible outcome for
> any racy get.)
In setspecific, there's nothing forcing the compiler to generate the
stores to seq and data in exactly the order they have in the source
Likewise, in getspecific, the compiler would be free to read data
speculatively, for some reason. This is perfectly fine in sequential
code: The compiler can assume nothing will interfere concurrently, and
if the compiler knows that the load from data won't cause a segfault, it
can load speculatively.
This may be unlikely in the current code but using LTO, for example,
could change the likelihood significantly.
> We could instead use a patch that in fact enforces the correct order
> in setspecific, and it wouldn't be particularly difficult once we
> decide, essentially, how to write barriers.
Yes. We need to tell the compiler which ordering we need in both
setspecific and getspecific.
> But my patch should
> obviate the need for any such decision (for now :)).
I agree that that compilers will probably generate code that works. But
we don't want to rely on this just being probable -- so we need to
> Strictly speaking as I understand the standard, any objects read need
> to be of atomic types--though relaxed loads would be sufficient
> throughout--but it's very unclear to me if that's worth doing here.
The compromise we currently use in glibc is to use nonatomic types but
explicitly atomic operations on all accesses. We don't have atomics
that rely on lock-based implementations or such, so we don't need any
extra data for an atomic; we may need some alignment and size properties
(e.g. if a CPU can just do atomic ops on aligned data), but that's
something we take care of in the data structure / variable definitions.
See https://sourceware.org/glibc/wiki/Concurrency for some more details.
> The only changes would be essentially erasing the types of the members
> of pthread_key_data so we can "atomically" access them--do you really
> see that as necessary here? I'm aware of the new code/old code
> distinctions in glibc, but consider for example sighandler_setxid - do
> you want to rewrite struct pthread.pid to atomic so it can be read
> there? Let's not even talk about __pthread_unwind (called from the
> cancellation signal...).
> Given that there are no actual ordering constraints, do we still need
> to respect the need for "atomics" here?
Yes. The compromise (see above) makes this simpler, tells the compiler
what's going on and that a certain access is part of concurrent code,
not sequential code (e.g., telling it that reloading from memory is not
fine). It also documents that this is concurrent code to readers.
> > The simple thing would be to use normal
> > atomics with acquire/release memory order, but we don't want this
> > overhead for TLS. So we will need something else. I'd suggest
> > exploring two options: (1) relaxed memory order atomics and explicit
> > compiler barriers or (2) atomic operations for sigatomic_t with
> > something like acq/rel barriers. The latter might be cleaner and more
> > intuitive to use, but would need more invention.
> I'll have to consult some more standardese experts, but I think the
> correct way to do this would be an
> in setspecific between level2->seq = seq and level2->data = value,
> but the linked documentation says to prefer local variants to C11 APIs
> and you don't seem to have one for this.
Yes, atomic_signal_fence is not part of include/atomic.h yet. I suggest
you add it, using either the GCC builtins or a compiler barrier as
fallback (see the bottom of the file; there are two paths that define
C11-like atomic ops and fences).
Adding a proper fence in setspecific and getspecific should make the
code correct in the sense of enforcing that it will be correct.