Re: [PATCH 2/4] Add atomic operations similar to those provided by C11.
- From: Torvald Riegel <triegel@redhat.com>
- To: "Joseph S. Myers" <joseph@codesourcery.com>
- Cc: GLIBC Devel <libc-alpha@sourceware.org>
- Date: Thu, 30 Oct 2014 10:27:12 +0100
- Subject: Re: [PATCH 2/4] Add atomic operations similar to those provided by C11.
- References: <1414617613.10085.23.camel@triegel.csb> <1414619416.10085.46.camel@triegel.csb> <Pine.LNX.4.64.1410292156440.15119@digraph.polyomino.org.uk> <1414622734.10085.76.camel@triegel.csb> <Pine.LNX.4.64.1410292257040.15119@digraph.polyomino.org.uk>
On Wed, 2014-10-29 at 23:06 +0000, Joseph S. Myers wrote:
> On Wed, 29 Oct 2014, Torvald Riegel wrote:
>
> > First, do you agree that we need to make the compiler aware of
> > concurrency? For example, it would be bad if the compiler assumes that
> > it can safely reload from an atomic variable just because it was able to
> > prove that the loading thread didn't change it in the meantime.
>
> I believe that the code already makes it explicit where such reloading
> would be problematic,
Why do you think this is the case? There is an atomic_forced_read, for
example, but no atomic_read_exactly_once.
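For reference, atomic_forced_read is essentially just an asm that makes
the value opaque to the compiler, so an earlier load cannot be reused
for it; it does nothing to prevent a later reload.  A sketch, close to
what include/atomic.h has:

  #define atomic_forced_read(x) \
    ({ __typeof (x) __x; __asm ("" : "=r" (__x) : "0" (x)); __x; })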
> either (a) using asm with appropriate clobbers so
> the compiler doesn't know the value is unchanged
Do you have any examples of this where the clobber is not due to a
compiler barrier used for acquire or release fences?
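(By compiler barrier I mean the usual empty asm with a memory clobber,
which forces the compiler to assume that all of memory may have changed:

  #define compiler_barrier() __asm __volatile ("" ::: "memory")

The name here is made up; the point is that the clobber exists for the
fence, not to mark any particular load.)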
> or (b) using volatile.
But volatile does not guarantee atomicity.  I also think that current
compilers are probably unlikely to split a load or store into different
parts, but can we be sure about this?  What if, for example, we have
adjacent atomic variables that are loaded from, and they happen not to
be perfectly aligned (e.g., IIRC that's fine for x86 atomic accesses),
and the compiler merges some of the accesses and thus splits the last
one?  Can't we just do the right thing (and the safe thing!) and tell
the compiler that this is an atomic access?
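As a hypothetical illustration (made-up names, not code from glibc),
consider two adjacent flags that are read in sequence:

  struct flags { int a; int b; };

  int read_both (struct flags *f)
  {
    /* Plain loads: nothing forbids the compiler from fusing these into
       one wider load, which need not be atomic if f happens to be
       misaligned.  */
    return f->a + f->b;
  }

  int read_both_atomic (struct flags *f)
  {
    /* Relaxed atomic loads: each access must be individually atomic,
       so it can be neither split nor merged.  */
    return __atomic_load_n (&f->a, __ATOMIC_RELAXED)
           + __atomic_load_n (&f->b, __ATOMIC_RELAXED);
  }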
> In principle atomics could avoid optimization limitations from use of
> volatile (and allow more code motion than might be allowed by the asms),
> but it's not clear if they do improve on volatile at present.
(I suppose you mean relaxed MO atomic loads/stores. For everything else
I hope it's clear that atomics are an improvement...)
But I also see no real evidence that using atomics for relaxed
loads/stores would hurt at all. The sanitizer use case in the BZ you
cited is arguably special. The pthread_once case didn't show any
difference on the fast path. I can test with a __atomic*-providing
compiler with and without using __atomic to look for differences, if
that would help you.
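For example, one could compile hypothetical functions like these (not
from the patch) with gcc -O2 -S and diff the generated assembly:

  int load_volatile (int *p) { return *(volatile int *) p; }
  int load_atomic (int *p) { return __atomic_load_n (p, __ATOMIC_RELAXED); }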
> > If we assume that, we can either (1) use __atomic* and check all the
> > generated code, or (2) use inline asm, or (3) use volatile inline asm.
> > Any other options? Plain loads will not reliably make the compiler
> > aware that it has to take concurrent accesses into account.
>
> As noted, I think appropriate asms or volatile are already in place where
> this is an issue for a plain load.
We don't even have correct barriers everywhere, so I don't really have a
reason to agree with that.
Can we at least agree on having all glibc code use our own
atomic_store_relaxed / atomic_load_relaxed (see the patch)? Then we can
still argue whether to use __atomic* to implement these. But at least
we can easily switch to __atomic in the future. Any objections?
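To be concrete, the idea is along these lines (just a sketch assuming a
__atomic*-providing compiler; see the patch for the actual definitions):

  # define atomic_load_relaxed(mem) \
    __atomic_load_n ((mem), __ATOMIC_RELAXED)
  # define atomic_store_relaxed(mem, val) \
    __atomic_store_n ((mem), (val), __ATOMIC_RELAXED)

On compilers without __atomic*, these could fall back to the existing
volatile-based accesses until we switch over.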