This is the mail archive of the mailing list for the glibc project.

Re: Transition to C11 atomics and memory model

On Tue, 2014-09-16 at 18:17 +0000, Joseph S. Myers wrote:
> On Sun, 14 Sep 2014, Torvald Riegel wrote:
> > * All accesses to atomic vars need to use atomic_* functions.  IOW, all
> > non-atomic accesses are not subject to data races.  The only exception
> > is initialization (ie, when the variable is not visible to any other
> > thread); nonetheless, initialization accesses must not result in data
> > races with other accesses.  (This exception isn't allowed by C11, but
> > eases the transition to C11 atomics and likely works fine in current
> > implementations; as an alternative, we could require MO-relaxed stores for
> > initialization as well.)
> Note <> regarding 
> MO-relaxed operations not being well-optimized.

I mentioned this already in my reply to a question by Carlos (in this
thread) -- but thanks for noting the GCC bug.  I see two cases in which
it could matter for us:

* Compilers don't optimize across memory_order_relaxed atomic ops, and
glibc code actually benefits from compiler optimizations across its
current plain memory accesses.  I doubt that this actually happens in
practice, because it would need a loop or such, and other things in the
loop would need to be performance-critical -- which is not a pattern I
think is frequent in concurrent code.
* If we currently have code where the compiler combines several plain
memory accesses to concurrently accessed data into one, then we could
have more accesses when using memory_order_relaxed atomics.  However,
such an optimization can easily be something the programmer did not
intend (e.g., in a busy-waiting loop -- hence atomic_forced_read...).
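
To make the busy-waiting case concrete, here is a minimal sketch using
C11 <stdatomic.h>.  It is not glibc code: ready_flag, wait_until_ready,
publish_ready and the max_spins bound are names I made up for
illustration.  If ready_flag were a plain int, the compiler would be
allowed to load it once and reuse the value for every iteration; with a
memory_order_relaxed load, I would expect current compilers to perform
one load per iteration, which is what the programmer wants here.

```c
#include <stdatomic.h>

/* Hypothetical flag that another thread sets; zero-initialized. */
static atomic_int ready_flag;

/* Spin until ready_flag becomes nonzero or max_spins polls have failed.
   Returns the number of failed polls, or -1 if the bound was reached.
   The relaxed load asks for atomicity only, not for any ordering.  */
static int
wait_until_ready (int max_spins)
{
  int spins = 0;
  while (atomic_load_explicit (&ready_flag, memory_order_relaxed) == 0)
    {
      if (++spins >= max_spins)
        return -1;
    }
  return spins;
}

/* Writer side: a relaxed store is enough to avoid a data race on the
   flag itself; it does not order any other memory accesses.  */
static void
publish_ready (void)
{
  atomic_store_explicit (&ready_flag, 1, memory_order_relaxed);
}
```

If the flag is meant to publish other data, the store and load would
need release and acquire ordering instead; relaxed is used here only
because the point is about the number of loads.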

> It will be important to 
> compare the code generated before and after any changes, and may be 
> necessary to map the "relaxed" operations to plain loads and stores in 
> some cases depending on the compiler version.

I agree that we need to compare, but I wouldn't be happy with mapping
back to plain loads and stores either.  Sure, it seems to work right
now, but we're really just keeping our fingers crossed.

My *guess* would be that we won't see a slowdown for any of the pthreads
synchronization data structures, but might see some in cases where
atomics are used for things like statistics counters in malloc or such
(ie, where they are surrounded by significant amounts of nonconcurrent
code without additional memory barriers and such).
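
As a rough sketch of the kind of code I have in mind (again not actual
malloc code; alloc_calls, round_up_request and alloc_calls_snapshot are
hypothetical names), consider a relaxed statistics counter sitting in
the middle of otherwise nonconcurrent arithmetic.  This is where
comparing the generated code before and after the change seems most
worthwhile:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical statistics counter, updated by any thread.  */
static atomic_size_t alloc_calls;

/* Round a request up to a multiple of 16 bytes and count the call.
   The counter update is relaxed: it needs atomicity but no ordering
   with respect to the surrounding nonconcurrent code.  */
static size_t
round_up_request (size_t bytes)
{
  atomic_fetch_add_explicit (&alloc_calls, 1, memory_order_relaxed);
  return (bytes + 15) & ~(size_t) 15;
}

/* Read the counter; the value may already be stale when returned.  */
static size_t
alloc_calls_snapshot (void)
{
  return atomic_load_explicit (&alloc_calls, memory_order_relaxed);
}
```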
