This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] atomics vs. uniprocessor builds
- From: Torvald Riegel <triegel at redhat dot com>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Tue, 02 Dec 2014 11:34:02 +0100
- Subject: Re: [RFC] atomics vs. uniprocessor builds
- References: <1416916695 dot 1771 dot 208 dot camel at triegel dot csb> <547D1CBD dot 7060208 at twiddle dot net>
On Tue, 2014-12-02 at 11:58 +1000, Richard Henderson wrote:
> On 11/25/2014 09:58 PM, Torvald Riegel wrote:
> > On powerpc and alpha, the current behavior could be achieved with the
> > builtins by using, for each call to a builtin, just memory_order_relaxed
> > for the memory order argument(s) and surrounding the call with custom
> > compiler barriers depending on the original MO. Examples:
> >
> > * __atomic_store_n ((mem), (val), __ATOMIC_RELEASE) becomes:
> > __asm ("" ::: "memory")
> > __atomic_store_n ((mem), (val), __ATOMIC_RELAXED);
> >
> > * __atomic_load_n ((mem), __ATOMIC_ACQUIRE) becomes:
> > __atomic_load_n ((mem), __ATOMIC_RELAXED)
> > __asm ("" ::: "memory")
> >
> > * __atomic_fetch_add ((mem), (operand), __ATOMIC_SEQ_CST) becomes:
> > __asm ("" ::: "memory")
> > __atomic_fetch_add ((mem), (operand), __ATOMIC_RELAXED)
> > __asm ("" ::: "memory")
> >
> > We need the additional compiler barriers because the memory order
> > argument is both a request for HW barriers (as necessary) and an
> > indication to the compiler which optimizations are allowed (e.g.,
> > reordering across a __ATOMIC_RELAXED atomic is possible in many cases).
>
> First, I agree with Joseph that we should just remove the unused UP case.
>
> Second, I'm not sure an extra asm really would have been needed, at least for
> gcc 5. The memory barrier argument to the builtin generally doesn't affect the
> presence of the unspec_volatile, or the associated volatile memory, which
> should prevent movement across the builtin just as effectively as the asm.
I added them because the compiler could optimize relaxed-MO atomics more
aggressively than acquire-MO atomics. If gcc 5 doesn't do any of this
yet, I agree that it's not necessary in this case.
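
For illustration, the wrapping from the quoted examples could be sketched
roughly as below. This is only a sketch, assuming GCC or a compatible
compiler that provides the __atomic builtins; the names compiler_barrier,
up_store_release, up_load_acquire and up_fetch_add_seq_cst are hypothetical
and are not taken from glibc.

```c
/* Hypothetical helper: a compiler-only barrier.  It emits no instruction
   but tells the compiler that memory may have changed, so memory accesses
   are not moved across it.  */
#define compiler_barrier() __asm__ __volatile__ ("" ::: "memory")

/* Release store: barrier before, so earlier accesses are not moved
   below the store by the compiler.  */
static inline void
up_store_release (int *mem, int val)
{
  compiler_barrier ();
  __atomic_store_n (mem, val, __ATOMIC_RELAXED);
}

/* Acquire load: barrier after, so later accesses are not moved
   above the load by the compiler.  */
static inline int
up_load_acquire (int *mem)
{
  int val = __atomic_load_n (mem, __ATOMIC_RELAXED);
  compiler_barrier ();
  return val;
}

/* Seq-cst read-modify-write: barriers on both sides.  Returns the value
   that was stored in *mem before the addition.  */
static inline int
up_fetch_add_seq_cst (int *mem, int operand)
{
  compiler_barrier ();
  int old = __atomic_fetch_add (mem, operand, __ATOMIC_RELAXED);
  compiler_barrier ();
  return old;
}
```

Whether the explicit barriers buy anything over the plain relaxed builtins
depends on how aggressively the compiler optimizes relaxed-MO atomics, which
is the open question above.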