This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Optimization of conditional stores (was: Re: [PATCH] Add adaptive elision to rwlocks)
- From: Alexander Monakov <amonakov at ispras dot ru>
- To: Andi Kleen <andi at firstfloor dot org>
- Cc: Roland McGrath <roland at hack dot frob dot com>, Andi Kleen <ak at linux dot intel dot com>, libc-alpha at sourceware dot org
- Date: Mon, 7 Apr 2014 20:54:57 +0400 (MSK)
- Subject: Re: Optimization of conditional stores (was: Re: [PATCH] Add adaptive elision to rwlocks)
- References: <1396652083-18920-1-git-send-email-andi at firstfloor dot org> <20140404234516 dot 3DFAD74446 at topped-with-meat dot com> <20140405003759 dot GQ32556 at tassilo dot jf dot intel dot com> <20140405044201 dot 9B44D74445 at topped-with-meat dot com> <alpine dot LNX dot 2 dot 00 dot 1404071824530 dot 2531 at monopod dot intra dot ispras dot ru> <20140407161055 dot GV22728 at two dot firstfloor dot org>
On Mon, 7 Apr 2014, Andi Kleen wrote:
> > If the compiler can prove that `ptr' must be pointing to a writeable location
> > (for instance if there is a preceding (dominating) unconditional store), it
> > can, and likely will, perform the optimization.
>
> Except it's not an optimization, but a pessimization
I see where you're coming from, but is that really a pessimization in the case
of single-threaded execution? Also, I (of course) agree with Jeff Law that
such a transformation is likely to violate the memory model imposed by
newer standards (C11 and C++11).
> Which compiler would do that? It sounds very broken to me.
Example:
void foo(int * __restrict__ ptr, int val, volatile int * __restrict__ cond)
{
    *ptr = 0;
    while (*cond);
    if (*ptr != val)
        *ptr = val;
}
In my tests, GCC versions before 4.8 optimize out the first store and the
conditional branch. GCC 4.8.0 preserves both the first store and the branch.
If you omit the busy-wait loop, GCC 4.8 performs the optimization as well.
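To make the transformation concrete, here is a minimal sketch of the case without the busy-wait loop. The function names are hypothetical, and the second function is only my reading of what the optimized code amounts to, not actual compiler output. In single-threaded execution both leave the same value in *ptr; the difference is that the second one writes to memory even when the value would not change.

```c
/* Hypothetical names, for illustration only. */

/* As written: after the dominating store, write val only if it differs. */
void store_if_changed(int *ptr, int val)
{
    *ptr = 0;            /* unconditional store: *ptr is known writable */
    if (*ptr != val)
        *ptr = val;
}

/* Roughly what remains once the first store and the branch are
   optimized out: a single unconditional store. */
void store_always(int *ptr, int val)
{
    *ptr = val;
}
```

With another thread reading or writing *ptr, the two versions are no longer interchangeable, which is where the memory model concern comes in.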
> > I would also suggest making the intent (perform the store only when necessary)
> > explicit, and make sure to disallow the compiler optimization, for example:
> >
> >     if (*ptr != value)
> >         *(volatile typeof(*ptr)*)ptr = value;
>
> That's really ugly.
I simply expanded the kernel's ACCESS_ONCE macro by hand for the sake of the
example.
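For reference, a minimal sketch of the same conditional store written with the macro. The definition is modeled on the kernel's as I recall it (the kernel spells the operator `typeof'; `__typeof__' is used here so the snippet also builds in strict ISO modes of GCC and Clang), and the helper name store_if_different is hypothetical.

```c
/* Modeled on the Linux kernel macro of the same name. Relies on the
   GNU __typeof__ extension, so this assumes GCC or Clang. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical helper: perform the store only when the value changes.
   The store goes through a volatile lvalue, so the compiler may not
   introduce it on the path where the condition is false. */
static void store_if_different(int *ptr, int value)
{
    if (*ptr != value)
        ACCESS_ONCE(*ptr) = value;
}
```

The macro only hides the cast; the generated code should be the same as with the hand-expanded form.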
Alexander