This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: Optimization of conditional stores (was: Re: [PATCH] Add adaptive elision to rwlocks)

From: Torvald Riegel <triegel at redhat dot com>
To: Alexander Monakov <amonakov at ispras dot ru>
Cc: Andi Kleen <andi at firstfloor dot org>, Roland McGrath <roland at hack dot frob dot com>, Andi Kleen <ak at linux dot intel dot com>, libc-alpha at sourceware dot org
Date: Thu, 10 Apr 2014 22:26:07 +0200
Subject: Re: Optimization of conditional stores (was: Re: [PATCH] Add adaptive elision to rwlocks)
Authentication-results: sourceware.org; auth=none
References: <1396652083-18920-1-git-send-email-andi at firstfloor dot org> <20140404234516 dot 3DFAD74446 at topped-with-meat dot com> <20140405003759 dot GQ32556 at tassilo dot jf dot intel dot com> <20140405044201 dot 9B44D74445 at topped-with-meat dot com> <alpine dot LNX dot 2 dot 00 dot 1404071824530 dot 2531 at monopod dot intra dot ispras dot ru> <20140407161055 dot GV22728 at two dot firstfloor dot org> <alpine dot LNX dot 2 dot 00 dot 1404072027420 dot 2624 at monopod dot intra dot ispras dot ru>

On Mon, 2014-04-07 at 20:54 +0400, Alexander Monakov wrote:
> 
> On Mon, 7 Apr 2014, Andi Kleen wrote:
> 
> > > If the compiler can prove that `ptr' must be pointing to writeable location
> > > (for instance if there is a preceding (dominating) unconditional store), it
> > > can, and likely will, perform the optimization.
> > 
> > Except it's not an optimization, but a pessimization
> 
> I see where you're coming from, but is that really a pessimization for a case
> of non-multithreaded execution?  Also, I (of course) agree with Jeff Law that
> such transformation has good chances of violating the memory model imposed by
> newer standards.
> 
> > Which compiler would do that? It sounds very broken to me.
> 
> Example:
> 
> void foo(int * __restrict__ ptr, int val, volatile int * __restrict__ cond)
> {
>   *ptr = 0;
>   while (*cond);
>   if (*ptr != val)
>     *ptr = val;
> }
> 
> In my tests, GCC versions before 4.8 optimize out the first store and the
> conditional branch.  GCC 4.8.0 preserves both the first store and the branch.
> If you omit the busy-wait loop, GCC 4.8 performs the optimization as well.

Considering just the standards (which, I believe, have no notion of
read-only memory; note also that ptr isn't volatile), I think both the
pre-4.8 and the 4.8 behavior are correct.  I don't know whether that's
actually the intention, but 4.8 might treat the while loop as
synchronization (which it isn't according to C11/C++11) and thus
decline to merge the stores.
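
To make "merging the stores" concrete, here is a minimal sketch assuming
single-threaded execution and a writable *ptr.  The names foo_source,
foo_merged and final_values_match are illustrative and not from the
thread, and the busy-wait loop and the restrict qualifiers of the quoted
example are left out for brevity.  foo_merged is a hand-written guess at
the result of the transformation, not actual compiler output.

```c
/* Source form: an unconditional store followed by a conditional one,
   as in the quoted example without the busy-wait loop.  */
void foo_source(int *ptr, int val)
{
  *ptr = 0;
  if (*ptr != val)
    *ptr = val;
}

/* Hand-written merged form: the first store and the branch are gone,
   leaving a single unconditional store of val.  */
void foo_merged(int *ptr, int val)
{
  *ptr = val;
}

/* Returns 1 if both forms leave the same value in the pointed-to
   object for the given starting value and val, 0 otherwise.  */
int final_values_match(int initial, int val)
{
  int a = initial;
  int b = initial;
  foo_source(&a, val);
  foo_merged(&b, val);
  return a == b;
}
```

Both forms leave the same final value in *ptr, so a single thread cannot
tell them apart.  What differs is the set of writes actually issued to
memory, and that difference is what matters for cache-line contention
and for lock elision.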
