This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCHv2] powerpc: Spinlock optimization and cleanup
- From: Torvald Riegel <triegel at redhat dot com>
- To: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- Cc: "Paul E. Murphy" <murphyp at linux dot vnet dot ibm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "rth at twiddle dot net" <rth at twiddle dot net>, Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, Steve Munroe <sjmunroe at us dot ibm dot com>
- Date: Thu, 01 Oct 2015 21:17:06 +0200
- Subject: Re: [PATCHv2] powerpc: Spinlock optimization and cleanup
- References: <560C0DA6 dot 5060409 at linux dot vnet dot ibm dot com> <560CFA64 dot 2030205 at arm dot com>
On Thu, 2015-10-01 at 10:18 +0100, Szabolcs Nagy wrote:
> On 30/09/15 17:28, Paul E. Murphy wrote:
> >
> > ---8<---
> > This patch optimizes powerpc spinlock implementation by:
> >
> ...
>
> The glibc pthread spinlock semantics is weaker than what
> posix requires, I'm wondering if this is expected to stay
> or glibc might want to switch to stronger semantics.
I think this should stay the way it is. That is, do what C++ and soon also
C11 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_470)
specify. Making this seqcst (and it's all the mtx operations that succeed,
not just trylock) would decrease performance for the most common case
just to make arcane cases work (e.g., abusing POSIX synchronization
functions such as trylock or sem_getvalue as atomics).
Fixing that at the POSIX level would require POSIX to use a more
involved memory model (hopefully following the C11 model). If anyone
feels like contributing to make this happen, please do so.
> is it worthwhile to add optimized asm with weak semantics
> for other targets that currently use the generic c code?
I think only nptl/pthread_spin_unlock.c should be changed; the other
generic functions use weaker memory orders already. OTOH, this would
change existing behavior, so one can argue this is more risky than
keeping weaker-than-POSIX implementations unchanged.
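For illustration only, here is a minimal C11 sketch of the two unlock
shapes under discussion: a full barrier followed by a store, and a single
release store. This is not the glibc source; the type sketch_spinlock_t
and both function names are made up for this example, and the lock word
encoding (0 = unlocked, 1 = locked) is an assumption.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked.  */
typedef atomic_int sketch_spinlock_t;

/* Stronger variant: a seqcst fence, then a relaxed store that
   clears the lock word.  */
static int
sketch_unlock_fence (sketch_spinlock_t *lock)
{
  atomic_thread_fence (memory_order_seq_cst);
  atomic_store_explicit (lock, 0, memory_order_relaxed);
  return 0;
}

/* Weaker variant: a single release store.  Writes made inside the
   critical section become visible to a thread that later acquires
   the lock, but no ordering beyond that is implied.  */
static int
sketch_unlock_release (sketch_spinlock_t *lock)
{
  atomic_store_explicit (lock, 0, memory_order_release);
  return 0;
}
```

Both variants leave the lock word at 0; they differ only in how much
ordering they impose on surrounding memory accesses.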
> (the issue is that for correct pthread_spin_trylock behavior
> the lock should be seqcst instead of acquire and the unlock
> should be release instead of barrier+store otherwise trylock
> can spuriously report locked state).
Right now, unlock is a full barrier (i.e., seqcst) plus store. That is
stronger than a release store. Also note that a failing POSIX
synchronization function is not supposed to synchronize memory. So, a
failing trylock doesn't help a program unless the program synchronizes in
some other way, in which case this other way will "provide" the
barriers.
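
As a rough sketch of that last point (not the glibc implementation;
sketch_trylock and the 0/1 lock word encoding are assumptions made for
this example), a C11 trylock can use acquire ordering when it succeeds
and relaxed ordering when it fails, so the failing path does not
synchronize memory:

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical trylock on a lock word where 0 = unlocked, 1 = locked.  */
static int
sketch_trylock (atomic_int *lock)
{
  int expected = 0;

  /* Success ordering is acquire; failure ordering is relaxed, so a
     failed attempt establishes no synchronization with the holder.  */
  if (atomic_compare_exchange_strong_explicit (lock, &expected, 1,
                                               memory_order_acquire,
                                               memory_order_relaxed))
    return 0;

  return EBUSY;
}
```

A caller that sees EBUSY here learns only that the lock word was
nonzero at some point; it must not assume it can see writes made by the
current holder.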