This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] powerpc: remove linux lowlevellock.h
- From: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
- To: libc-alpha@sourceware.org
- Date: Wed, 24 Sep 2014 08:55:14 -0300
- Subject: Re: [PATCH] powerpc: remove linux lowlevellock.h
- Authentication-results: sourceware.org; auth=none
- References: <5421AF5E.8030402@linux.vnet.ibm.com> <1411554406.27838.93.camel@triegel.csb>
On 24-09-2014 07:26, Torvald Riegel wrote:
> On Tue, 2014-09-23 at 14:35 -0300, Adhemerval Zanella wrote:
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c b/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>> index 2b8c84d..4906205 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>> @@ -22,7 +22,7 @@
>> int
>> pthread_spin_unlock (pthread_spinlock_t *lock)
>> {
>> - __asm __volatile (__lll_rel_instr ::: "memory");
>> + __asm __volatile (__ARCH_REL_INSTR ::: "memory");
>> *lock = 0;
>> return 0;
>> }
> I'd prefer if this would be a memory_order_release barrier (which we
> don't have yet, I know, so bear with me). However, I see that the
> current situation is a bit messy:
> * It seems that atomic_write_barrier is intended to be a C11
> memory_order_release barrier/fence; at least it's used like that in all
> what I saw in glibc. So atomic_write_barrier would be our current way
> to express a release barrier.
> * However, PowerPC atomic_write_barrier is "eieio" and is not using
> __ARCH_REL_INSTR. Can you clarify the difference?
These two pages [1] [2] have detailed information about C11/C++11 memory order
mappings to PowerPC instructions.
As you can see, a store release is indeed 'lwsync; st'. The first link also
gives an example of a spinlock implementation, although it uses a full
memory barrier (sync) instead of a lighter write barrier (lwsync). This is
better explained in the second link:
"If the critical section contains only normal cached memory operations,
the lwsync barrier may be substituted for the heavier-weight sync instruction
above."
Non-cached memory is handled by the kernel and exported to applications only in
very specific situations (and AFAIK the C11/C++11 specifications don't even
cover it). The idea is that a process won't try to run a spinlock in a DMA area.
The 'Synchronizes With' section also gives some explanation of the relation
between 'eieio' and 'lwsync'. My understanding is that 'lwsync' is the correct
memory-ordering instruction to use in these cases.
[1] http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
[2] http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2011.03.04a.html
>
> I'm interested in having a release barrier there so that this won't get
> overlooked once we do the actual transition to C11 atomics.
>
> Also, related to that, it would be ideal if we could use the generic
> implementation (nptl/pthread_spin_unlock.c) on Power as well. However,
> this one uses atomic_full_barrier, which is stronger than what's used on
> the major archs (x86, Power, ARM).
> I strongly believe this should use a release barrier, despite what POSIX
> is claiming (see #16432), but we'll have to deal with that separately.
I believe this is exactly why we have a specific POWER implementation: an atomic
full barrier on powerpc is 'sync', and for these spinlock cases 'lwsync' suffices.
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/sem_post.c b/sysdeps/unix/sysv/linux/powerpc/sem_post.c
>> index f222d9a..831a8dd 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/sem_post.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/sem_post.c
>> @@ -30,9 +30,9 @@ __new_sem_post (sem_t *sem)
>> {
>> struct new_sem *isem = (struct new_sem *) sem;
>>
>> - __asm __volatile (__lll_rel_instr ::: "memory");
>> + __asm __volatile (__ARCH_REL_INSTR ::: "memory");
>> atomic_increment (&isem->value);
>> - __asm __volatile (__lll_acq_instr ::: "memory");
>> + __asm __volatile (__ARCH_ACQ_INSTR ::: "memory");
>> if (isem->nwaiters > 0)
>> {
>> int err = lll_futex_wake (&isem->value, 1,
>> @@ -55,7 +55,7 @@ __old_sem_post (sem_t *sem)
>> {
>> int *futex = (int *) sem;
>>
>> - __asm __volatile (__lll_rel_instr ::: "memory");
>> + __asm __volatile (__ARCH_REL_INSTR ::: "memory");
>> (void) atomic_increment_val (futex);
>> /* We always have to assume it is a shared semaphore. */
>> int err = lll_futex_wake (futex, 1, LLL_SHARED);
> These are fine, though, as we'll need to change the semaphore algorithm
> anyway to fulfill the destruction guarantees (I have a WIP
> implementation that I plan to post in the near future).