This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: remove linux lowlevellock.h


On 06-10-2014 09:49, Torvald Riegel wrote:
> On Thu, 2014-09-25 at 18:19 -0300, Adhemerval Zanella wrote:
>> On 24-09-2014 12:37, Torvald Riegel wrote:
>>> On Wed, 2014-09-24 at 10:58 -0300, Adhemerval Zanella wrote:
>>>> On 24-09-2014 10:29, Torvald Riegel wrote:
>>>>> On Wed, 2014-09-24 at 08:55 -0300, Adhemerval Zanella wrote:
>>>>>> On 24-09-2014 07:26, Torvald Riegel wrote:
>>>>>>> On Tue, 2014-09-23 at 14:35 -0300, Adhemerval Zanella wrote:
>>>>>>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c b/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>>>>>>>> index 2b8c84d..4906205 100644
>>>>>>>> --- a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>>>>>>>> +++ b/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
>>>>>>>> @@ -22,7 +22,7 @@
>>>>>>>>  int
>>>>>>>>  pthread_spin_unlock (pthread_spinlock_t *lock)
>>>>>>>>  {
>>>>>>>> -  __asm __volatile (__lll_rel_instr ::: "memory");
>>>>>>>> +  __asm __volatile (__ARCH_REL_INSTR ::: "memory");
>>>>>>>>    *lock = 0;
>>>>>>>>    return 0;
>>>>>>>>  }
>>>>>>> I'd prefer if this would be a memory_order_release barrier (which we
>>>>>>> don't have yet, I know, so bear with me).  However, I see that the
>>>>>>> current situation is a bit messy:
>>>>>>> * It seems that atomic_write_barrier is intended to be a C11
>>>>>>> memory_order_release barrier/fence; at least it's used like that in all
>>>>>>> what I saw in glibc.  So atomic_write_barrier would be our current way
>>>>>>> to express a release barrier.
>>>>>>> * However, PowerPC atomic_write_barrier is "eieio" and is not using
>>>>>>> __ARCH_REL_INSTR.  Can you clarify the difference?
>>>>>> These two pages [1] [2] have detailed information about C11/C++11 memory order
>>>>>> mappings to PowerPC instructions.
>>>>> I'm aware of these ...
>>>>>
>>>>>> As you can see the store release is indeed 'lwsync; st'.  First link also
>>>>>> gives you an example of a spinlock implementation, however it is using a 
>>>>>> full memory barrier (sync) instead of a write one (lwsync). It is better
>>>>>> explained in second link:
>>>>>>
>>>>>> "If the critical section contains only normal cached memory operations, 
>>>>>> the lwsync barrier may be substituted for the heavier-weight sync instruction
>>>>>> above."
>>>>>>
>>>>>> Non-cached memory are handled by kernel and exported to application only in very
>>>>>> specific situations (and afaik C11/C++11 specifications don't even handle it).
>>>>>> The idea is a process won't try to run a spinlock in a DMA area. 
>>>>>>
>>>>>> Also the 'Synchronizes With' gives some explanation about 'eieio' and 'lwsync'
>>>>>> relation. My understanding is 'lwsync' is the correct memory order instruction
>>>>>> to use in such cases.
>>>>>>
>>>>>> [1] http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
>>>>>> [2] http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2011.03.04a.html
>>>>> ... but not of the difference between eieio and lwsync.  AFAICT, N2745
>>>>> states that eieio really applies only to stores.
>>>>>
>>>>> What are your thoughts about changing atomic_write_barrier to eieio?
>>>>> Are you aware of any prior reasoning about that (e.g., whether
>>>>> atomic_write_barrier really only needs to prevent reordering of writes)?
>>>>> Would there be a significant performance degradation?
>>>> Reading again my last message I think I wasn't clear: my understanding of
>>>> 'eieio' is to be a safe atomic_write_barrier for cacheable and cache inhibited
>>>> memory. And it is preferred over 'sync' for performance reasons.
>>>>
>>>> Now, for cacheable memory (default memory allocated from kernel), a 'lwsync'
>>>> is a safe atomic_write_barrier and it also performance better than 'eieio'
>>>> (and AFAIK GCC won't emit 'eieio' for C11/C++11 atomic operations). That's why 
>>>> I see no need to change it to 'eieio' in the pthread_spin_unlock.
>>> Sorry, when I wrote "changing atomic_write_barrier to eieio" I actually
>>> meant changing it to using lwsync :)
>>> So, what are your thoughts about that?
>> Right, I thought you were solely referring to pthread_spin_unlock.  About the
>> spinlock itself, now I understand the reasoning and I do agree that about 
>> replace the asm with a macro to emit a release barrier (which for POWER
>> would be the same anyway).  However, I would like to address it in another
>> patch.
>>
>> Now for atomic_write_barrier, as I understand, was added in GLIBC as 
>> memory_order_seq_cst semantics.
> Do you have any indication for that being the case (e.g., emails by
> others)?  It would be surprising because atomic_full_barrier exists as
> well.
>
In fact I was wrongly mislead by a wrong assumption of 'eieio' being 'stronger' than
lwsync, which the contrary in fact.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]