This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH][BZ #19490] Add unwind descriptors to pthread_spin_init, etc. on i386
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: Torvald Riegel <triegel at redhat dot com>, Paul Pluzhnikov <ppluzhnikov at google dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>, <nd at arm dot com>
- Date: Mon, 1 Feb 2016 12:03:57 +0000
- Subject: Re: [PATCH][BZ #19490] Add unwind descriptors to pthread_spin_init, etc. on i386
- References: <CALoOobPJLwc+iSG+w-2YqqiL6=iAL-ZPiS5iKtmdDmJTR9Fp6g at mail dot gmail dot com> <1453727190 dot 4592 dot 91 dot camel at localhost dot localdomain> <CALoOobP6+GGSCMwiXsgQY4zLCXKpSsOA=4fOdB+J9q20bmtWOw at mail dot gmail dot com> <1454326614 dot 4592 dot 293 dot camel at localhost dot localdomain>
On 01/02/16 11:36, Torvald Riegel wrote:
> On Sun, 2016-01-31 at 15:09 -0800, Paul Pluzhnikov wrote:
>> On Mon, Jan 25, 2016 at 5:06 AM, Torvald Riegel <triegel@redhat.com> wrote:
>>
>>> For the spinlocks, I'd really prefer if we could just remove the asm
>>> files. The generic implementation should basically produce the same
>>> code; if not, we should try to fix that instead of keeping the asm
>>> files.
>>
>> Using gcc-4.8.4 (4.8.4-2ubuntu1~14.04):
>>
>> $ objdump -d nptl/pthread_spin_unlock.o
>>
>> nptl/pthread_spin_unlock.o: file format elf32-i386
>>
>>
>> Disassembly of section .text:
>>
>> 00000000 <pthread_spin_unlock>:
>> 0: f0 83 0c 24 00 lock orl $0x0,(%esp)
>> 5: 8b 44 24 04 mov 0x4(%esp),%eax
>> 9: c7 00 00 00 00 00 movl $0x0,(%eax)
>> f: 31 c0 xor %eax,%eax
>> 11: c3 ret
>>
>> This isn't quite the same as sysdeps/i386/nptl/pthread_spin_unlock.S
>
> This is because nptl/pthread_spin_unlock.c still issues a full barrier.
> If this is changed to an atomic_store_release, one gets this on x86_64:
>
> 0000000000000000 <pthread_spin_unlock>:
> 0: c7 07 00 00 00 00 movl $0x0,(%rdi)
> 6: 31 c0 xor %eax,%eax
> 8: c3 retq
>
> Perhaps now is a good time to finally get this done. Most archs are
> already using acquire semantics, IIRC. I think aarch64 and arm are the
> only major ones that happen to use the current generic unlock with full
> barrier -- but they only use acquire MO on unlock, so there's really no
> consistent pattern that would justify this.
i think mb(); store(); is actually *weaker* than store_release();
and thus on some hw it might be a bit faster, but i'm not against
changing to store_release (that's one step closer to posix semantics).
(mb(); store(); is weaker here because store_release() has to
prevent reordering wrt load_acquire in *both* directions, so
it may be implemented by the hw like mb(); store(); mb();
although that's not the most efficient implementation.)
> Note that there's an ongoing debate about whether POSIX requires
> pthread_spin_unlock to be a full barrier, whether it should or should
the current unlock is not enough for posix if trylock is
acquire MO:
T1:
unlock(l1);
if (trylock(l2))...
T2:
unlock(l2);
if (trylock(l1))...
with a one-sided barrier, both trylocks can fail to grab
the lock (the loads are not guaranteed to happen after
the stores), which is not seq cst; this does not happen
with release MO unlock.