This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] malloc: Use current (C11-style) atomics for fastbin access
- From: Anton Blanchard <anton at ozlabs dot org>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: libc-alpha at sourceware dot org, Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, Paul Clarke <pc at us dot ibm dot com>, Bill Schmidt <wschmidt at us dot ibm dot com>
- Date: Wed, 16 Jan 2019 17:18:38 +1100
- Subject: Re: [PATCH] malloc: Use current (C11-style) atomics for fastbin access
- References: <87va52nupb.fsf@oldenburg.str.redhat.com> <20190116092655.151bdbbd@kryten> <87won5zgz5.fsf@oldenburg2.str.redhat.com>
Hi Florian,
> > I see a 16% regression on ppc64le with a simple threaded malloc test
> > case. I guess the C11 atomics aren't as good as what we have in
> > glibc.
>
> Uh-oh. Would you please check whether replacing the two
> atomic_load_acquire calls with atomic_load_relaxed restores the
> previous performance?
As you suspected, doing this does restore the performance. The two lwsync
barrier instructions must be causing the slowdown.
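
To make the difference concrete, here is a minimal sketch in standard C11
(<stdatomic.h>), not the glibc code itself. glibc's atomic_load_acquire and
atomic_load_relaxed are internal macros; atomic_load_explicit is used here
as a stand-in. The names node, head, push, peek_acquire, peek_relaxed and
self_check are hypothetical and exist only for illustration. How an acquire
load is lowered (for example, whether an lwsync is emitted) depends on the
target, the compiler, and how the atomic macros are defined.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type, standing in for a free chunk on a list. */
struct node {
    struct node *next;
    int payload;
};

/* Hypothetical list head, loosely analogous to one fastbin slot. */
static _Atomic(struct node *) head;

/* Publish a node with release MO so that its fields are visible to a
   thread that later reads the head with acquire MO. */
static void push(struct node *n)
{
    struct node *old = atomic_load_explicit(&head, memory_order_relaxed);
    do {
        n->next = old;  /* 'old' is refreshed by a failed CAS */
    } while (!atomic_compare_exchange_weak_explicit(
                 &head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Acquire load: later reads through the returned pointer are ordered
   after this load. This ordering is what may cost a barrier. */
static struct node *peek_acquire(void)
{
    return atomic_load_explicit(&head, memory_order_acquire);
}

/* Relaxed load: still atomic, but imposes no ordering on later reads. */
static struct node *peek_relaxed(void)
{
    return atomic_load_explicit(&head, memory_order_relaxed);
}

/* Single-threaded sanity check. Both variants return the same pointer;
   they differ only in the ordering guarantees seen by other threads. */
static int self_check(void)
{
    static struct node n = { NULL, 42 };
    push(&n);
    assert(peek_acquire() == &n);
    assert(peek_relaxed() == &n);
    return peek_acquire()->payload;
}
```

In a single thread the two loads are indistinguishable, which is why the
swap changes performance without changing the result of a simple test.
Whether the relaxed variant is actually correct is a separate question
about the memory model.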
> I believe both loads need to be acquire MO under the C11 memory model
> (see the comments for why), but the old code did not have them.
Ok, thanks for looking into it.
Thanks,
Anton