This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/2] Single thread optimization for malloc atomics
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, "GNU C. Library" <libc-alpha at sourceware dot org>
- Date: Fri, 2 May 2014 18:38:51 +0200
- Subject: Re: [PATCH 1/2] Single thread optimization for malloc atomics
- Authentication-results: sourceware.org; auth=none
- References: <53610133 dot 3070908 at linux dot vnet dot ibm dot com> <1399041298 dot 32485 dot 6373 dot camel at triegel dot csb>
On Fri, May 02, 2014 at 04:34:58PM +0200, Torvald Riegel wrote:
> On Wed, 2014-04-30 at 10:57 -0300, Adhemerval Zanella wrote:
> > This patch adds a single-thread optimization for malloc atomic usage to
> > first check if process is single-thread (ST) and if so use normal
> > load/store instead of atomic instructions.
> >
> > It is a respin of my initial attempt to add ST optimization on PowerPC.
> > Now malloc optimization is arch-agnostic and rely in already defined macros
> > in GLIBC (SINGLE_THREAD_P).
>
> I'm not convinced that this is a good change. First, I'd like to see
> some performance numbers that show whether the change leads to a
> non-negligible increase in performance. That should be a
> microbenchmark, so we can track it over time.
>
> malloc is currently marked as Async-Signal-Unsafe, but due to locks
> elsewhere in the implementation. The MTSafety docs even talk about what
> would be needed to make it AS-Safe. Your patch would prevent making it
> AS-Safe. The catomic_add that you replace are AS-Safe, the sequential
> code you add is not. Someone added those catomic_* and not sequential
> code, so this is at least indication that there's a disconnect here.
>
> You also add macros with nondescriptive names, that each have exactly
> one use. There's no documentation for how they differ, and you don't
> update the comments on the MT-Safe docs.
>
> > x86 probably would kike an atomic.h refactor
> > to avoid the double check for '__local_multiple_threads', but that is not
> > the aim of this patch.
>
> But your patch does introduce this, so you should take care of this as
> well. I guess you wouldn't be happy if someone change x86 without
> caring about Power?
Also as tls is fast on power this change is retundant. A performance
difference with properly implemented thread cache is neglitible so this
code would be soon removed.