This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Malloc improvements

From: "Tulio Magno Quites Machado Filho" <tuliom at linux dot vnet dot ibm dot com>
To: Florian Weimer <fweimer at redhat dot com>, Anton Blanchard <anton at au1 dot ibm dot com>
Cc: "Carlos O'Donell" <carlos at redhat dot com>, Siddhesh Poyarekar <sid at reserved-bit dot com>, DJ Delorie <dj at redhat dot com>, libc-alpha at sourceware dot org
Date: Fri, 15 Jul 2016 09:54:58 -0300
Subject: Re: Malloc improvements
References: <20160712101010.6e6cfecb@kryten> <5a954ab2-d74c-867d-e427-ffae95389beb@redhat.com> <20160714214910.6727c439@kryten> <8b72c439-a9c3-4cfd-f9a1-f67836ea4795@redhat.com>

Florian Weimer <fweimer@redhat.com> writes:

> On 07/14/2016 01:49 PM, Anton Blanchard wrote:
>>>> It's great to see the current focus on improving malloc. One thing
>>>> that would really help POWER is reducing the number of locks and
>>>> atomics in the fast path. Right now we have 3 in the malloc
>>>> fastpath and 2 in free. These add up.
>>>
>>> Does the hook variable read count as an atomic operation in this
>>> sense?
>>
>> The read hook shouldn't be. The atomic issue I was referring to was
>> something we've been trying to solve for a while:
>>
>> https://sourceware.org/ml/libc-alpha/2014-05/msg00118.html
>
> x86_64 checks __libc_multiple_threads and avoids atomics if possible. 
> Do you already do this in POWER?

This was our last try:
http://patchwork.sourceware.org/patch/11307/

In summary, Torvald said:

    We need to be consistent where we try to optimize for single-threaded
    executions.  Currently, we do in catomic_* and I believe in some pieces
    of code using atomics.  The atomic_* functions, even the old ones,
    should not do that.
    Eventually, we should put special cases for single-threaded executions
    into the code using atomics and not into the atomics (also see above,
    phasing out catomic_*) because avoiding concurrent algorithms altogether
    is even faster than doing something in atomics (eg, one can avoid CAS
    loops altogether if there's no other thread because the CAS will never
    fail).
    Another reason to do this is that this adds the overhead of the
    single-thread check to all atomics, even in cases where it's clear that
    the code will be used often in a multi-threaded setting.
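
A minimal sketch of the idea in the quoted summary, under stated assumptions: the single-thread check sits in the code that uses atomics, so the CAS loop is skipped entirely when no other thread exists. This is not code from the patch or from glibc. The names `single_threaded`, `struct node`, `list_head` and `push` are hypothetical, and the flag only stands in for a check such as the `__libc_multiple_threads` test mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: true while the process has only one thread.
   A real implementation would have to clear it before a second
   thread can start running.  */
static bool single_threaded = true;

struct node
{
  struct node *next;
  int value;
};

/* Head of a simple LIFO list, zero-initialized to NULL.  */
static _Atomic (struct node *) list_head;

/* Push N onto the list.  The caller decides between the two paths;
   the atomic operations themselves contain no single-thread check.  */
static void
push (struct node *n)
{
  if (single_threaded)
    {
      /* No other thread can interfere, so a CAS could never fail.
         A relaxed load and store are enough, with no loop.  */
      n->next = atomic_load_explicit (&list_head, memory_order_relaxed);
      atomic_store_explicit (&list_head, n, memory_order_relaxed);
      return;
    }

  /* Multi-threaded path: the usual CAS loop.  On failure, OLD is
     refreshed with the current head and the link is redone.  */
  struct node *old = atomic_load_explicit (&list_head, memory_order_relaxed);
  do
    n->next = old;
  while (!atomic_compare_exchange_weak_explicit (&list_head, &old, n,
                                                 memory_order_release,
                                                 memory_order_relaxed));
}
```

With this layout, the cost of the check is paid only at call sites that choose to have a single-threaded fast path, rather than inside every atomic operation.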

-- 
Tulio Magno
