[PATCH v3 3/5] malloc: Arena is not needed for tcache path in free()

Guo, Wangyang wangyang.guo@intel.com
Tue Nov 26 03:41:17 GMT 2024


On 11/26/2024 11:23 AM, H.J. Lu wrote:

>
>
> On Tue, Nov 26, 2024, 10:55 AM Guo, Wangyang <wangyang.guo@intel.com> 
> wrote:
>
>     On 11/25/2024 7:58 PM, H.J. Lu wrote:
>
>>     On Mon, Nov 25, 2024 at 6:34 PM H.J. Lu<hjl.tools@gmail.com> <mailto:hjl.tools@gmail.com> wrote:
>>>     On Thu, Aug 29, 2024 at 2:31 PM Wangyang Guo<wangyang.guo@intel.com> <mailto:wangyang.guo@intel.com> wrote:
>>>>     Arena is not needed for _int_free_check() in non-DEBUG mode. This commit
>>>>     defers arena deference to _int_free_chunk() thus accelerate tcache path.
>>>>     When DEBUG enabled, arena can be obtained from p in do_check_inuse_chunk().
>>>>
>>>>     Result of bench-malloc-thread benchmark
>>>>
>>>>     Test Platform: Xeon-8380
>>>>     Ratio: New / Original time_per_iteration (Lower is Better)
>>>>
>>>>     Threads#   | Ratio
>>>>     -----------|------
>>>>     1 thread   | 0.994
>>>>     4 threads  | 0.968
>>>>
>>>>     The data shows it can brings 3% performance gain in multi-thread scenario.
>>>>
>>>>     ---
>>>>     Changes in v2:
>>>>     - _int_free_check() should be put outside of USE_TCACHE.
>>>>     - Link to v1:https://sourceware.org/pipermail/libc-alpha/2024-August/159360.html
>>>>     ---
>>>>       malloc/malloc.c | 10 ++++++++--
>>>>       1 file changed, 8 insertions(+), 2 deletions(-)
>>>>
>>>>     diff --git a/malloc/malloc.c b/malloc/malloc.c
>>>>     index 264f35e1a3..efb5292e9f 100644
>>>>     --- a/malloc/malloc.c
>>>>     +++ b/malloc/malloc.c
>>>>     @@ -2143,6 +2143,9 @@ do_check_inuse_chunk (mstate av, mchunkptr p)
>>>>       {
>>>>         mchunkptr next;
>>>>
>>>>     +  if (av == NULL)
>>>>     +    av = arena_for_chunk (p);
>>>>     +
>>>>         do_check_chunk (av, p);
>>>>
>>>>         if (chunk_is_mmapped (p))
>>>>     @@ -3447,9 +3450,10 @@ __libc_free (void *mem)
>>>>             /* Mark the chunk as belonging to the library again.  */
>>>>             (void)tag_region (chunk2mem (p), memsize (p));
>>>>
>>>>     -      ar_ptr = arena_for_chunk (p);
>>>>             INTERNAL_SIZE_T size = chunksize (p);
>>>>     -      _int_free_check (ar_ptr, p, size);
>>>>     +      /* av is not needed for _int_free_check in non-DEBUG mode,
>>>>     +        in DEBUG mode, av will fetch from p in do_check_inuse_chunk.  */
>>>>     +      _int_free_check (NULL, p, size);
>>>     If _int_free_check doesn't need av, can you push it further down?
>>>
>>     Something like this?
>     The problem is _init_free is also used in other places. We can not
>     make sure `av` always got from `arena_for_chunk (p)`.
>
>
> Have you looked at my change?

Your patch try to remove `av` parameters in _int_free_check:

@@ -4684,7 +4693,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)

    size = chunksize (p);

-  _int_free_check (av, p, size);
+  _int_free_check (p, size);

And fetch `av` by `arena_for_chunk (p)`.

+static inline void +do_check_inuse_chunk_fast (mstate av, mchunkptr p) 
+{ + mstate av = arena_for_chunk (p); + do_check_inuse_chunk (av, p); +}

My point is for _int_free(av, p, have_lock), if `av` is not equals to 
`arena_for_chunk (p)`, then we can not simply omit it from 
`_int_free_check`.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20241126/88154c7d/attachment.htm>


More information about the Libc-alpha mailing list