[PATCH] malloc: Optimize small memory zeroing for calloc
H.J. Lu
hjl.tools@gmail.com
Wed Nov 27 07:45:54 GMT 2024
For memory sizes up to 9 * sizeof (INTERNAL_SIZE_T) bytes, calloc has
special code to clear the memory. Add calloc-clear-memory.h to allow
architecture-specific optimization. On x86-64, it uses at most 1 branch,
instead of 3, and at most 5 stores, instead of 9, by using overlapping
vector stores.
Test Platform: Xeon-8380
Bench Function: calloc
Ratio: New / Original time_per_iteration (Lower is Better)

Threads#   | Ratio
-----------|------
1 thread   | 0.953
4 threads  | 0.952
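
A minimal sketch of the overlapping-store idea mentioned above, not the
code in the attached patch. The names clear_small and store16_zero are
hypothetical, the 16..80 byte range is chosen only for illustration, and
the fixed-size memcpy stands in for a 16-byte vector store (compilers
commonly lower it to one, but that is an assumption about code
generation). This version uses two branches, so it does not reproduce the
branch count quoted for x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write one 16-byte block of zeros at dst.
   A fixed-size memcpy is used so the code stays portable C.  */
static inline void
store16_zero (unsigned char *dst)
{
  static const unsigned char zeros[16] = { 0 };
  memcpy (dst, zeros, 16);
}

/* Hypothetical helper: zero N bytes at MEM, for 16 <= N <= 80.
   Stores anchored at the start and at the end of the region are
   allowed to overlap in the middle, so no per-word tail loop is
   needed and at most 5 stores are issued.  */
static void
clear_small (void *mem, size_t n)
{
  unsigned char *p = mem;
  unsigned char *end = p + n;

  assert (n >= 16 && n <= 80);

  /* Covers every size in 16..32: the two blocks meet or overlap.  */
  store16_zero (p);
  store16_zero (end - 16);

  if (n > 32)
    {
      /* Covers every size in 33..64 together with the stores above.  */
      store16_zero (p + 16);
      store16_zero (end - 32);

      /* Sizes 65..80 need one more block in the middle.  */
      if (n > 64)
        store16_zero (p + 32);
    }
}
```

Every store stays inside [mem, mem + n), so the sketch never writes past
the region; the cost is that some bytes are written twice.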
OK for master?
Thanks.
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-malloc-Optimize-small-memory-zeroing-for-calloc.patch
Type: text/x-patch
Size: 5780 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20241127/629e5b41/attachment.bin>