[PATCH] malloc: Optimize small memory zeroing for calloc
H.J. Lu
hjl.tools@gmail.com
Wed Nov 27 21:51:15 GMT 2024
On Wed, Nov 27, 2024, 7:25 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Hi H.J.,
>
> If this code is performance critical, it would be useful to optimize the
> generic version first and only consider target-specific code when you
> can beat the generic version by a large margin. For example using
> fixed-size inlined memsets should be faster on most targets and likely
> emit similar code on x86.
>
I couldn't find a better way which works for all targets.
Very unlikely since the size isn't fixed and x86-64
will use unaligned vector stores, which aren't suitable
for all targets.
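For readers following along, here is a rough sketch of what "fixed-size
inlined memsets" could mean. It is illustrative only and is not code from
the patch: the helper name clear_small and the chosen sizes are
hypothetical. Each constant-size memset call gives the compiler a chance
to expand it inline, while the default case shows the problem raised
above: when the size isn't one of the fixed cases, it is only known at
run time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the patch: zero N bytes at P.
   The constant-size calls can be expanded inline by the compiler;
   how they are expanded is target-dependent.  */
static inline void
clear_small (void *p, size_t n)
{
  switch (n)
    {
    case 16: memset (p, 0, 16); break;
    case 32: memset (p, 0, 32); break;
    case 48: memset (p, 0, 48); break;
    case 64: memset (p, 0, 64); break;
    /* Size only known at run time: fall back to a normal call.  */
    default: memset (p, 0, n); break;
    }
}
```

Whether the constant-size cases end up as aligned word stores or
unaligned vector stores depends on the target and compiler options.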
> I think it is fine to extract this into a separate function, however it
> should include the memset so one can change the size at which memset is
> called - it is not obvious to me why the magic 9 is best.
>
Good point. Will add a macro to specify the unroll size.
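A minimal sketch of the idea, assuming a word-granular clear: the macro
name CALLOC_UNROLL_WORDS, the helper name clear_words and the default of 9
are placeholders for illustration, not the names or values the revised
patch will use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: largest word count cleared with inline stores.
   A target could override it before this point.  */
#ifndef CALLOC_UNROLL_WORDS
# define CALLOC_UNROLL_WORDS 9
#endif

/* Hypothetical helper: zero NWORDS words starting at D.  The memset
   fallback lives in the same function, so the threshold is changed
   in one place.  */
static inline void *
clear_words (size_t *d, size_t nwords)
{
  if (nwords > CALLOC_UNROLL_WORDS)
    return memset (d, 0, nwords * sizeof (size_t));

  /* Trip count is bounded by a small constant, so the compiler is
     free to unroll this loop.  */
  for (size_t i = 0; i < nwords; i++)
    d[i] = 0;
  return d;
}
```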
Thanks.
> Cheers,
> Wilco
>
>