This is the mail archive of the
mailing list for the glibc project.
Re: calloc() implementation question
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Carlos O'Donell <carlos at redhat dot com>
- Cc: Olivier Langlois <olivier at trillion01 dot com>, libc-alpha at sourceware dot org
- Date: Thu, 12 Dec 2013 19:41:53 +0100
- Subject: Re: calloc() implementation question
- Authentication-results: sourceware.org; auth=none
- References: <1386830712 dot 753 dot 25 dot camel at Wailaba2> <52A9F44F dot 8090708 at redhat dot com>
On Thu, Dec 12, 2013 at 12:37:19PM -0500, Carlos O'Donell wrote:
> On 12/12/2013 01:45 AM, Olivier Langlois wrote:
> > Hi,
> > Just by inspecting the code, there is something that I do not find
> > obvious.
> > nclears variable values distribution must have been analyzed and
> > everything must have been carefully balanced
> > if nclears is often <= 9 is it better to evaluate nclears up to 4 times
> > rather than unconditionally use memset() for any value of nclears?
> I don't have an answer for that.
> The workloads under which malloc was tuned were never documented.
> We have a plan to start doing that documentation such that we can
> answer your question.
> At the moment I don't think we have a good answer.
In malloc this makes relative sense from a risk-benefit perspective. There
may be workloads where the first case happens and workloads where the second
does. It is common for allocations to be made in bursts, so the code in question tends to
When there are a lot of small requests, you could get a good speedup
from this optimization (you gain several cycles from it being a special case
of zeroing memory where the size is nonzero and a multiple of 16, and several
more cycles from the CPU being able to execute instructions out of order more
effectively).
Otherwise you are dealing with allocations of 160+ bytes; filling and
accessing these is slower than the small path, so the losses are relatively small.
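To make the trade-off concrete, here is a rough sketch of the kind of inline expansion being discussed: one comparison picks between memset and the small path, and the small path needs at most three more comparisons. The helper name clear_words and the requirement that the word count is odd and at least 3 are assumptions of this sketch, not something taken from the glibc source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration; not copied from glibc.
   Zeroes nwords size_t-sized words starting at d.
   Assumption of this sketch: nwords is odd and at least 3.  */
static void
clear_words (size_t *d, size_t nwords)
{
  assert (nwords >= 3 && (nwords & 1) == 1);

  if (nwords > 9)
    {
      /* Large request: a single library call.  */
      memset (d, 0, nwords * sizeof (size_t));
      return;
    }

  /* Small request: straight-line stores with at most three more
     comparisons, so 3, 5, 7 or 9 words are cleared without a call.  */
  d[0] = 0;
  d[1] = 0;
  d[2] = 0;
  if (nwords > 4)
    {
      d[3] = 0;
      d[4] = 0;
      if (nwords > 6)
        {
          d[5] = 0;
          d[6] = 0;
          if (nwords > 8)
            {
              d[7] = 0;
              d[8] = 0;
            }
        }
    }
}
```

The stores in the small path do not depend on each other, which is what gives the CPU room to overlap them with surrounding work.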
To see how big the speedup or loss is, write a benchmark that compares a variant
with memset against one with inline expansion.
void *calloc2 (size_t n)
{
  return memset (malloc (n), 0, n);
}

void *calloc3 (size_t n)
{
  void *x = malloc (n);
  if (n < 9 * 16)