This is the mail archive of the libc-alpha@sourceware.org
mailing list for the glibc project.
Re: malloc - cache locality - TLB
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Rich Felker <dalias at aerifal dot cx>
- Cc: Torvald Riegel <triegel at redhat dot com>, Carlos O'Donell <carlos at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, Roland McGrath <roland at hack dot frob dot com>, Andreas Jaeger <aj at suse dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Andreas Schwab <schwab at suse dot de>, Siddhesh Poyarekar <siddhesh at redhat dot com>
- Date: Fri, 20 Dec 2013 20:00:51 +0100
- Subject: Re: malloc - cache locality - TLB
- Authentication-results: sourceware.org; auth=none
- References: <52A6A0DA dot 1080109 at redhat dot com> <1386688619 dot 23049 dot 3215 dot camel at triegel dot csb> <20131220022411 dot GA26981 at domone dot podge> <20131220160915 dot GA7826 at domone dot podge> <20131220162116 dot GP24286 at brightrain dot aerifal dot cx>
On Fri, Dec 20, 2013 at 11:21:16AM -0500, Rich Felker wrote:
> On Fri, Dec 20, 2013 at 05:09:15PM +0100, Ondřej Bílka wrote:
> > As Linux has supported huge pages since 2003, we could try to use these.
> Transparent huge pages are the only sane way to use them, and they're
> already supported for huge malloc calls serviced by mmap.
This needs to invoke libhugetlbfs, which is an extra dependency. Full
transparency is not wanted either, as we need to distinguish cases where
these pages will actually be used from cases where they will not.
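A per-range opt-in would avoid both problems. A rough sketch of what I
mean (this is an illustration, not glibc code; arena_alloc is a
hypothetical name, and MADV_HUGEPAGE/MADV_NOHUGEPAGE are Linux-specific
hints the kernel is free to ignore):

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map an anonymous arena and tell the kernel per range whether
   transparent huge pages are wanted.  The caller opts in only when it
   expects the whole range to be touched; otherwise 4 KiB paging is
   kept so unused data can still be swapped at fine granularity.  */
static void *
arena_alloc (size_t len, int want_huge)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  if (want_huge)
    madvise (p, len, MADV_HUGEPAGE);    /* hint: back with THP */
  else
    madvise (p, len, MADV_NOHUGEPAGE);  /* hint: keep 4 KiB pages */
  return p;
}
```

Unlike libhugetlbfs this needs no reserved hugetlb pool; the madvise
calls are only hints, so the allocation still succeeds when THP is
disabled.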
> The kernel
> could also opt to use huge pages for heap ranges consisting of many
> independent allocations, but this would probably be a pessimization in
> the general case because it precludes fine-grained swapping of unused
> data and like you mentioned has ridiculous overhead.
Swapping is a secondary concern: you cannot save more than what sits in
the huge pages, and data there tends to get accessed randomly at
unpredictable intervals, which could cause a DoS through excessive disk
seeks. There is no overhead when all pages eventually get used.
In short-lived applications these could improve performance by avoiding
minor faults; adding MAP_HUGETLB in the following example improves
running time by around 10%.
char *x = mmap (NULL, 30 * (1 << 21), PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_HUGETLB | MAP_ANONYMOUS, -1, 0);
for (i = 0; i < 30 * (1 << 21); i += 4096)
  x[i] = 42;
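To make the comparison reproducible, the same loop can be wrapped in a
self-contained timing helper (a sketch; touch_pages is my name for it,
and the MAP_HUGETLB run returns -1.0 on systems with no huge pages
reserved, so compare touch_pages (0) against touch_pages (MAP_HUGETLB)):

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/* Touch one byte per 4 KiB and return the elapsed seconds, or -1.0 if
   the mapping fails (e.g. MAP_HUGETLB with no huge pages reserved).  */
static double
touch_pages (int extra_flags)
{
  size_t len = 30 * (1 << 21);   /* 60 MiB, a multiple of 2 MiB */
  char *x = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
  if (x == MAP_FAILED)
    return -1.0;

  struct timespec t0, t1;
  clock_gettime (CLOCK_MONOTONIC, &t0);
  for (size_t i = 0; i < len; i += 4096)  /* one minor fault per page */
    x[i] = 42;
  clock_gettime (CLOCK_MONOTONIC, &t1);

  munmap (x, len);
  return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

With 2 MiB pages the loop takes one minor fault per 512 touched 4 KiB
chunks instead of one per chunk, which is where the saving comes from.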
If there were 128k pages, they would be more useful, as they decrease
the fault overhead nearly as well as huge pages while having a wider
range of applications.