This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC PATCH]: Align large allocations to cacheline


On Wed, Oct 29, 2014 at 03:31:37PM -0000, Wilco Dijkstra wrote:
> > Rich Felker wrote: 
> > On Wed, Oct 29, 2014 at 01:43:54PM -0000, Wilco Dijkstra wrote:
> > > This patch aligns allocations of large blocks to a cacheline on ARM and AArch64. The main
> > goal is to
> > > reduce performance variations due to random alignment choices, however it improves
> > performance on
> > > several benchmarks as well. SPECFP2000 improves by ~1.5%.
> > >
> > > Any comments?
> > 
> > It seems like this would pathologically increase fragmentation: when a
> > tiny block is split off the beginning to align a large allocation,
> > isn't it likely that this tiny chunk (rather than some other tiny
> > chunk already in the appropriate-sized free bin prior to the large
> > allocation) will be used to satisfy a tiny request, which may end up
> > being long-lived?
> 
> A good implementation of memalign would try to avoid that, for example
> by increasing the previous allocated block or not adding tiny splitoff
> chunks to the free list.

I don't think modifying the previous allocated block is practical;
this almost certainly has data races unless you impose costly
synchronization.

If you do the split but don't add the split block to the free list, I
don't see how you'd track it to free it later.

You could refrain from splitting the tiny block at all and instead
keep it as part of the the allocated block with a special flag bit on
the header to indicate that the allocator needs to look back to find
the actual beginning of the block when freeing or reallocing it, etc.
This is essentially a more complex, but slightly more optimal, version
of my idea for rounding up the size.

> > memalign-type operations are known to be a major problem for
> > fragmentation. See this bug report:
> > 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=14581
> 
> It seems that report is claiming that freeing a block doesn't immediately
> merge with adjacent freed blocks. If that is the case then that is a very
> serious bug indeed! The fact that memalign is not fully integrated into 
> the allocation code doesn't help either - it never tries to reuse an
> correctly aligned free block.

Yes, there are probably improvements that could be made to memalign to
reduce the problem, but they have tradeoffs. I don't think it's
practical to make memalign reuse correctly aligned free blocks unless
you segregate the free lists not just by size but also by alignment.
Otherwise you'd have to do a non-constant-time search to determine
whether a free block of the desired alignment exists and select it.

> > A smarter approach might be to round up the _sizes_ of large
> > allocations so that the next available address after the allocated
> > block is nicely aligned. This will tend to make _future_ allocations
> > aligned, even if the first one isn't, and won't leave any free gaps
> > that could contribute to fragmentation.
> 
> That might be possible too, I'd have to give it a go. I was trying to avoid
> reworking the guts of the malloc code - I think it would be better to create
> a new modern allocator from scratch.

That's quite possible. All the basic algorithms involved are
understood and well documented, so it's certainly practical to redo
without all the cruft. However there is quite a big testing burden.
Even if you know what you're doing, it's very easy to make stupid
mistakes.

Rich


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]