More (?) steps toward jemalloc within Cygwin DLL

Mark Geisert
Tue Jul 21 08:50:55 GMT 2020

Corinna Vinschen wrote:
>>> If you get jemalloc working, it would be nice in itself, but the main
>>> improvement would be the ability to get rid of these __malloc_lock/
>>> __malloc_unlock brackets.
>> Thanks for reminding me of that aspect of Cygwin's current malloc.  The
>> malloc implementation has seemed to be bulletproof for many years so I guess
>> the function-level locking is the only drawback of note?
> Not quite.  It's bad enough, given how much this slows down multi-threaded
> executables, but...
> ...the big problem is the dependencies on malloc during Cygwin startup,
> especially in fork/exec, so the real challenge is to get the new malloc
> to still let Cygwin processes start up correctly the first time, and
> especially in fork/exec situations, and to make sure the malloc
> bookkeeping survives fork/exec.

O.K., understood.

> These malloc dependencies sometimes crop up in the weirdest situations,
> so that's something to look out for.  For instance, using pthread
> functions may call malloc as well.  If a problem can be solved by
> changing another part of Cygwin, don't hesitate to discuss this!

Yes, a couple of the malloc packages I'm testing want to allocate locks and TLS
slots right off the bat, so nasty recursion is possible.
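
To make that recursion hazard concrete, here is a minimal sketch of one common way to break the cycle: requests that arrive while the allocator is still initializing are served from a small static arena. Every name here (boot_arena, my_malloc, real_init, and so on) is hypothetical, libc malloc stands in for the real package, and the sketch is deliberately not thread-safe. It is not code from Cygwin or from any of the packages under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; not taken from Cygwin or any malloc package. */

/* Small static arena for requests made while initialization is in progress. */
static _Alignas(max_align_t) unsigned char boot_arena[4096];
static size_t boot_used;
static int initializing;   /* nonzero while real_init() is running */
static int initialized;    /* nonzero once real_init() has finished */
static void *init_lock;    /* stand-in for a lock the package allocates at init */

/* Bump allocator: hands out arena memory and never takes it back. */
static void *boot_alloc(size_t n)
{
    size_t a = _Alignof(max_align_t);
    if (n > sizeof boot_arena)
        return NULL;
    n = (n + a - 1) / a * a;                 /* round up to max alignment */
    if (n > sizeof boot_arena - boot_used)
        return NULL;
    void *p = boot_arena + boot_used;
    boot_used += n;
    return p;
}

/* True if p points into the static arena. */
static int from_boot_arena(const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)boot_arena;
    return a >= lo && a < lo + sizeof boot_arena;
}

static void *my_malloc(size_t n);

/* Stand-in for a package's init routine that itself needs memory. */
static void real_init(void)
{
    init_lock = my_malloc(64);               /* re-enters my_malloc() */
    assert(init_lock != NULL);
}

static void *my_malloc(size_t n)
{
    if (!initialized) {
        if (initializing)
            return boot_alloc(n);            /* recursive call: use the arena */
        initializing = 1;
        real_init();
        initializing = 0;
        initialized = 1;
    }
    return malloc(n);                        /* stand-in for the real allocator */
}

static void my_free(void *p)
{
    if (p == NULL || from_boot_arena(p))
        return;                              /* arena memory is never released */
    free(p);
}
```

A real version would also need a lock or atomic state around the init flags, and would have to decide what happens to arena memory across fork/exec.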

>> I've switched to a
>> plug-in sort of implementation that allows one to choose among several
>> malloc packages: "original", dlmalloc (w/ internal locking), ptmalloc[23],
>> nedalloc, jemalloc, and a Windows Heap wrapper.  Perhaps tcmalloc in the
>> future.  One sets an environment variable CYGMALLOC=<name> before launching
>> a program and that malloc implementation is used.  This should make testing
>> and benchmarking the various choices possible.  I don't expect big
>> improvements in individual programs (unless they are stress testing), but
>> something like a large configure or build should give more useful data.
> In the end, we should settle for a single malloc implementation, though.
> It doesn't really matter if it's jemalloc, ptmalloc, xymalloc.  Almost
> all other modern mallocs are faster and better suited for multi-threading
> than dlmalloc, *especially* if the above locks can go away.

For sure; I didn't make it clear this CYGMALLOC setup is just for testing the 
different malloc packages.  When I stumble across some failing in one of them 
it's nice to be able to quickly re-run using a different malloc.
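
For anyone curious what such a selector might look like, here is a rough sketch of the idea, assuming a table of function pointers keyed by name. The struct layout, the helper names, the name strings, and the fall-back-to-first-entry behavior are all assumptions made for illustration; both entries are backed by libc so the sketch builds anywhere, and this is not the actual test harness.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One table entry per pluggable allocator (hypothetical layout). */
struct malloc_ops {
    const char *name;
    void *(*alloc)(size_t);
    void  (*release)(void *);
};

/* Stand-ins: both are backed by libc so the sketch builds anywhere. */
static void *orig_alloc(size_t n)  { return malloc(n); }
static void  orig_release(void *p) { free(p); }
static void *je_alloc(size_t n)    { return calloc(1, n); }
static void  je_release(void *p)   { free(p); }

static const struct malloc_ops packages[] = {
    { "original", orig_alloc, orig_release },
    { "jemalloc", je_alloc,   je_release   },
};

/* Look a package up by name; NULL or an unknown name selects the first entry. */
static const struct malloc_ops *select_malloc(const char *name)
{
    if (name != NULL)
        for (size_t i = 0; i < sizeof packages / sizeof packages[0]; i++)
            if (strcmp(name, packages[i].name) == 0)
                return &packages[i];
    return &packages[0];
}

/* Read CYGMALLOC once and cache the choice for the life of the process. */
static const struct malloc_ops *active_malloc(void)
{
    static const struct malloc_ops *ops;
    if (ops == NULL)
        ops = select_malloc(getenv("CYGMALLOC"));
    assert(ops->alloc != NULL && ops->release != NULL);
    return ops;
}
```

With something like this in place, a call such as active_malloc()->alloc(n) goes to whichever package the environment variable named at startup.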

Here's a question I didn't expect to come up: If it turns out a home-grown 
wrapper on the Win32 HeapXXX functions performs better (hint: it does, 2.5 to 3 
times better) than any malloc package derived from dlmalloc, is there any reason 
why we ought not use it?  Assuming it can be made to work for all those cases 
you mentioned above, of course.
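
To show the general shape of such a wrapper, here is a minimal sketch over the process default heap. It is not the wrapper I benchmarked, the wh_ names are hypothetical, and it makes no attempt at errno handling or fork/exec safety. The non-Windows branch exists only so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_WIN32) || defined(__CYGWIN__)
#include <windows.h>

/* Thin wrappers over the process default heap. */
static void *wh_malloc(size_t n)
{
    return HeapAlloc(GetProcessHeap(), 0, n);
}

static void wh_free(void *p)
{
    if (p != NULL)
        HeapFree(GetProcessHeap(), 0, p);
}

static void *wh_realloc(void *p, size_t n)
{
    if (p == NULL)                  /* HeapReAlloc needs an existing block */
        return wh_malloc(n);
    return HeapReAlloc(GetProcessHeap(), 0, p, n);
}

#else
#include <stdlib.h>

/* Fallback so the sketch builds off Windows; not part of the idea itself. */
static void *wh_malloc(size_t n)           { return malloc(n); }
static void  wh_free(void *p)              { free(p); }
static void *wh_realloc(void *p, size_t n) { return realloc(p, n); }

#endif

/* Usage example: contents must survive growing a block. */
static void wh_demo(void)
{
    char *s = wh_malloc(4);
    assert(s != NULL);
    memcpy(s, "abc", 4);
    s = wh_realloc(s, 64);
    assert(s != NULL && strcmp(s, "abc") == 0);
    wh_free(s);
}
```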

> The only danger here is this: If you manage to get dlmalloc replaced
> reliably, you *will* get a pink plush hippo!

Oh, gee, that sounds like a really nice reward... Wow, I'm gonna have to do this 
project now for sure!

