More (?) steps toward jemalloc within Cygwin DLL

Corinna Vinschen
Fri Jul 3 10:11:15 GMT 2020

Hi Mark,

On Jul  2 23:57, Mark Geisert wrote:
> Hi Corinna,
> Corinna Vinschen wrote:
> > On Jun 16 02:16, Mark Geisert wrote:
> > > I'm just putting a flag down on this new (to me) territory.  If somebody
> > > else has claimed this project already, let me know and I'll shove off.
> > 
> > No, please.  Just keep on working on that.  If you manage to get jemalloc
> > working and replacing dlmalloc, this would be really great.
> Super.
> > > It wasn't much trouble to build a jemalloc.lib and statically link it into
> > > the Cygwin DLL when the latter is built.  I'm still learning which jemalloc
> > > configure options are required in order to get complete test coverage and to
> > > initialize properly within cygwin1.dll.
> > > 
> > > I'm currently using the "supply your own malloc" mechanism provided by
> > > Cygwin to overlay the usual dlmalloc-sourced functions with
> > > replacements from jemalloc.  I suspect there will be allocation
> > > collisions ahead...
> I've had to rethink the above a bit.
> > The real problem here is this:
> > 
> >    __malloc_lock ();
> >    dl_foo_function ();
> >    __malloc_unlock ();
> > 
> > This locking is what makes our dlmalloc even slower in multi-threaded
> > scenarios because it disallows using malloc/free calls concurrently.
> > 
> > If you get jemalloc working, it would be nice in itself, but the main
> > improvement would be the ability to get rid of these __malloc_lock/
> > __malloc_unlock brackets.
> Thanks for reminding me of that aspect of Cygwin's current malloc.  The
> malloc implementation has seemed to be bulletproof for many years so I guess
> the function-level locking is the only drawback of note?

Not quite.  It's bad enough, given how much this slows down multi-threaded
executables, but...

...the bigger problem is the malloc dependencies during Cygwin startup,
especially in fork/exec.  So the real challenge is to have the new
malloc still let Cygwin processes start up correctly, both on first
launch and especially in fork/exec situations, and to make sure the
malloc bookkeeping survives fork/exec.
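
To illustrate the kind of care an allocator needs around fork, here is
a minimal sketch of the generic POSIX pthread_atfork pattern: take the
allocator lock before fork and release it on both sides afterwards, so
the child never inherits a lock held by a thread that does not exist
there.  This is not how Cygwin's fork works internally.  The names
alloc_lock, guarded_alloc, install_fork_handlers and fork_and_allocate
are made up for the example, and the C library's malloc stands in for
the new allocator.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical allocator-wide lock; a stand-in for whatever the
   replacement malloc uses to protect its bookkeeping. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent just before fork: freeze allocator state. */
static void fork_prepare(void) { pthread_mutex_lock(&alloc_lock); }
/* Runs in the parent after fork: resume normal operation. */
static void fork_parent(void)  { pthread_mutex_unlock(&alloc_lock); }
/* Runs in the child after fork: release the inherited lock. */
static void fork_child(void)   { pthread_mutex_unlock(&alloc_lock); }

/* Register the handlers once; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(fork_prepare, fork_parent, fork_child);
}

/* Allocation guarded by the hypothetical lock. */
void *guarded_alloc(size_t n)
{
    pthread_mutex_lock(&alloc_lock);
    void *p = malloc(n);
    pthread_mutex_unlock(&alloc_lock);
    return p;
}

/* Fork, let the child allocate, and return the child's exit status
   (0 if the child could allocate), or -1 on error. */
int fork_and_allocate(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        void *p = guarded_alloc(64);
        _exit(p != NULL ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```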

These malloc dependencies sometimes crop up in the weirdest situations,
so that's something to look out for.  For instance, using pthread
functions may call malloc as well.  If a problem can be solved by
changing another part of Cygwin, don't hesitate to discuss this!
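
One common way to cope with malloc being called before the real
allocator is ready is a small bootstrap arena.  The following is only a
sketch of that general technique, not Cygwin code: early_malloc,
early_free, mark_allocator_ready and the arena size are all
hypothetical, the C library's malloc stands in for the real allocator,
and the bump allocation assumes startup is still single-threaded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BOOT_ARENA_SIZE 4096

/* Static buffer that serves requests made before the real
   allocator has been initialized. */
static _Alignas(16) unsigned char boot_arena[BOOT_ARENA_SIZE];
static size_t boot_used;
static int allocator_ready;

/* True if p points into the bootstrap arena. */
static int from_boot_arena(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)boot_arena;
    return addr >= base && addr < base + BOOT_ARENA_SIZE;
}

/* Bump-allocate from the arena until the real allocator is ready.
   Not thread-safe: assumes early startup runs on one thread. */
void *early_malloc(size_t n)
{
    if (allocator_ready)
        return malloc(n);
    if (n == 0)
        n = 1;
    size_t rounded = (n + 15u) & ~(size_t)15u;   /* keep 16-byte alignment */
    if (rounded < n || rounded > BOOT_ARENA_SIZE - boot_used)
        return NULL;                             /* overflow or arena full */
    void *p = boot_arena + boot_used;
    boot_used += rounded;
    return p;
}

/* Bootstrap memory is never reclaimed; everything else goes to free. */
void early_free(void *p)
{
    if (p == NULL || from_boot_arena(p))
        return;
    free(p);
}

/* Call once the real allocator has finished initializing. */
void mark_allocator_ready(void)
{
    allocator_ready = 1;
}
```

The important detail is that early_free must recognize arena pointers,
because memory handed out during startup can be freed much later.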

> I've found that jemalloc would add 500kB to cygwin1.dll and it also seems
> difficult to get working, at first blush at least.

OTOH you leave dlmalloc behind, so that's 280kB less again.

> I've switched to a
> plug-in sort of implementation that allows one to choose among several
> malloc packages: "original", dlmalloc (w/ internal locking), ptmalloc[23],
> nedalloc, jemalloc, and a Windows Heap wrapper.  Perhaps tcmalloc in the
> future.  One sets an environment variable CYGMALLOC=<name> before launching
> a program and that malloc implementation is used.  This should make testing
> and benchmarking the various choices possible.  I don't expect big
> improvements in individual programs (unless they are stress testing), but
> something like a large configure or build should give more useful data.
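
A selection mechanism like the one described above could be sketched as
a table of function pointers looked up by name.  This is an assumption
about the general shape, not Mark's implementation: struct
malloc_backend, select_backend and backend_from_env are hypothetical
names, only three of the listed packages appear, and every entry is
routed to the C library's malloc/free as a placeholder.

```c
#include <stdlib.h>
#include <string.h>

/* One entry per selectable malloc package (hypothetical layout). */
struct malloc_backend {
    const char *name;
    void *(*alloc)(size_t);
    void (*release)(void *);
};

/* Placeholder table: all entries use the C library's malloc/free.
   A real build would point each one at a different implementation. */
static const struct malloc_backend backends[] = {
    { "original", malloc, free },
    { "dlmalloc", malloc, free },
    { "jemalloc", malloc, free },
};

/* Look up a backend by name; fall back to the first entry when the
   name is missing or unknown. */
const struct malloc_backend *select_backend(const char *requested)
{
    size_t i;

    if (requested != NULL) {
        for (i = 0; i < sizeof backends / sizeof backends[0]; i++) {
            if (strcmp(requested, backends[i].name) == 0)
                return &backends[i];
        }
    }
    return &backends[0];
}

/* Choose the backend from the CYGMALLOC environment variable. */
const struct malloc_backend *backend_from_env(void)
{
    return select_backend(getenv("CYGMALLOC"));
}
```

Falling back to a default on an unknown name keeps a typo in the
variable from breaking process startup.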

In the end, we should settle for a single malloc implementation, though.
It doesn't really matter whether it's jemalloc, ptmalloc, or xymalloc.  Almost
all other modern mallocs are faster and better suited for multi-threading
than dlmalloc, *especially* if the above locks can go away.
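
For reference, the lock bracket quoted further up amounts to roughly the
following pattern.  This is a simplified sketch: a plain pthread mutex
stands in for __malloc_lock/__malloc_unlock, the C library's malloc
stands in for the dl_ functions, and locked_malloc, locked_free and
run_locked_workers are hypothetical names.  Every thread has to pass
through the same mutex, so allocations never run concurrently.

```c
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4
#define NALLOCS  1000

/* Single process-wide lock, standing in for __malloc_lock. */
static pthread_mutex_t malloc_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Every allocation is serialized through the one mutex. */
void *locked_malloc(size_t n)
{
    pthread_mutex_lock(&malloc_mutex);
    void *p = malloc(n);
    pthread_mutex_unlock(&malloc_mutex);
    return p;
}

void locked_free(void *p)
{
    pthread_mutex_lock(&malloc_mutex);
    free(p);
    pthread_mutex_unlock(&malloc_mutex);
}

/* Each worker allocates and frees in a loop, counting successes
   in the long that arg points to. */
static void *worker(void *arg)
{
    long *ok = arg;
    for (int i = 0; i < NALLOCS; i++) {
        void *p = locked_malloc(64);
        if (p != NULL) {
            (*ok)++;
            locked_free(p);
        }
    }
    return NULL;
}

/* Run the workers; return the number of successful allocations,
   or -1 if a thread could not be created. */
int run_locked_workers(void)
{
    pthread_t tid[NTHREADS];
    long ok[NTHREADS] = { 0 };
    long total = 0;
    int started = 0;

    while (started < NTHREADS) {
        if (pthread_create(&tid[started], NULL, worker, &ok[started]) != 0)
            break;
        started++;
    }
    /* Join whatever was started before looking at the counters. */
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += ok[i];
    }
    return started == NTHREADS ? (int)total : -1;
}
```

An allocator with per-thread or per-arena state can drop the outer
mutex entirely, which is where the multi-threaded gain would come from.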

The only danger here is this: If you manage to get dlmalloc replaced
reliably, you *will* get a pink plush hippo!


Corinna Vinschen
Cygwin Maintainer
