DLL loading

Charles Wilson cygwin@cwilson.fastmail.fm
Tue May 24 15:47:00 GMT 2011

Corinna Vinschen wrote:
> Yes, we can't use a native fork mechanism, so we have to find another
> way.  Three ideas come to mind:
> - A more intelligent algorithm in rebase/rebaseall to place the various
>   Cygwin distro DLLs so that they don't collide, perhaps together with a
>   postinstall script which rebases automatically.  This is a short-term
>   way to deal with the problem.
> - Figure out if and how we can hook the Windows loader so that rebasing
>   a DLL on the fly at load time can be influenced in terms of the start
>   address.
> - Stop linking against Cygwin DLLs other than the Cygwin DLL itself.
>   Instead, provide our own loader.
> Does anybody feel an affinity to have a look into one of the above?
> Or, does anybody have another idea how to ease the pain?

I've always kinda thought that cygwin2.dll should use something like
for cygwin DLLs.  Of course, the entire toolchain would still need to
support "real" DLLs/import libs in order to access the w32api DLLs,
unless we also used edll/flexdll-ish "implibs" for those, too. (No way
around cygwin2.dll itself linking directly to the w32api DLLs tho,
unless we used its existing autoload framework for almost all of its own
internal needs, but...slow!).

Note that I said "cygwin2.dll".  I don't think #3 is suitable for
cygwin1.dll, because while it could probably be backwards compatible
with existing progs...ugly.

So, IMO the only realistic options for cygwin1.dll are #1 and #2 -- if
#2 is even possible.

The problem I see is that none of these ideas, except MAYBE #3 IF we
also force cygwin to use autoloadish behavior internally almost
exclusively, will deal with the problem of the w32 DLLs ending up in the
correct place(s).

I think Ryan's conspiracy theory might not be far off the mark.

Now, for constructive ideas -- here's my brainstorm.  Maybe it's
workable, maybe not, I dunno:

Most of the problems appear to occur with apps that have a lot of
DLL-based plugin/extension libraries -- perl, python, octave.

What if we were to create a post-processing tool that package
maintainers could use, that could:
#1) merge a "new" DLL into a single common DLL
#2) modify (or recreate) the "new" DLL's implib to point at the common DLL.

Then, the build process for, say, perl, could be modified to:
1) build the extension DLL and its implib as normal
2) merge the new DLL into the "aggregate" one
3) rewrite/recreate the extension's implib to point to the aggregate
...rinse and repeat as needed.

That way, even if perl THINKS it needs 52 different DLLs...it only gets
loaded once, and each repeated load (of the same aggregate) simply
returns a new handle to the original DLL.  Since it only gets loaded
once, and it just occupies a single -- large -- block of memory instead
of 52 different smaller ones, that's 51 fewer opportunities for an
address clash.

If possible, I think this would go a long way towards mitigating the

There are two issues, tho:

One issue I see is with dlopen/dlclose.  If "perl" dlopen's "Cwd.dll" --
well, there is no Cwd.dll, because it's been merged into
AggregatePerlExtension.dll.  So, somehow cygwin's dlopen impl would need
to know about this -- maybe we'd have a FILE named Cwd.dll but it would
actually just be a data file telling cygwin's dlopen implementation the
name of the real DLL (or 'Cwd.dll.redir' and dlopen would know about the
naming convention?).  Then dlopen would have to keep track of how many
such redirected references to AggregatePerlExtension.dll there are, and
only allow APE.dll to be REALLY unloaded if ALL of the various aliases
have been dlclosed.
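
A rough sketch of that alias bookkeeping, in C.  Everything here is
hypothetical: 'struct aggregate', agg_open() and agg_close() are names
made up for illustration, not anything in cygwin1.dll, and the 'loaded'
flag just stands in for wherever the real load/unload would happen.  It
only shows the counting: the aggregate stays loaded until every alias
that was opened has been closed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALIASES 64   /* arbitrary limit for this sketch */

/* Hypothetical per-aggregate state; not a real Cygwin structure. */
struct aggregate {
    const char *real_name;             /* e.g. "AggregatePerlExtension.dll" */
    const char *aliases[MAX_ALIASES];  /* redirected names, e.g. "Cwd.dll";
                                          caller keeps the strings alive */
    int refs[MAX_ALIASES];             /* open count per alias */
    size_t n_aliases;
    int total_refs;                    /* sum of refs[] */
    int loaded;                        /* stand-in for the real load state */
};

/* Return the index of alias in a->aliases, or -1 if it is unknown. */
static int find_alias(const struct aggregate *a, const char *alias)
{
    for (size_t i = 0; i < a->n_aliases; i++)
        if (strcmp(a->aliases[i], alias) == 0)
            return (int)i;
    return -1;
}

/* Record one dlopen of alias.  Returns 0, or -1 if the table is full. */
int agg_open(struct aggregate *a, const char *alias)
{
    int i = find_alias(a, alias);
    if (i < 0) {
        if (a->n_aliases == MAX_ALIASES)
            return -1;
        i = (int)a->n_aliases++;
        a->aliases[i] = alias;
        a->refs[i] = 0;
    }
    if (a->total_refs == 0)
        a->loaded = 1;   /* real code would load the aggregate here */
    a->refs[i]++;
    a->total_refs++;
    return 0;
}

/* Record one dlclose of alias.  Returns -1 if alias is not open. */
int agg_close(struct aggregate *a, const char *alias)
{
    int i = find_alias(a, alias);
    if (i < 0 || a->refs[i] == 0)
        return -1;
    a->refs[i]--;
    a->total_refs--;
    assert(a->total_refs >= 0);
    if (a->total_refs == 0)
        a->loaded = 0;   /* last alias gone: real code would unload here */
    return 0;
}
```

Counting per alias (rather than one global count) means a stray extra
dlclose of "Cwd.dll" can't unload the aggregate out from under some
other extension that is still open.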

The second issue is that often, these 'extension' dlls have exactly the
same interface: they define the same functions: 'ExtensionInit()',
'ExtensionDestroy', 'ExtensionExecute(void * opaque_data)', etc.  So, how
do you aggregate them into a single DLL?  You'd have to munge all the
names, and then the 'rewritten' implibs for the 'individual' DLLS would
be a bit different (basically redirects?).  Also, dlsym() would have to
know about the symbol name munging, and rewrite that too... But then,
what if internally in DLL_A, FnB calls FnC?  If you rename FnB to
DLL_A_FnB, and FnC to DLL_A_FnC, did you break FnB's call to FnC?
Normally no, because those are resolved directly when the DLL is linked.
 However, Bruno's approach here:
would have trouble with symbol renaming, I think???, unless the
DLL-merging process can somehow "relink" the original DLL's internal


