About the dll search algorithm of dlopen

Corinna Vinschen corinna-cygwin@cygwin.com
Wed Jun 1 20:18:00 GMT 2016


On Jun  1 16:26, Michael Haubenwallner wrote:
> On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
> > On Jun  1 08:40, Michael Haubenwallner wrote:
> >> Hi,
> >>
> >> two issues with dlopen here (I'm about to prepare patches):
> >>
> >> *) The algorithm to combine dll file name variants with the search path
> >>    entries needs to be reordered, as in:
> >>    - for each dll file name variant:
> >>    -   for each search path:
> >>    + for each search path entry:
> >>    +   for each dll file name variant:
> >>          check if useable
> > 
> > Rationale?  We only need to find one version of the file, and there
> > usually is only one.  This is mainly for module-based systems like
> > perl, python, etc.
> 
> While I haven't actually hit a problem here yet, the rationale is that
> I need to install a large application that ships its own (portable)
> package build & management system, including lots of packages that are
> probably installed in the host (Cygwin) system as well, most likely in
> (slightly) different versions.
> 
> When this package management system uses a different dll naming scheme
> than the host system, such as sticking to "liblib.dll" rather than
> "cyglib.dll", and the application wants to dlopen its own "liblib.dll",
> the host's "cyglib.dll" is currently loaded instead.

Sounds a teeny bit artificial, but basically that's possible, yes.
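
To make the difference between the two loop orders concrete, here is a
minimal sketch in Python (not the Cygwin implementation).  The function
names, the variant order ("cyglib.dll" tried before "liblib.dll") and
the directory layout are assumptions chosen to reproduce the scenario
described above.

```python
import os
import tempfile


def find_variant_first(variants, search_path):
    """Current order: try each name variant across the whole search path."""
    for name in variants:
        for directory in search_path:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None


def find_path_first(variants, search_path):
    """Proposed order: try every name variant in one directory first."""
    for directory in search_path:
        for name in variants:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: the application ships liblib.dll,
    # the host system ships cyglib.dll.
    app_dir = os.path.join(root, "application", "bin")
    host_dir = os.path.join(root, "usr", "bin")
    os.makedirs(app_dir)
    os.makedirs(host_dir)
    open(os.path.join(app_dir, "liblib.dll"), "w").close()
    open(os.path.join(host_dir, "cyglib.dll"), "w").close()

    variants = ["cyglib.dll", "liblib.dll"]  # assumed variant order
    search_path = [app_dir, host_dir]        # application directory first

    variant_first_hit = find_variant_first(variants, search_path)
    path_first_hit = find_path_first(variants, search_path)
```

With this layout the variant-first loop returns the host's cyglib.dll
even though the application directory comes first in the search path,
while the path-first loop returns the application's own liblib.dll.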

> > However, if you can speed up the search process, ignore the
> > question...
> 
> This also depends on whether find_exec really is necessary here.

Not as such necessary, it's just the function used to search in a
search path.  If you want to change that you have to rewrite the
same logic again, just reversed.

One way around yet another code duplication could be some kind of path
iterator class which could be used from find_exec as well as from
get_full_path_of_dll.
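
As a rough illustration of the path iterator idea, sketched in Python
rather than in Cygwin's C++: the class name DllPathIterator and the
helper first_match are hypothetical, and the "usable" check is reduced
to a membership test on a set of existing files.

```python
class DllPathIterator:
    """Hypothetical iterator yielding candidate paths, directory by directory."""

    def __init__(self, search_path, variants):
        # Accept a colon-separated string (as in LD_LIBRARY_PATH) or a list.
        if isinstance(search_path, str):
            search_path = [d for d in search_path.split(":") if d]
        self.directories = list(search_path)
        self.variants = list(variants)

    def __iter__(self):
        # Outer loop over directories, inner loop over file name variants.
        for directory in self.directories:
            for name in self.variants:
                yield directory.rstrip("/") + "/" + name


def first_match(candidates, is_usable):
    """Return the first candidate accepted by is_usable, or None."""
    for candidate in candidates:
        if is_usable(candidate):
            return candidate
    return None


# Stand-in for the file system: only the host's cyglib.dll exists.
existing = {"/usr/bin/cyglib.dll"}
paths = DllPathIterator("/application/bin:/usr/bin",
                        ["liblib.dll", "cyglib.dll"])
candidates = list(paths)
found = first_match(paths, lambda p: p in existing)
```

Both callers would walk the same iterator and differ only in the
predicate they pass, which is what keeps the search order defined in
one place.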

> >> *) The directory of the current main executable should be searched
> >>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
> >>    And PATH should be searched before /usr/bin:/usr/lib as well.
> > 
> > Checking the executable path and $PATH are Windows concepts.  dlopen
> > doesn't do that on POSIX systems and we're not doing that either.
> 
> Agreed, but POSIX also does have the concept of embedded RUNPATH,
> which is completely missing in Cygwin as far as I can see.

RPATH and RUNPATH are ELF dynamic loader features, not supported by
PE/COFF.

> However, there is one path name that can easily serve as minimal
> "embedded RUNPATH" - the executable's directory.
> 
> This is where I do have a problem right now:
> 
> My own /application/bin/python2.7.exe is linked to libpython2.7.dll,
> located in /application/bin. Now there is some python script that,
> strangely enough, has some cygwin-conditional code that reads:
> 
>   import _ctypes
>   _ctypes.dlopen("libpython%d.%d.dll" % sys.version_info[:2])
> 
> While this is questionable by itself, it really shouldn't load a
> different libpython2.7.dll than the one /application/bin/python2.7.exe
> has already loaded, just because dlopen uses a different search
> algorithm than CreateProcess().
> 
> However, when dlopen is about to search some path list, I can imagine
> searching the list of already loaded dlls first as well, but I'd prefer
> to leave this up to LoadLibrary...

This problem would only occur if dlopen is not called with a path.  If
the given pathname is a plain filename, we could simply add a call to
GetModuleHandle and if it succeeds, return the handle (after checking
for RTLD_NODELETE).
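
The control flow of that shortcut could look roughly like the following
Python model.  It is only a sketch: the dict loaded_modules stands in
for the lookup GetModuleHandle would do, dlopen_sketch and
is_plain_filename are hypothetical names, and the RTLD_NODELETE check
is left out.

```python
def is_plain_filename(name):
    """True if name has no directory component (POSIX or Windows separators)."""
    return "/" not in name and "\\" not in name


def dlopen_sketch(name, loaded_modules, search):
    """Return a handle for name, preferring an already loaded module.

    loaded_modules: dict mapping lower-cased module file names to handles;
                    a stand-in for what GetModuleHandle would consult.
    search:         callable performing the regular path search and load.
    """
    if is_plain_filename(name):
        # Module name lookup on Windows is case-insensitive.
        handle = loaded_modules.get(name.lower())
        if handle is not None:
            return handle
    # Names with a path, or names not loaded yet, take the normal route.
    return search(name)


loaded = {"libpython2.7.dll": "handle:/application/bin/libpython2.7.dll"}


def search(name):
    return "searched:" + name


reused = dlopen_sketch("libpython2.7.dll", loaded, search)
with_path = dlopen_sketch("/usr/lib/libpython2.7.dll", loaded, search)
not_loaded = dlopen_sketch("libfoo.dll", loaded, search)
```

Only the plain-filename case is short-circuited; a request that names a
path still goes through the regular search.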

> > Having said that, LoadLibrary will search the usual paths.  After 2.5.2,
> > we're leaving XP/2003 behind, and then we probably should tighten the
> > search algorithm along the lines of
> > 
> >   AddDllDirectory ("/usr/bin");
> >   AddDllDirectory ("/usr/lib");
> >   [...]
> >   LoadLibraryEx (path, NULL, LOAD_LIBRARY_SEARCH_USER_DIRS
> > 			     | LOAD_LIBRARY_SEARCH_SYSTEM32);
> 
> /me fails to see how this does help with the missing embedded RUNPATH.

It doesn't.  It just tightens the search path to not load from the cwd
or the application path.  If you want that, add it to LD_LIBRARY_PATH
explicitly.

> >>    For consistency, IMO, when any searched path ends in either
> >>    x/bin or x/lib, we should search x/bin:x/lib.
> > 
> > This might make sense, at least in the direction lib->bin.
> 
> Fine with me too.
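
A minimal Python sketch of the lib->bin direction, under the assumption
(taken from the "x/bin:x/lib" wording above) that the sibling x/bin is
searched before x/lib.  The function name expand_lib_to_bin is
hypothetical.

```python
def expand_lib_to_bin(search_path):
    """For every entry ending in x/lib, also search the sibling x/bin.

    The sibling bin directory is placed before the lib entry, and
    directories already present in the result are not repeated.
    """
    expanded = []
    for entry in search_path:
        trimmed = entry.rstrip("/")
        candidates = [entry]
        if trimmed.endswith("/lib"):
            sibling_bin = trimmed[:-len("lib")] + "bin"
            candidates = [sibling_bin, entry]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


result = expand_lib_to_bin(["/opt/app/lib", "/usr/bin", "/usr/lib"])
```

Here result is ["/opt/app/bin", "/opt/app/lib", "/usr/bin", "/usr/lib"]:
/opt/app/lib gains its sibling bin directory, and /usr/bin is not added
a second time for /usr/lib.
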
> 
> Side note:
> We also use some cl.exe/link.exe wrapper that supports LD_PRELOAD,
> LD_LIBRARY_PATH, embedded RUNPATH, as well as lazy loading for both
> LoadLibrary and CreateProcess: https://github.com/haubi/parity
> Basically I'm wondering why Cygwin doesn't provide that (yet?)...

We discussed implementing our own dynamic loader once, but gave up due
to workload.  Parity is LGPLed and thus can't be included into Cygwin
itself.

DT_RUNPATH should be possible with not too much effort, but would
require gcc/binutils support.  Some trickery with a symbol in the linker
script, setting a pointer to it in _cygwin_crt0_common, and a matching
call to AddDllDirectory comes to mind...

LD_PRELOAD is (kind of) implemented, but I think it doesn't work as
intended.  In a PE/COFF file, imported symbols are bound to the name of
the DLL they come from.  Implementing the full set of ELF dynloader
features would require some major effort, like what parity does.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat