[Patch] skipping import libraries for performance reasons - direct auto-import of dll's
Charles Wilson
cwilson@ece.gatech.edu
Thu Nov 28 13:24:00 GMT 2002
Okay, I've built and tested with this patch, and looked at the code a
little more closely than previously. I've attached a modified version
of Ralf's patch [but don't commit my version; wait for Ralf to regen].
The modified patch fixes a number of compiler warnings (use "%lx" not
"%x" for long int variables; avoid use of uninitialized variables, etc).
I've also fixed up the formatting, and added a few changes to correct
a misfeature that I discovered.
---------------------------------------------------------------
I've initialized the [data|bss]_[start|end] variables as:
/* Initialization with start > end guarantees that is_data will not be
set by mistake, and avoids compiler warning */
unsigned long data_start = 1;
unsigned long data_end = 0;
unsigned long bss_start = 1;
unsigned long bss_end = 0;
so that the statement below doesn't cause a compiler warning...or a run
time error. It is possible (I think) for a DLL to not have a .bss
section at all, in which case bss_[start|end] never get initialized
without the change above.
is_data = (func_rva >= data_start && func_rva < data_end )
|| (func_rva >= bss_start && func_rva < bss_end);
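
The effect of the start > end initialization can be checked in isolation. What follows is a minimal sketch, not code from the patch: `struct section_range`, `EMPTY_RANGE`, `in_range` and `is_data_rva` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper type: a half-open RVA range [start, end).  */
struct section_range
{
  unsigned long start;
  unsigned long end;
};

/* An "empty" range: start > end, so no RVA can satisfy
   start <= rva < end.  This is the same trick as initializing
   bss_start = 1 and bss_end = 0.  */
#define EMPTY_RANGE { 1, 0 }

static bool
in_range (const struct section_range *r, unsigned long rva)
{
  return rva >= r->start && rva < r->end;
}

/* Mirrors the is_data test quoted above.  */
static bool
is_data_rva (const struct section_range *data,
             const struct section_range *bss,
             unsigned long rva)
{
  return in_range (data, rva) || in_range (bss, rva);
}
```

With a DLL that has no .bss section, the bss range simply stays at EMPTY_RANGE and can never match, whatever RVA is tested.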
---------------------------------------------------------------
At one point, Ralf uses the following
/* skip unwanted symbols, which are exported in buggy auto-import
releases */
if (strstr(erva + name_rva,"_nm_") == 0)
What's the real purpose of this? It disallows my_symbol_nm_foo, as well
as _nm_foo or _imp_nm_foo or whatever it is that you're trying to screen
out. Would it be better to use something like this, instead:
if (strncmp(erva+name_rva,"_nm_",4) != 0)
which would screen out only those symbols that *begin* with _nm_? My
modified patch does NOT make this change, but I wonder if it should.
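
To make the difference between the two tests concrete, here is a small sketch. The function names `contains_nm` and `starts_with_nm` are made up for this example and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Substring test, as in Ralf's patch: true when "_nm_" occurs
   anywhere in the name.  */
static bool
contains_nm (const char *name)
{
  assert (name != NULL);
  return strstr (name, "_nm_") != NULL;
}

/* Prefix test, as suggested above: true only when the name
   begins with "_nm_".  */
static bool
starts_with_nm (const char *name)
{
  assert (name != NULL);
  return strncmp (name, "_nm_", 4) == 0;
}
```

For "my_symbol_nm_foo" the substring test matches (so the symbol would be skipped) while the prefix test does not; for "_nm_foo" both match.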
---------------------------------------------------------------
For the most part, it works as advertised. I did run into one problem
though. If I create a file structure like this:
/usr/local/bin/cygfoo.dll
/usr/local/lib/libfoo.dll.a -> /usr/local/bin/cygfoo.dll
That seems like a logical thing to do, given that we're using the DLL
to "substitute" for a true import lib. This way, you can do
gcc -o bar.exe bar.o -L/usr/local/lib -lfoo
and ld will use the symlink libfoo.dll.a to satisfy the dependency.
Unfortunately, this doesn't work, because ld doesn't realize that
"libfoo.dll.a" is actually a (symlink to) a DLL, and the
pe_implied_import_dll routine is never called.
I know there are OTHER ways to set up the filesystem so that the gcc
command above will work, such as:
/usr/local/bin/cygfoo.dll
/usr/local/lib/libfoo.dll -> /usr/local/bin/cygfoo.dll
or even
/usr/local/bin/cygfoo.dll
/usr/local/lib/cygfoo.dll -> /usr/local/bin/cygfoo.dll
But my point is that the original filesystem setup *should* work but
does not. The problem is in emultempl/pe.em (line 1395):
  if (bfd_get_format (entry->the_bfd) == bfd_object)
    {
      const char *ext = entry->filename + strlen (entry->filename) - 4;
      if (strcmp (ext, ".dll") == 0 || strcmp (ext, ".DLL") == 0)
        return pe_implied_import_dll (entry->filename);
    }
#endif
  return false;
}
As you can see, pe_implied_import_dll is only called if the filename
ends in .dll or .DLL. We know that the DLL itself must have a name that
ends in .dll(.DLL), but the linker ought to be able to recognize a
symlink-to-a-dll as well(*). The stuff above should be replaced by
something like the following:
  if (bfd_get_format (entry->the_bfd) == bfd_object)
    {
      char fbuf[PATH_MAX];
      const char *ext;
      if (realpath (entry->filename, fbuf) == NULL)
        {
          /* Fall back to the unresolved name; strncpy does not
             NUL-terminate on truncation, so do it by hand.  */
          strncpy (fbuf, entry->filename, PATH_MAX - 1);
          fbuf[PATH_MAX - 1] = '\0';
        }
      ext = fbuf + strlen (fbuf) - 4;
      if (strcmp (ext, ".dll") == 0 || strcmp (ext, ".DLL") == 0)
        return pe_implied_import_dll (entry->filename);
    }
#endif
  return false;
}
Only problem: there's no guarantee that realpath or PATH_MAX is
available, so we need to jump thru some hoops to define LD_PATHMAX to
PATH_MAX or MAXPATHLEN or whatever, depending on what headers are
available...
So, we have to play games in ld/sysdep.h, and modify configure.in (and
run autoconf and autoheader) ...but once that's done, the
/usr/local/lib/libfoo.dll.a -> /usr/local/bin/cygfoo.dll scenario works.
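
The sysdep.h games might look roughly like the sketch below. This is an illustration of the approach, not the contents of the attached patch: HAVE_SYS_PARAM_H and HAVE_REALPATH are assumed to be macros produced by configure checks, the 1024 fallback is an arbitrary assumption, and the function-like form of REALPATH is a guess at how "NULL otherwise" would be used.

```c
#include <limits.h>
#include <stdlib.h>
#ifdef HAVE_SYS_PARAM_H   /* assumed configure-generated macro */
# include <sys/param.h>
#endif

/* Pick a path-buffer size from whatever the system headers offer.  */
#if defined (PATH_MAX)
# define LD_PATHMAX PATH_MAX
#elif defined (MAXPATHLEN)
# define LD_PATHMAX MAXPATHLEN
#else
# define LD_PATHMAX 1024  /* assumed fallback, not from the patch */
#endif

/* HAVE_REALPATH is assumed to come from a configure check.  Without
   it, REALPATH always "fails", so callers take their fallback path
   and use the unresolved filename.  */
#ifdef HAVE_REALPATH
# define REALPATH(path, resolved) realpath ((path), (resolved))
#else
# define REALPATH(path, resolved) NULL
#endif
```

A caller would then declare `char fbuf[LD_PATHMAX];` and test `REALPATH (entry->filename, fbuf) == NULL` without caring which branch was compiled in.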
- - - - - - - - - - - - - - - - - - - -
(*) symlink-to-a-dll would be INVALID without this change (already in
Ralf's patch):
+ /* use internal dll name instead of filename
+ to enable symbolic dll linking */
+ dll_name = pe_as32 (expdata + 12) + erva ;
Without it, the symlink's name would get embedded into the target as a
dependency -- and the Windows Runtime Loader would get really confused
since it doesn't understand symlinks, and only loads files that DO end
in .dll. So that's why this "problem" never came up before; it's only
worth consideration given Ralf's change...but Ralf's change should be
accompanied by the configure.in/config.in changes.
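
For readers unfamiliar with the export table layout: offset 12 of the PE export directory is the Name RVA field, which points at the DLL's internal name. In the quoted line, erva appears to be an offset that turns an RVA into a pointer into ld's in-memory copy of the export data. The sketch below makes that conversion explicit with bounds checks. `read_le32` and `export_dll_name` are hypothetical names, not functions in ld.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value, as PE headers store them.  */
static uint32_t
read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* EXPDATA points at a copy of the export data, which the image maps
   at RVA EXPORT_RVA and which is EXPORT_SIZE bytes long.  Returns the
   DLL's internal name, or NULL when the Name RVA falls outside the
   copied data.  The caller is assumed to have checked that the name
   is NUL-terminated within the buffer.  */
static const char *
export_dll_name (const unsigned char *expdata, uint32_t export_rva,
                 size_t export_size)
{
  uint32_t name_rva;

  assert (expdata != NULL);
  if (export_size < 16)
    return NULL;
  name_rva = read_le32 (expdata + 12);  /* Name RVA field, offset 12 */
  if (name_rva < export_rva || name_rva - export_rva >= export_size)
    return NULL;
  return (const char *) expdata + (name_rva - export_rva);
}
```

Because the name comes from inside the DLL, it is the same whether the file was opened as cygfoo.dll or through a symlink such as libfoo.dll.a.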
- - - - - - - - - - - - - - - - - - - -
---------------------------------------------------------------
I've split the patch into two pieces:
ld-auto-import-dll.patch-csw
the main changes
ld-auto-import-dll.patch-csw2.gz
the configure and config.in changes created by running
autoconf and autoheader.
Any comments on the revised patch? Is there a better way to handle the
realpath()/REALPATH() thing?
2002-11-28 Ralf Habacker <Ralf.Habacker@freenet.de>
Charles Wilson <cwilson@ece.gatech.edu>
* ld/config.in: regenerate
* ld/configure: regenerate
* ld/configure.in: add check for realpath function
* ld/deffile.h: add .data field to def_file_import
structure
* ld/pe-dll.c (pe_process_import_defs): use .data
field of def_file_import structure to initialize
flag_data field of def_file_export structure
(pe_implied_import_dll): new variables exp_funcbase
and [data|bss]_[start|end]. Use DLL's internal name
to set dll_name, not filename (which may be a symlink).
Scan the sections and initialize [data|bss]_[start|end].
When scanning the export table, skip _nm_ symbols, and
mark any symbols whose rva indicates that it is in the
.bss or .data sections as data.
* ld/sysdep.h: include limits.h and sys/param.h, and
define LD_PATHMAX as appropriate. Also define REALPATH
as realpath if it exists, NULL otherwise
* ld/emultempl/pe.em (gld_${EMULATION_NAME}_after_open):
call pe_process_import_defs before pe_find_data_imports,
so that auto-import will check the virtual implib as well
as "real" implibs.
(gld_${EMULATION_NAME}_recognized_file): use REALPATH to
follow symlinks to their target; check that the target's
extension is .dll before calling pe_implied_import_dll(),
not the filename itself (which may be a symlink).
--Chuck
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ld-auto-import-dll.patch-csw
URL: <https://sourceware.org/pipermail/binutils/attachments/20021128/84ebb047/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ld-auto-import-dll.patch-csw2.gz
Type: application/x-gzip
Size: 4622 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20021128/84ebb047/attachment.bin>