This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



ld-auto-import memory bug fixing


Hi,
currently I'm trying to fix the bug described in
http://sources.redhat.com/ml/binutils/2001-06/msg00742.html
With that I have a question:

When linking a DLL with g++, it calls collect2, which in turn calls ld, but ld
does not show up in the ps list.

Can anyone tell me how I can nevertheless debug ld with gdb?

Regards
Ralf

PS: For those who don't know how the auto-import feature works, see below.

/************************************************************************

 Auto-import feature by Paul Sokolovsky

 Quick facts:

 1. With this feature on, DLL clients can import variables from a DLL
 without any effort on their side (for example, without any source
 code modifications).

 2. This is done completely within the bounds of the PE specification (to be
 fair, there's one place where it pokes its nose out, but in practice it
 works). So the resulting module can be used with any other PE
 compiler/linker.

 3. Auto-import is fully compatible with the standard import method, and the
 two can be mixed together.

 4. Overheads: space: 8 bytes per imported symbol, plus 20 for each
 reference to it; load time: negligible; virtual/physical memory: should be
 less than the effect of DLL relocation, and I sincerely hope it doesn't
 affect DLL sharability (too much).
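The space figures in item 4 can be turned into a quick estimate. The following is only a sketch of that arithmetic; the function and constant names are made up for illustration and do not come from ld.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the quick facts above. */
enum {
    PER_SYMBOL_BYTES    = 8,  /* per imported symbol */
    PER_REFERENCE_BYTES = 20  /* per reference to an imported symbol */
};

/* Hypothetical helper: estimated space overhead of auto-import for a
   client with the given number of imported symbols and references. */
size_t auto_import_space_overhead(size_t imported_symbols, size_t references)
{
    return imported_symbols * PER_SYMBOL_BYTES
         + references * PER_REFERENCE_BYTES;
}
```

For example, a client importing 3 variables and referencing them 10 times in total would pay 3 * 8 + 10 * 20 bytes.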

 Idea

 The obvious and only way to get rid of the dllimport insanity is to make the
 client access the variable directly in the DLL, bypassing the extra
 dereference. I.e., whenever the client contains something like

 mov dll_var,%eax,

 the address of dll_var in the instruction should be relocated to point into
 the loaded DLL. The aim is to make the OS loader do so, and then make ld help
 with that. The import section of a PE file is laid out as follows: there is a
 vector of structures, each describing the imports from a particular DLL. Each
 such structure points to two other parallel vectors: one holding the imported
 names, and one which will hold the address of the corresponding imported
 name. So the solution is to de-vectorize these structures, making the import
 locations sparse and pointing directly into the code. Before continuing, it
 is worth noting that, while the author strives to make PE act ELF-like, some
 other people make ELF act PE-like: elfvector, ;-) .
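To make the "two parallel vectors" picture concrete, here is a minimal C sketch of the standard import scheme. It is a simplified model, not the on-disk PE layout: struct import_table, bind_imports, demo_lookup and demo.dll are hypothetical names, and the "DLL variable" is just a static int in the same program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one per-DLL import structure (hypothetical,
   not the real PE layout): two parallel vectors of equal length. */
struct import_table {
    const char *dll_name;
    const char *names[4];     /* imported names, NULL-terminated */
    void       *addresses[4]; /* filled in by the loader, NULL-terminated */
};

typedef void *(*lookup_fn)(const char *dll, const char *name);

/* What the loader conceptually does: walk the name vector and store each
   resolved address in the matching slot of the address vector. */
void bind_imports(struct import_table *t, lookup_fn lookup)
{
    size_t i;
    for (i = 0; t->names[i] != NULL; i++)
        t->addresses[i] = lookup(t->dll_name, t->names[i]);
    t->addresses[i] = NULL;
}

/* Stand-in for a variable living inside the DLL. */
static int dll_var = 42;

static void *demo_lookup(const char *dll, const char *name)
{
    (void)dll; /* the demo only knows one DLL */
    return strcmp(name, "dll_var") == 0 ? (void *)&dll_var : NULL;
}

/* Standard import: the client reads through the address slot. This is
   the extra dereference that auto-import aims to remove. */
int read_dll_var_via_import_slot(void)
{
    struct import_table t = { "demo.dll", { "dll_var", NULL }, { NULL } };

    bind_imports(&t, demo_lookup);
    assert(t.addresses[0] != NULL);
    return *(int *)t.addresses[0];
}
```

Calling read_dll_var_via_import_slot() yields the value of dll_var, but only after going through addresses[0]. De-vectorizing means the slot the loader writes to is the operand of the instruction itself, so no such indirection remains.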

 Implementation

 For each reference to a data symbol that is to be imported from a DLL (a
 symbol <sym> belongs to this set if __imp_<sym> is found in the implib), an
 import fixup entry is generated. That entry is of type
 IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3 subsection. Each
 fixup entry contains a pointer to the location of the symbol's address within
 the .text section (marked with a __fuN_<sym> symbol, where N is an integer),
 a pointer to the DLL name (so the DLL name is referenced by multiple
 entries), and a pointer to the symbol name thunk. The symbol name thunk is a
 singleton vector (__nm_th_<symbol>) pointing to an IMAGE_IMPORT_BY_NAME
 structure (__nm_<symbol>) that directly contains the imported name.

 Here comes the "on the edge" problem mentioned above: the PE specification
 states that the name vector (OriginalFirstThunk) should run in parallel with
 the address vector (FirstThunk), i.e. that they should have the same number
 of elements and be terminated with zero. We violate this, since FirstThunk
 points directly into machine code. But in practice, the OS loader is
 implemented the sane way: it goes through OriginalFirstThunk and puts
 addresses into FirstThunk, and nothing else.

 It should be noted once again that the DLL and symbol name structures are
 reused across fixup entries and would have to be there anyway to support the
 standard import mechanism, so the sustained overhead is 20 bytes per
 reference.

 Another question is whether having several IMAGE_IMPORT_DESCRIPTORs for the
 same DLL is possible. The answer is yes; it is done even by the native
 compiler/linker (libth32's functions in fact reside in the Windows 9x
 kernel32.dll, so if you use it, you have two IMAGE_IMPORT_DESCRIPTORs for
 kernel32.dll). Yet another question is whether referencing the same PE
 structures several times is valid. The answer is: why not? Prohibiting that
 (detecting the violation) would require more work on the loader's part than
 not doing it.
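The fixup mechanism described above can be sketched as a small simulation in C. This is an illustration under stated assumptions, not ld's or the Windows loader's actual code: the "image" is a 64-byte buffer whose offsets play the role of RVAs, apply_fixup and demo_resolve are hypothetical, and demo.dll and the address 0x10002000 are invented. The struct mirrors the five 32-bit fields of IMAGE_IMPORT_DESCRIPTOR, which is where the 20 bytes per reference come from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field order as the PE IMAGE_IMPORT_DESCRIPTOR; every field is an
   RVA (here: an offset into the simulated image buffer). */
struct import_descriptor {
    uint32_t original_first_thunk; /* name thunk vector (__nm_th_<sym>) */
    uint32_t time_date_stamp;
    uint32_t forwarder_chain;
    uint32_t name;                 /* DLL name string */
    uint32_t first_thunk;          /* where addresses go (__fuN_<sym>) */
};

typedef uint32_t (*resolve_rva_fn)(const char *dll, const char *symbol);

static uint32_t read_u32(const uint8_t *image, uint32_t rva)
{
    uint32_t v;
    memcpy(&v, image + rva, sizeof v);
    return v;
}

static void write_u32(uint8_t *image, uint32_t rva, uint32_t v)
{
    memcpy(image + rva, &v, sizeof v);
}

/* Simulated loader step: walk OriginalFirstThunk until the zero
   terminator and store each resolved address at FirstThunk. */
void apply_fixup(uint8_t *image, const struct import_descriptor *d,
                 resolve_rva_fn resolve)
{
    const char *dll = (const char *)(image + d->name);
    uint32_t name_slot = d->original_first_thunk;
    uint32_t addr_slot = d->first_thunk;
    uint32_t by_name;

    while ((by_name = read_u32(image, name_slot)) != 0) {
        /* IMAGE_IMPORT_BY_NAME: 16-bit hint, then the name string. */
        const char *symbol = (const char *)(image + by_name + 2);
        write_u32(image, addr_slot, resolve(dll, symbol));
        name_slot += 4;
        addr_slot += 4;
    }
}

/* Hypothetical resolver standing in for the DLL's export lookup. */
static uint32_t demo_resolve(const char *dll, const char *symbol)
{
    if (strcmp(dll, "demo.dll") == 0 && strcmp(symbol, "dll_var") == 0)
        return 0x10002000u; /* invented load address of dll_var */
    return 0;
}

/* Builds a tiny image with one auto-import fixup and returns the patched
   instruction operand. */
uint32_t demo_auto_import(void)
{
    uint8_t image[64] = { 0 };
    struct import_descriptor fixup = { 0 };

    assert(sizeof(struct import_descriptor) == 20);

    image[0x00] = 0xA1;                  /* mov <abs32>,%eax; operand at 0x01 */
    write_u32(image, 0x10, 0x20);        /* __nm_th_dll_var[0] -> __nm_dll_var */
                                         /* 0x14..0x17 stay zero: terminator */
    memcpy(image + 0x22, "dll_var", 8);  /* name follows the 2-byte hint */
    memcpy(image + 0x30, "demo.dll", 9);

    fixup.original_first_thunk = 0x10;
    fixup.name = 0x30;
    fixup.first_thunk = 0x01;            /* __fu0_dll_var: inside the "code" */

    apply_fixup(image, &fixup, demo_resolve);
    return read_u32(image, 0x01);
}
```

Because the name thunk is a singleton vector followed by a zero terminator, the loop writes exactly one address and never touches the bytes after the operand. That is why pointing FirstThunk into machine code works with a loader that behaves as described, even though the two vectors no longer run in parallel.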

