Serious bug in auto-import
Ralf Habacker
Ralf.Habacker@freenet.de
Sun Sep 9 11:30:00 GMT 2001
> Paul Sokolovsky wrote:
>
> > CW> Please take a look at the thread "[aida_s@mx12.freecom.ne.jp: A serious
> > CW> bug of "ld --enable-auto-import"]" starting here:
> >
> > CW> http://sources.redhat.com/ml/binutils/2001-08/msg00595.html
> >
> > CW> This is a confirmed bug. I've tried to analyze it but I can't really
> > CW> come up with a fix; I wonder if a fix is possible at all. If this
> > CW> problem can't be fixed, I'm going to have to recommend removing the
> > CW> auto-import stuff. (Yes, it's that serious.)
> >
> > CW> PLEASE take a look and at least let me know you've seen it; it was
> > CW> originally reported early Friday morning to the cygwin-apps list, and
> > CW> moved to the binutils list on Saturday.
> >
> > I'm sorry, I was on vacation abroad.
>
>
> Oh -- this bug cropped up at a "bad" time for you. :-) Thanks for
> responding.
>
>
> > Yes, the weird stuff. Of course, I though about the issue when
> > started to work on auto-import, but my reasoning was misgone along the
> > lines which were evident in this thread too: I confused *relocs* and
> > *import thunks*. The former support addends, the latter - no. Well, I
> > never bothered to check issue afterwards.
> >
> > How disabling such a misfeature can be? Well, my opinion is that
> > sample code far-fetched and presents, err, bad software design. The
> > only sane usage of accessing external array with constant indexes I
> > can imagine is something like
> >
> > extern double world_transform_matrix[3][3];
> > ...
> >
> > I doubt much code uses such a horror, so losing ability to auto-import
> > contant-offset array for me is not a big loss.
>
>
> Agree. (NOTE: following paragraph was written before I read the rest of
> your message. You've already implemented the following! :-)
>
> But, if code DOES use this (assuming we can't "fix" it) -- then the poor
> unsuspecting user should be warned "Attempt to access array via constant
> offset" or some such, with a pointer to some documentation somewhere
> that says "The following idiom is not supported in client code for array
> variables exported from DLLs on pei-386:
> "array_exported_from_dll[constant_index] = foo"
> or "foo = array_exported_from_dll[constant_index]"
> Please replace with the following idiom:
> <type> *local_copy = array_exported_from_dll;
> local_copy[constant_index] = foo;
> or foo = local_copy[constant_index];"
> (yeah, yeah, I know. pointers, arrays. the above is meant as a
> starting point; not to be taken literally)
>
> > Much more pitiful fact
> > is that it also applies to accessing global extern struct's - I bet
> > periodically code which does that will be hit.
>
>
> Oooh. I hadn't thought of that.
>
> > Well, nothing comes for
> > free.
> >
> > Actions to be taken: well, I don't think it's productive idea,
> > having hit some particular difficulty, give up entirely thing which
> > otherwise works well.
>
>
> True. I panicked. (At least it was private mail; I hadn't (and still
> have not) made that suggestion onlist.
>
> > Imho, the better idea is to make quick surgery
> > to localize the problem. That's exactly what attached patch does:
> >
> > ====
> > #include <stdio.h>
> > extern struct
> > {
> > int a;
> > int b;
> > } st;
> >
> > int main(void){
> > printf("%d\n", st.b);
> > printf("%d\n", st.a);
> > }
> > ====
> >
> > gcc -s -Wl,--enable-auto-import -o hello.exe hello.o -L. -lhwstr
> > Warning: resolving _st by linking to __imp__st (auto-import)
> > hello.o(.text+0x13):hello.c: aggregate 'st' is referenced in direct
> addressing m
> > ode with constant offset - auto-import failed. Workarounds: a) mark
> the symbol w
> > ith __declspec(dllimport); b) use auxiliary variable
> > make: *** [hello.exe] Hangup
> >
>
>
> AHA! Exactly what I suggested above -- but I didn't know how to make ld
> do that. Very cool. Good work, Paul. Warning when something fails and
> identifying the spot where the failure occurs is MUCH better than
> silently allowing buggy code. :-)
>
> I'm going to test this; assuming it works as advertised (and I have no
> reason to think it will not) I recommend immediate inclusion.
>
> THEN we can think about a "real" fix for this array-index problem (if
> possible; it may not be). :-)
>
> --Chuck
>
> P.S. Paul: also, please take a look here:
> ld-auto-import memory bug fixing
> http://sources.redhat.com/ml/binutils/2001-09/msg00062.html
>
> Ralf has been trying (without success) to track down a memory leak
> problem that makes it difficult (impossible?) to link KDE apps to KDE
> DLL's via auto-import. Too many dll's, too many symbols + memory leak
> == out of memory. The memory leak *may* be related to this snippet of
> code in pe-dll.c(pe_walk_relocs_of_symbol):
>
> + /* Warning: the allocated symbols are remembered in BFD and reused
> +
> later, so don't free them! */
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> !!! Where do they be used later ??? Can this not be done in another way ?
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> +
> /* free (symbols); */
>
> And here, in pe.em(pe_find_data_imports)
>
> + int nsyms, symsize, i;
> +
> + symsize = bfd_get_symtab_upper_bound (b);
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> symbols will never be freed, only overwritten
> !!!!!
> + symbols = (asymbol **) xmalloc (symsize);
>
> Original bug description is here:
> WG: ld --auto-import for cygwin and libtool
> http://sources.redhat.com/ml/binutils/2001-06/msg00742.html
>
I have looked into the relating stuff and found, that for any section of any
input bfd a symbol tab is allocated.
For this I have two questions:
1. I recognized, that for any input file read by bfd one symboltab is present,
not one for any section. Isn't it ?
If this is true (and it seems so, because I have applied a patch and have
compiled about 20 dll's with it), I have
found a way to solve this problem. (moving symboltable allocating in the first
for loop)
(This patch contains some debugging stuff, which needs other files patch. It is
for dicsussion only)
pe-dll.c
for (b = info->input_bfds; b; b = b->link_next)
{
- arelent **relocs;
- int relsize, nrelocs, i;
+ int nsyms, symsize;
+ asymbol **symbols;
+ symsize = bfd_get_symtab_upper_bound (b);
+ symbols = (asymbol **) xmalloc (symsize);
+ nsyms = bfd_canonicalize_symtab (b, symbols);
+
for (s = b->sections; s; s = s->next)
{
- asymbol **symbols;
- int nsyms, symsize;
+ int relsize, nrelocs, i;
+ arelent **relocs;
int flags = bfd_get_section_flags (b, s);
/* Skip discarded linkonce sections */
if (flags & SEC_LINK_ONCE
- && s->output_section == bfd_abs_section_ptr)
+ && s->output_section == bfd_abs_section_ptr) {
+ if (pe_dll_extra_pe_debug & 4)
+ printf("\t section->name=%s -->
ignored\n",s->name);
continue;
-
+ }
current_sec=s;
- symsize = bfd_get_symtab_upper_bound (b);
- symbols = (asymbol **) xmalloc (symsize);
- nsyms = bfd_canonicalize_symtab (b, symbols);
-
relsize = bfd_get_reloc_upper_bound (b, s);
relocs = (arelent **) xmalloc ((size_t) relsize);
nrelocs = bfd_canonicalize_reloc (b, s, relocs, symbols);
2. Relating to this bug, I have found that allocating symbol tables is done
several times in ld. Using a symbol
cache would be save much memory.
pe_find_data_imports ()
ei386pe.c: symsize = bfd_get_symtab_upper_bound (b);
gld_i386pe_after_open ()
ei386pe.c: symsize = bfd_get_symtab_upper_bound (is->the_bfd);
pe_walk_relocs_of_symbol (info, name, cb)
pe-dll.c: symsize = bfd_get_symtab_upper_bound (b);
process_def_file (abfd, info)
pe-dll.c: symsize = bfd_get_symtab_upper_bound (b);
generate_reloc (abfd, info)
pe-dll.c: symsize = bfd_get_symtab_upper_bound (b);
ldcref.c: symsize = bfd_get_symtab_upper_bound (abfd);
ldmain.c: symsize = bfd_get_symtab_upper_bound (abfd);
ldmisc.c: symsize = bfd_get_symtab_upper_bound (abfd);
What about creating a symbol cache list in ld using pointer to loaded symbols
tables or by extending bfd structure with a pointer to a "symbol cache entry",
which can be filled by client code and later reused by other code.
Extending bfd would minimize the overhead to handle this cache, so I think this
would be the best way.
I have to say one limitation to this. Such cached symbol tables should not be
manipulated. They should be used only read only. Is this real ?
I you wish, I can look deeper into this.
Regards Ralf
More information about the Binutils
mailing list