Serious bug in auto-import

Ralf Habacker Ralf.Habacker@freenet.de
Sun Sep 9 11:30:00 GMT 2001


> Paul Sokolovsky wrote:
>
> > CW> Please take a look at the thread "[aida_s@mx12.freecom.ne.jp: A serious
> > CW> bug of "ld --enable-auto-import"]" starting here:
> >
> > CW> http://sources.redhat.com/ml/binutils/2001-08/msg00595.html
> >
> > CW> This is a confirmed bug.  I've tried to analyze it but I can't really
> > CW> come up with a fix; I wonder if a fix is possible at all.  If this
> > CW> problem can't be fixed, I'm going to have to recommend removing the
> > CW> auto-import stuff. (Yes, it's that serious.)
> >
> > CW> PLEASE take a look and at least let me know you've seen it; it was
> > CW> originally reported early Friday morning to the cygwin-apps list, and
> > CW> moved to the binutils list on Saturday.
> >
> >     I'm sorry, I was on vacation abroad.
>
>
> Oh -- this bug cropped up at a "bad" time for you. :-)   Thanks for
> responding.
>
>
> >     Yes, the weird stuff. Of course, I though about the issue when
> > started to work on auto-import, but my reasoning was misgone along the
> > lines which were evident in this thread too: I confused *relocs* and
> > *import thunks*. The former support addends, the latter - no. Well, I
> > never bothered to check issue afterwards.
> >
> >      How disabling such a misfeature can be? Well, my opinion is that
> > sample code far-fetched and presents, err, bad software design. The
> > only sane usage of accessing external array with constant indexes I
> > can imagine is something like
> >
> >           extern double world_transform_matrix[3][3];
> >           ...
> >
> >      I doubt much code uses such a horror, so losing ability to auto-import
> > contant-offset array for me is not a big loss.
>
>
> Agree.  (NOTE: following paragraph was written before I read the rest of
> your message.  You've already implemented the following! :-)
>
> But, if code DOES use this (assuming we can't "fix" it) -- then the poor
> unsuspecting user should be warned "Attempt to access array via constant
> offset" or some such, with a pointer to some documentation somewhere
> that says "The following idiom is not supported in client code for array
> variables exported from DLLs on pei-386:
>     "array_exported_from_dll[constant_index] = foo"
> or "foo = array_exported_from_dll[constant_index]"
> Please replace with the following idiom:
>     <type> *local_copy = array_exported_from_dll;
>     local_copy[constant_index] = foo;
> or foo = local_copy[constant_index];"
> (yeah, yeah, I know.  pointers, arrays.  the above is meant as a
> starting point; not to be taken literally)
>
> > Much more pitiful fact
> > is that it also applies to accessing global extern struct's - I bet
> > periodically code which does that will be hit.
>
>
> Oooh.  I hadn't thought of that.
>
> > Well, nothing comes for
> > free.
> >
> >      Actions to be taken: well, I don't think it's productive idea,
> > having hit some particular difficulty, give up entirely thing which
> > otherwise works well.
>
>
> True.  I panicked.  (At least it was private mail; I hadn't (and still
> have not)  made that suggestion onlist.
>
> > Imho, the better idea is to make quick surgery
> > to localize the problem. That's exactly what attached patch does:
> >
> > ====
> > #include <stdio.h>
> > extern struct
> > {
> >   int a;
> >   int b;
> > } st;
> >
> > int main(void){
> >         printf("%d\n", st.b);
> >         printf("%d\n", st.a);
> > }
> > ====
> >
> > gcc -s -Wl,--enable-auto-import -o hello.exe hello.o -L. -lhwstr
> > Warning: resolving _st by linking to __imp__st (auto-import)
> > hello.o(.text+0x13):hello.c: aggregate 'st' is referenced in direct
> addressing m
> > ode with constant offset - auto-import failed. Workarounds: a) mark
> the symbol w
> > ith __declspec(dllimport); b) use auxiliary variable
> > make: *** [hello.exe] Hangup
> >
>
>
> AHA!  Exactly what I suggested above -- but I didn't know how to make ld
> do that.  Very cool.  Good work, Paul.  Warning when something fails and
> identifying the spot where the failure occurs is MUCH better than
> silently allowing buggy code. :-)
>
> I'm going to test this; assuming it works as advertised (and I have no
> reason to think it will not) I recommend immediate inclusion.
>
> THEN we can think about a "real" fix for this array-index problem (if
> possible; it may not be). :-)
>
> --Chuck
>
> P.S. Paul: also, please take a look here:
> ld-auto-import memory bug fixing
> http://sources.redhat.com/ml/binutils/2001-09/msg00062.html
>
> Ralf has been trying (without success) to track down a memory leak
> problem that makes it difficult (impossible?) to link KDE apps to KDE
> DLL's via auto-import.  Too many dll's, too many symbols + memory leak
> == out of memory. The memory leak *may* be related to this snippet of
> code in pe-dll.c(pe_walk_relocs_of_symbol):
>
> + /* Warning: the allocated symbols are remembered in BFD and reused
> +
>       later, so don't free them! */
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> !!! Where do they be used later ??? Can this not be done in another way ?
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> +
>    /* free (symbols); */
>
> And here, in pe.em(pe_find_data_imports)
>
> +            int nsyms, symsize, i;
> +
> +            symsize = bfd_get_symtab_upper_bound (b);
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> symbols will never be freed, only overwritten
> !!!!!
> +            symbols = (asymbol **) xmalloc (symsize);
>
> Original bug description is here:
> WG: ld --auto-import for cygwin and libtool
> http://sources.redhat.com/ml/binutils/2001-06/msg00742.html
>
I have looked into the relating stuff and found, that for any section of any
input bfd a symbol tab is allocated.
For this I have two questions:

1. I recognized, that for any input file read by bfd one symboltab is present,
not one for any section. Isn't it 	?
If this is true (and it seems so, because I have applied a patch and have
compiled about 20 dll's with it), I have
found a way to solve this problem.  (moving symboltable allocating in the first
for loop)

(This patch contains some debugging stuff, which needs other files patch. It is
for dicsussion only)

pe-dll.c
   for (b = info->input_bfds; b; b = b->link_next)
     {
-      arelent **relocs;
-      int relsize, nrelocs, i;
+         int nsyms, symsize;
+         asymbol **symbols;

+         symsize = bfd_get_symtab_upper_bound (b);
+         symbols = (asymbol **) xmalloc (symsize);
+         nsyms = bfd_canonicalize_symtab (b, symbols);
+
       for (s = b->sections; s; s = s->next)
        {
-         asymbol **symbols;
-         int nsyms, symsize;
+      int relsize, nrelocs, i;
+           arelent **relocs;
          int flags = bfd_get_section_flags (b, s);

          /* Skip discarded linkonce sections */
          if (flags & SEC_LINK_ONCE
-             && s->output_section == bfd_abs_section_ptr)
+                     && s->output_section == bfd_abs_section_ptr) {
+                               if (pe_dll_extra_pe_debug & 4)
+                                       printf("\t section->name=%s -->
ignored\n",s->name);
            continue;
-
+                       }
          current_sec=s;

-         symsize = bfd_get_symtab_upper_bound (b);
-         symbols = (asymbol **) xmalloc (symsize);
-         nsyms = bfd_canonicalize_symtab (b, symbols);
-
          relsize = bfd_get_reloc_upper_bound (b, s);
          relocs = (arelent **) xmalloc ((size_t) relsize);
          nrelocs = bfd_canonicalize_reloc (b, s, relocs, symbols);

2. Relating to this bug, I have found that allocating symbol tables is done
several times in ld. Using a symbol
   cache would be save much memory.

			pe_find_data_imports ()
	ei386pe.c:         symsize = bfd_get_symtab_upper_bound (b);

			gld_i386pe_after_open ()
	ei386pe.c:         symsize = bfd_get_symtab_upper_bound (is->the_bfd);

			pe_walk_relocs_of_symbol (info, name, cb)
	pe-dll.c:         symsize = bfd_get_symtab_upper_bound (b);

			process_def_file (abfd, info)
	pe-dll.c:         symsize = bfd_get_symtab_upper_bound (b);

			generate_reloc (abfd, info)
	pe-dll.c:         symsize = bfd_get_symtab_upper_bound (b);

	ldcref.c:      symsize = bfd_get_symtab_upper_bound (abfd);
	ldmain.c:         symsize = bfd_get_symtab_upper_bound (abfd);
	ldmisc.c:                   symsize = bfd_get_symtab_upper_bound (abfd);

What about creating a symbol cache list in ld using pointer to loaded symbols
tables or by extending bfd structure with a pointer to a "symbol cache entry",
which can be filled by client code and later reused by other code.
Extending bfd would minimize the overhead to handle this cache, so I think this
would be the best way.
I have to say one limitation to this. Such cached symbol tables should not be
manipulated. They should be used only read only. Is this real ?

I you wish, I can look deeper into this.

Regards Ralf



More information about the Binutils mailing list