This problem occurs on targets that use the a.out format by default while also supporting the ELF format. Such targets include i386-pc-netbsdaout and m68k-netbsdaout.

Basically, under certain conditions, a "stab warning" in an object file causes ld to discard all the symbols defined in that object. When there is no stab warning, the testcases described here work fine.

$ cat a.s
.extern _f
jmp _f

$ cat b.s
.stabs "a linker warning",30,0,0,0
.stabs "_f",1,0,0,0
.globl _f
_f:
ret

$ as a.s -o a.o # file format a.out-i386-netbsd
$ as b.s -o b.o # file format a.out-i386-netbsd
$ ld a.o b.o -o ab # file format a.out-i386-netbsd
a.o:a.o:(.text+0x1): warning: a linker warning

We can see here that everything uses the a.out file format, and everything works as expected.

The problem occurs if a.o is converted to ELF.

$ objcopy a.o a-elf.o -O elf32-i386
BFD: a-elf.o: warning: allocated section `.text' not in segment
$ ld a-elf.o b.o -o ab
a-elf.o:(.text+0x1): warning: a linker warning
a-elf.o:(.text+0x1): undefined reference to `_f'

The link fails because the symbol _f is not found, even though it is still present in the b.o file.

$ binutils-2.18-netbsd.obj/binutils/objdump -t b.o

b.o: file format a.out-i386-netbsd

SYMBOL TABLE:
00000000 W d *ABS* 0000 00 1e a linker warning
00000000 g *ABS* 0000 00 03 _f
00000000 g .text 0000 00 05 _f

Further strangeness comes if we convert a.o into ELF, too.

$ objcopy a.o a-elf.o -O elf32-i386
BFD: a-elf.o: warning: allocated section `.text' not in segment
$ ld a-elf.o b-elf.o -o ab
b-elf.o: In function `_f':
(.text+0x0): multiple definition of `_f'
Hi Vincent,

I just tried your testcase with an i386-pc-netbsdaout toolchain built from the latest mainline binutils sources.

First of all, there appears to be a typo in your description. You say: "Further strangeness comes if we convert a.o into ELF, too", but I think that you mean "...convert b.o..." as a.o has already been converted.

Secondly - I do not get the problem you report when both files have been converted to ELF:

% gas/as-new a.s -o a.o
% gas/as-new b.s -o b.o
% binutils/objcopy a.o a.elf.o -O elf32-i386
BFD: a.elf.o: warning: allocated section `.text' not in segment
% binutils/objcopy b.o b.elf.o -O elf32-i386
BFD: b.elf.o: warning: allocated section `.text' not in segment
% ld/ld-new a.elf.o b.elf.o
%

(Note: the linker did not display the stab warning text as a warning message in this case. This is probably a bug, although it may count as a "feature" if it is because it is very hard to translate aout stabs warning messages into ELF format).

I did get the error message about the undefined reference to _f when attempting to link a.elf.o and b.o, so I am going to look into that. But in the meantime I would appreciate it if you could try re-running your testcase with an up to date toolchain to see if you encounter the same behaviour as I did.

Cheers
Nick
Hi, Nick.

I agree with you on both points.

1) Shame on me, there is indeed a typo near the end. It should read:

Further strangeness comes if we convert b.o into ELF, too.

$ objcopy b.o b-elf.o -O elf32-i386
BFD: b-elf.o: warning: allocated section `.text' not in segment

2) I've just compiled the latest binutils snapshot, and linking the 2 converted ELF files succeeds (unlike with 2.18).

I noticed some changes between the versions. Let's say that b-218.o is generated by gas 2.18, and b-cvs.o is generated by the latest gas from CVS.

$ objdump -t b-218.o

b-218.o: file format a.out-i386-netbsd

SYMBOL TABLE:
00000000 W d *ABS* 0000 00 1e a linker warning
00000000 g *ABS* 0000 00 03 _f
00000000 g .text 0000 00 05 _f

$ objdump -t b-cvs.o

b-cvs.o: file format a.out-i386-netbsd

SYMBOL TABLE:
00000000 W d *ABS* 0000 00 1e a linker warning
00000000 *UND* 0000 00 01 _f
00000000 g .text 0000 00 05 _f

Note that the symbol referred to by the warning was *ABS* and is now *UND*. The behavior has changed... anyway, it doesn't matter.

That difference is preserved after the conversion into ELF:

$ objdump -t b-218-elf.o

b-218-elf.o: file format elf32-i386

SYMBOL TABLE:
00000000 l *ABS* 00000000 a linker warning
00000000 l d .text 00000000 .text
00000008 l d .data 00000000 .data
00000008 l d .bss 00000000 .bss
00000000 g *ABS* 00000000 _f
00000000 g .text 00000000 _f

$ objdump -t b-cvs-elf.o

b-cvs-elf.o: file format elf32-i386

SYMBOL TABLE:
00000000 l *ABS* 00000000 a linker warning
00000000 l d .text 00000000 .text
00000008 l d .data 00000000 .data
00000008 l d .bss 00000000 .bss
00000000 *UND* 00000000 _f
00000000 g .text 00000000 _f

This may explain why the multiple definition problem has gone.

However, I'm quite surprised when I look at the b-cvs-elf.o symbol table : the _f symbol is both undefined and defined in the .text segment ! The first _f is probably the symbol related to the warning; however, it no longer comes right after the warning, since the section symbols now sit between them.
So much for the strange things. However, the undefined reference problem is still here. We both get the same results with the latest binutils, which is some kind of good news ;-)
Subject: Re: stab warnings cause linker errors

Hi Vincent,

(I have not forgotten about this problem, but I am really swamped just at the moment).

> However, I'm quite surprised when I look at the b-cvs-elf.o symbol table : the
> _f symbol is both undefined and defined in the .text segment !

This is because of the stab warning - it creates an undefined reference to the symbol that it is warning about.

I did find that if you rearrange your test case so that the declaration of _f in b.s occurs before the .stabs directives then the conversion will work as well.

The problem seems to be that when the undefined reference to _f occurs in b.s before the definition, then when b.o is converted into ELF format *two* instances of _f are created, one undefined and one defined, and when a.o is linked in it is this undefined instance that is used, not the defined one. I have so far been unable to locate exactly where these two instances are created in the BFD library. But I am working on it...

Cheers
Nick
Subject: Re: stab warnings cause linker errors

Nick Clifton <nickc@redhat.com> writes:

> The problem seems to be that when the undefined reference to _f occurs
> in b.s before the definition, then when b.o is converted into ELF
> format *two* instances of _f are created, one undefined and one
> defined, and when a.o is linked in it is this undefined instance that
> is used, not the defined one. I have so far been unable to locate
> exactly where these two instances are created in the BFD library. But
> I am working on it...

If I understand what you are getting at, that's how warning symbols work. Look at MWARN in linker.c. It creates a new symbol, and sets u.i.link to point to the existing symbol. Perhaps some list somewhere is still pointing at the old symbol.
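To make the description of warning symbols above more concrete, here is a minimal sketch in C of the "new entry that links to the existing symbol" arrangement. It is a toy model only: the toy_* types and functions are hypothetical names invented for illustration and are not the real BFD structures or the actual MWARN code in linker.c; the single link field merely stands in for the u.i.link pointer mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a linker hash table entry.
   These names are illustrative and are NOT the BFD API.  */
enum toy_hash_type { TOY_UNDEFINED, TOY_DEFINED, TOY_WARNING };

struct toy_hash_entry {
  enum toy_hash_type type;
  const char *name;
  const char *warning;          /* message text, used by TOY_WARNING only */
  struct toy_hash_entry *link;  /* wrapped symbol, used by TOY_WARNING only */
};

/* Turn WARN into a warning entry that wraps the existing entry REAL.
   The existing entry is left untouched; the new entry links to it.  */
void toy_make_warning(struct toy_hash_entry *warn,
                      struct toy_hash_entry *real,
                      const char *message)
{
  assert(warn != NULL && real != NULL);
  warn->type = TOY_WARNING;
  warn->name = real->name;
  warn->warning = message;
  warn->link = real;
}

/* Follow warning links until the underlying symbol entry is reached.  */
struct toy_hash_entry *toy_resolve(struct toy_hash_entry *h)
{
  while (h != NULL && h->type == TOY_WARNING)
    h = h->link;
  return h;
}
```

In this model, any code that holds the warning entry has to call toy_resolve before inspecting or updating the symbol, while code that still holds a pointer to the wrapped entry bypasses the warning entirely. That is one way two views of the same name could coexist, which is the kind of situation speculated about above.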
Subject: Re: stab warnings cause linker errors

Hi Ian,

> If I understand what you are getting at, that's how warning symbols
> work. Look at MWARN in linker.c. It creates a new symbol, and sets
> u.i.link to point to the existing symbol. Perhaps some list somewhere
> is still pointing at the old symbol.

Right. It seems that two entries for the same symbol name are created in the hash table and I have yet to find out why. ie an undefined entry for _f is created first and then a defined entry for _f is created afterwards. What I have yet to discover is why the second entry does not just replace the first. I think it must have something to do with the symbol being associated with a warning, but I have not yet found out exactly what is happening.

Cheers
Nick
This comment in front of generic_link_read_symbols explains why we are running into trouble:

/* Grab the symbols for an object file when doing a generic link.  We
   store the symbols in the outsymbols field.  We need to keep them
   around for the entire link to ensure that we only read them once.
   If we read them multiple times, we might wind up with relocs and
   the hash table pointing to different instances of the symbol
   structure.  */

The ELF backend does not cache the symbols it reads from input files in outsymbols. So ldmain.c:warning_callback reads symbols and relocs from a-elf.o via the ELF backend code, and caches the relocs. Later, linker.c:default_indirect_link_order calls generic_link_read_symbols and gets a different set of symbol pointers, which are used to resolve symbol values. Then the generic linker tries to relocate the a-elf.o .text section, but the ELF backend returns the cached relocs, which are using the wrong set of symbols.
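The failure mode described above can be illustrated with a small, self-contained C sketch. It is a toy model under stated assumptions, not BFD code: every toy_* name is hypothetical, the "reloc" is just a pointer captured from the first symbol read, and the "linker" resolves the symbol through the result of a second read. The cache field plays the role that outsymbols plays in the comment quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a symbol and an input file; NOT the BFD API.  */
struct toy_symbol {
  const char *name;
  long value;
};

struct toy_object {
  const char *names[2];
  size_t count;
  struct toy_symbol **cache;   /* plays the role of outsymbols */
};

/* Release an array returned by toy_read_symbols.  */
static void toy_free_symbols(struct toy_symbol **syms, size_t count)
{
  if (syms == NULL)
    return;
  for (size_t i = 0; i < count; i++)
    free(syms[i]);
  free(syms);
}

/* Uncached read: every call builds brand-new symbol structures.  */
static struct toy_symbol **toy_read_symbols(const struct toy_object *obj)
{
  struct toy_symbol **syms = calloc(obj->count, sizeof *syms);
  if (syms == NULL)
    return NULL;
  for (size_t i = 0; i < obj->count; i++) {
    syms[i] = malloc(sizeof *syms[i]);
    if (syms[i] == NULL) {
      toy_free_symbols(syms, obj->count);
      return NULL;
    }
    syms[i]->name = obj->names[i];
    syms[i]->value = 0;
  }
  return syms;
}

/* Cached read: the first result is stored and handed back on later calls.  */
static struct toy_symbol **toy_read_symbols_cached(struct toy_object *obj)
{
  if (obj->cache == NULL)
    obj->cache = toy_read_symbols(obj);
  return obj->cache;
}

/* Return the value of _f as seen through a "reloc" captured on the first
   read, after the "linker" resolved _f through a second read.
   Returns -1 if memory allocation fails.  */
long toy_value_seen_by_reloc(int use_cache)
{
  struct toy_object obj = { { "_f", "_g" }, 2, NULL };
  struct toy_symbol **first, **second;
  long seen = -1;

  first = use_cache ? toy_read_symbols_cached(&obj) : toy_read_symbols(&obj);
  second = use_cache ? toy_read_symbols_cached(&obj) : toy_read_symbols(&obj);

  if (first != NULL && second != NULL) {
    /* Cached reads share one array; uncached reads never do.  */
    assert(use_cache ? first == second : first != second);
    struct toy_symbol *reloc_target = first[0];  /* the reloc refers to _f */
    second[0]->value = 0x1000;                   /* the linker resolves _f */
    seen = reloc_target->value;
  }

  if (second != first)
    toy_free_symbols(second, obj.count);
  toy_free_symbols(first, obj.count);
  return seen;
}
```

In this model, toy_value_seen_by_reloc(0) returns 0 because the reloc still points at a symbol instance that was never resolved, whereas toy_value_seen_by_reloc(1) returns 0x1000 because both reads share the same symbol instances.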
This simple patch seems to cure the problem.

Index: bfd/elf.c
===================================================================
RCS file: /cvs/src/src/bfd/elf.c,v
retrieving revision 1.462
diff -u -p -r1.462 elf.c
--- bfd/elf.c 8 Aug 2008 08:00:14 -0000 1.462
+++ bfd/elf.c 15 Aug 2008 10:39:10 -0000
@@ -6545,7 +6545,11 @@ _bfd_elf_canonicalize_symtab (bfd *abfd,
   long symcount = bed->s->slurp_symbol_table (abfd, allocation, FALSE);
 
   if (symcount >= 0)
-    bfd_get_symcount (abfd) = symcount;
+    {
+      bfd_get_symcount (abfd) = symcount;
+      /* Cache symbols for the generic linker.  */
+      bfd_get_outsymbols (abfd) = allocation;
+    }
   return symcount;
 }
I confirm that the previous patch, applied to binutils 2.18, fixes the testcase as well as my original real-world problem. Furthermore, the linker warnings present in a.out object files are displayed as expected when they are linked against ELF object files.

Good job, Alan, you just have to commit!
http://sourceware.org/ml/binutils/2008-08/msg00164.html
I confirm that everything is fixed in the current CVS version. Many thanks for your great work, Alan.