Bug 9934 - arm gnueabi linker often fails with FPE error while linking shared libs
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.19
Importance: P2 normal
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-03-08 23:47 UTC by Mike Frysinger
Modified: 2009-03-14 11:39 UTC

See Also:
Host: x86_64-pc-linux-gnu
Target: arm-softfloat-linux-gnueabi
Build:
Last reconfirmed:


Attachments
arm-softfloat-linux-gnueabi-ld-libgcc-FPE.tar.bz2 (85.66 KB, application/octet-stream)
2009-03-08 23:47 UTC, Mike Frysinger
Allow object files with relocs but no symbol table. (1.57 KB, patch)
2009-03-10 17:51 UTC, Nick Clifton
Gracefully handle object files with relocs but no symbol tables. (1.40 KB, patch)
2009-03-13 11:35 UTC, Nick Clifton

Description Mike Frysinger 2009-03-08 23:47:17 UTC
while building gcc's shared libs, we end up with this error:
collect2: ld terminated with signal 8 [Floating point exception]

i'm attaching a tarball of the objects from glibc/gcc which are used to trigger
this bug.  to reproduce, hopefully you should be able to extract and just run
`./doit.sh` (might have to change the linker invoked).

Raúl Porcel traced this back to a change between 2.18.50.0.4 and 2.18.50.0.5. 
specifically, this commit seems to be the problem:

    2008-02-20  Paul Brook  <paul@codesourcery.com>
    
    	ld/
    	* emultempl/armelf.em (OPTION_FIX_V4BX_INTERWORKING): Define.
    	(PARSE_AND_LIST_LONGOPTS): Add fix-v4bx-interworking.
    	(PARSE_AND_LIST_OPTIONS): Ditto.
    	(PARSE_AND_LIST_ARGS_CASES): Handle OPTION_FIX_V4BX_INTERWORKING.
    	* emulparams/armelf.sh (OTHER_TEXT_SECTIONS): Add .v4_bx.
    	* emulparams/armelf_linux.sh (OTHER_TEXT_SECTIONS): Ditto.
    	* emulparams/armnto.sh (OTHER_TEXT_SECTIONS): Ditto.
    	* ld.texinfo: Document --fix-v4bx-interworking.
Comment 1 Mike Frysinger 2009-03-08 23:47:38 UTC
Created attachment 3800
arm-softfloat-linux-gnueabi-ld-libgcc-FPE.tar.bz2
Comment 2 Nick Clifton 2009-03-10 17:51:35 UTC
Created attachment 3806
Allow object files with relocs but no symbol table.
Comment 3 Nick Clifton 2009-03-10 17:53:59 UTC
Hi Mike,

  The crtn.o file in your tarball is the culprit - it contains relocs that do
not refer to any symbols.  Since those are the only kind of relocs in that
file, there is no symbol table either, and this is what is confusing the BFD library.

  Please try the uploaded patch which should work for your tarball but maybe not
for other tests.  (I did not try to track down all the places where the
assumption is made that relocs==symbols).

Cheers
  Nick
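
For illustration: in the ELF32 format, each relocation entry packs a symbol-table index into the upper 24 bits of its r_info word and the relocation type into the low 8 bits. Index 0 (STN_UNDEF) means the relocation names no symbol, which is how a file can carry relocs without needing a symbol table. A minimal C sketch of that decoding follows. The helper names are hypothetical and are not BFD functions, and any r_info values passed in are up to the caller rather than taken from crtn.o.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the ELF32 r_info layout:
   symbol index in the upper 24 bits, relocation type in the low 8 bits. */
static uint32_t reloc_sym_index(uint32_t r_info)
{
    return r_info >> 8;
}

static uint32_t reloc_type(uint32_t r_info)
{
    return r_info & 0xffu;
}

/* Symbol index 0 is STN_UNDEF: the relocation refers to no symbol. */
static int reloc_is_symbolless(uint32_t r_info)
{
    return reloc_sym_index(r_info) == 0;
}
```

A consumer that wants to tolerate such input would only look up the symbol table when reloc_is_symbolless() returns 0.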
Comment 4 Mike Frysinger 2009-03-10 23:13:36 UTC
that does fix the crash, but now ld exits with 1 without any error message
(patch applied to binutils-2.19.1)

running with --verbose shows no info as to why ld decided to exit(1) ...
Comment 5 Nick Clifton 2009-03-11 10:40:20 UTC
Hi Mike,

> running with --verbose shows no info as to why ld decided to exit(1) ...

  Yeah my bad.  Delete the last frag of the patch to elflink.c, (the one that
calls bfd_set_error) and you will get your error message back.  That bit was
totally bogus.

  There is obviously another bit of the linker that assumes that
relocs==symbols.  Tracking it down is not going to be easy though.  *sigh*.  By
the way, if you omit crtn.o from the link then everything works just fine...

Cheers
  Nick
Comment 6 Mike Frysinger 2009-03-11 11:52:15 UTC
while true, this is coming from building gcc, so interrupting the process to
manually run things is kind of a pain ... plus, that just gets us past the
libgcc_s.so stage.  iirc, anything linking against crtn.o will hit a similar
problem.

it isnt a huge blocker for us in Gentoo as we can simply revert to using
binutils-2.18 in the meantime ... version selection of tools is trivial

i havent looked at glibc's crtn.o closely ... i'm assuming that the object isnt
doing something stupid, so we dont have to pursue that separately ?
Comment 7 H.J. Lu 2009-03-12 04:25:29 UTC
Is that reproducible with a cross linker on Linux/ia32?
Comment 8 Mike Frysinger 2009-03-12 04:37:41 UTC
i dont think the host matters, but people who have tested so far seem to be
amd64 users only ...

simply running the linker on the crtn.o object should also trigger the crash i
think ...
arm-softfloat-linux-gnueabi-ld crtn.o
Comment 9 Raúl Porcel 2009-03-12 10:23:15 UTC
This also happens on arm native.

If i build >=binutils-2.18.50.0.5 and rebuild glibc, all compilations break
after that.
Comment 10 H.J. Lu 2009-03-12 18:30:21 UTC
(In reply to comment #8)
> i dont think the host matters, but people who have tested so far seem to be
> amd64 users only ...
> 
> simply running the linker on the crtn.o object should also trigger the crash i
> think ...
> arm-softfloat-linux-gnueabi-ld crtn.o

How is crtn.o generated? Can you provide crtn.s and the assembly command used
to generate crtn.o?
Comment 11 H.J. Lu 2009-03-12 18:36:12 UTC
Did you run "strip --strip-unneeded" on crtn.o? 
Comment 12 H.J. Lu 2009-03-12 18:46:52 UTC
I think this may be a dup of PR 9945.
Comment 13 Mike Frysinger 2009-03-12 21:02:23 UTC
yes, we use --strip-unneeded on the .o files.  even if PR 9945 were fixed, the
linker shouldnt fail with an FPE error.
Comment 14 H.J. Lu 2009-03-12 21:22:20 UTC

*** This bug has been marked as a duplicate of 9945 ***
Comment 15 Mike Frysinger 2009-03-13 06:58:21 UTC
as i said, the linker should not be crashing.  even if the strip issue is fixed,
the linker is misbehaving here.
Comment 16 Sourceware Commits 2009-03-13 11:34:59 UTC
Subject: Bug 9934

CVSROOT:	/cvs/src
Module name:	src
Changes by:	nickc@sourceware.org	2009-03-13 11:34:43

Modified files:
	bfd            : ChangeLog elf-bfd.h elflink.c elf32-arm.c 

Log message:
	PR 9934
	* elf-bfd.h (NUM_SHDR_ENTRIES): Cope with an empty section.
	* elflink.c (elf_link_read_relocs_from_section): Use
	NUM_SHDR_ENTRIES.  Gracefully handle the case where there are
	relocs but no symbol table.
	* elf32-arm.c (elf32_arm_check_relocs): Likewise.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.4496&r2=1.4497
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/bfd/elf-bfd.h.diff?cvsroot=src&r1=1.280&r2=1.281
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/bfd/elflink.c.diff?cvsroot=src&r1=1.330&r2=1.331
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/bfd/elf32-arm.c.diff?cvsroot=src&r1=1.180&r2=1.181

Comment 17 Nick Clifton 2009-03-13 11:35:41 UTC
Created attachment 3820
Gracefully handle object files with relocs but no symbol tables.
Comment 18 Nick Clifton 2009-03-13 11:38:22 UTC
Hi Mike,

  I have checked in a revised version of my previous patch (uploaded) which will
stop the linker from crashing.  It will still refuse to produce an
executable because of the non-representable section (in the crtn.o file), but
that is now the correct behaviour.  Once the patch for 9945 is applied and a new
stripped version of crtn.o is produced the entire problem should go away.

Cheers
  Nick

bfd/ChangeLog
2009-03-13  Nick Clifton  <nickc@redhat.com>

	PR 9934
	* elf-bfd.h (NUM_SHDR_ENTRIES): Cope with an empty section.
	* elflink.c (elf_link_read_relocs_from_section): Use
	NUM_SHDR_ENTRIES.  Gracefully handle the case where there are
	relocs but no symbol table.
	* elf32-arm.c (elf32_arm_check_relocs): Likewise.
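
The ChangeLog says NUM_SHDR_ENTRIES now copes with an empty section. A minimal C sketch of that kind of guard is given here, assuming the FPE came from an integer division by a zero entry size (integer division by zero raises SIGFPE on hosts such as x86 Linux). The struct and function names are hypothetical stand-ins, not the BFD definitions, and this is not the checked-in patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two section-header fields involved. */
struct section_header_sketch {
    uint64_t size;    /* total size of the section in bytes */
    uint64_t entsize; /* size of one entry; 0 for an empty/absent table */
};

/* Return the number of entries, or 0 when there is nothing to divide by,
   instead of evaluating size / entsize unconditionally. */
static uint64_t num_entries_guarded(const struct section_header_sketch *shdr)
{
    if (shdr == NULL || shdr->entsize == 0)
        return 0;
    return shdr->size / shdr->entsize;
}
```

With a guard like this, a caller sees a count of zero for a missing symbol table and can report an error, rather than the process being killed by signal 8.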
Comment 19 H.J. Lu 2009-03-13 13:38:17 UTC
(In reply to comment #17)
> Created an attachment (id=3820)
> Gracefully handle object files with relocs but no symbol tables.
> 

Personally, I think such input is invalid. We should just stop when
we see such input.  There is no need to accept it at all.
Comment 20 Mike Frysinger 2009-03-14 11:39:10 UTC
thanks guys, the patches seem to work for us