Bug 16698 - BFD (GNU Binutils) 2.24 assertion fail elf32-arm.c:12387
Summary: BFD (GNU Binutils) 2.24 assertion fail elf32-arm.c:12387
Status: RESOLVED OBSOLETE
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.24
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2014-03-12 14:53 UTC by maillist-gdb
Modified: 2014-10-25 16:43 UTC

Attachments
testcase (2.55 KB, application/x-gzip), 2014-06-13 02:47 UTC, maillist-gdb
testcase, links successfully on other archs (3.53 KB, application/x-gzip), 2014-06-13 18:58 UTC, maillist-gdb
Built executable (42.20 KB, application/x-executable), 2014-06-16 12:29 UTC, Nick Clifton
generated object files (3.36 KB, application/x-gzip), 2014-06-28 10:50 UTC, maillist-gdb
testcase including all object files and stripped down libc.a (65.57 KB, application/x-gzip), 2014-07-01 12:45 UTC, maillist-gdb
standalone testcase including reduced libc sources (9.30 KB, application/x-gzip), 2014-07-01 18:57 UTC, maillist-gdb
manually reduced testcase to the bare minimum (2.93 KB, application/x-gzip), 2014-07-01 21:11 UTC, maillist-gdb
updated testcase (3.12 KB, application/x-gzip), 2014-10-25 11:39 UTC, maillist-gdb

Description maillist-gdb 2014-03-12 14:53:01 UTC
when building git, util-linux, or lzo with the following CFLAGS:
-fdata-sections -ffunction-sections -Os -g0 
-fno-unwind-tables -fno-asynchronous-unwind-tables -Wa,--noexecstack -ftree-dce
and the following LDFLAGS:
-s -Wl,--gc-sections -Wl,-z,relro,-z,now

the assertion in the title will be triggered.

example:
/root:/src/build/git/git-1.8.4$ gcc -fdata-sections -ffunction-sections -Os -g0 \
-fno-unwind-tables -fno-asynchronous-unwind-tables -Wa,--noexecstack -ftree-dce \
-I. -DNO_GETTEXT -DHAVE_PATHS_H -DHAVE_DEV_TTY -DSHA1_HEADER='<openssl/sha.h>' \
-DNO_STRLCPY -DUSE_WILDMATCH -DNO_MKSTEMPS -DSHELL_PATH='"/bin/sh"' \
-o git-credential-store -s -Wl,--gc-sections -Wl,-z,relro,-z,now \
credential-store.o libgit.a xdiff/lib.a -lz -lcrypto -lpthread
/bin/ld: BFD (GNU Binutils) 2.24 assertion fail elf32-arm.c:12387
/bin/ld: BFD (GNU Binutils) 2.24 assertion fail elf32-arm.c:12387
collect2: error: ld returned 1 exit status

some tests revealed that it's the combination of -s and -Wl,--gc-sections that causes the hiccup, when dynamic linking is involved. it may be necessary for these flags to have been applied to both the DSOs and the program involved.
i've failed to produce a proper testcase.
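
A rough sketch of how the flag combination could be narrowed down: the helper below reruns a given link command with each combination of -s and -Wl,--gc-sections and reports which runs fail. The name try_link_flag_combinations is hypothetical and not part of any toolchain. It assumes a POSIX sh or bash, that the link command is passed as arguments, and that a failed link (such as the assertion above) produces a non-zero exit status.

```shell
# try_link_flag_combinations CMD [ARGS...]
# Hypothetical helper: runs CMD four times, appending every combination of
# -s and -Wl,--gc-sections, and prints "ok" or "FAIL" for each run.
try_link_flag_combinations() {
    for strip_flag in "" "-s"; do
        for gc_flag in "" "-Wl,--gc-sections"; do
            # The variables are left unquoted on purpose so that an empty
            # value adds no argument at all to the command line.
            if "$@" $strip_flag $gc_flag >/dev/null 2>&1; then
                result=ok
            else
                result=FAIL
            fi
            printf '%s\t%s %s\n' "$result" \
                "${strip_flag:-(no -s)}" "${gc_flag:-(no --gc-sections)}"
        done
    done
}
```

Called as `try_link_flag_combinations gcc -o git-credential-store credential-store.o libgit.a xdiff/lib.a -lz -lcrypto -lpthread`, it would be expected to print FAIL only on the row with both flags, if the observation above holds.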

the issue did not exist in binutils 2.22
Comment 1 maillist-gdb 2014-03-12 14:54:32 UTC
the bug #14189 ( https://sourceware.org/bugzilla/show_bug.cgi?id=14189 ) may be related.
Comment 2 maillist-gdb 2014-06-12 17:06:10 UTC
ping.
anyone up for bisecting this?
Comment 3 maillist-gdb 2014-06-12 23:02:27 UTC
the bug is present in 2.23.1, 2.23.2, 2.24, and 2.24.51 snapshot from today, but not in 2.22
Comment 4 maillist-gdb 2014-06-13 02:47:52 UTC
Created attachment 7634
testcase

it was probably broken by the same commit that broke https://sourceware.org/bugzilla/show_bug.cgi?id=14189#c3

here is a reduced testcase (created with delta, could probably be further reduced with c-reduce).
Comment 5 Hans-Peter Nilsson 2014-06-13 06:54:33 UTC
Please don't add random maintainers to CC.
Comment 6 Nick Clifton 2014-06-13 15:43:32 UTC
The testcase is missing a definition of the function pcap_offline_read().

If a dummy one is supplied then the test compiles and links without any problems - except for a few warnings from gcc about assignments making pointers from integers - using the latest gcc and binutils sources.
Comment 7 maillist-gdb 2014-06-13 18:58:49 UTC
Created attachment 7638
testcase, links successfully on other archs

(In reply to Nick Clifton from comment #6)
> The testcase is missing a definition of the function pcap_offline_read().
> 
> If a dummy one is supplied then the test compiles and links without any
> problems - except for a few warnings from gcc about assignments making
> pointers from integers - using the latest gcc and binutils sources.

sorry, here's a testcase that would link successfully if the assertion was not raised (and does so with a different binutils version or arch).
Comment 8 Nick Clifton 2014-06-16 12:29:23 UTC
Created attachment 7639
Built executable

I must be doing something wrong, because I am able to build the executable.  (See the uploaded file).

Could you upload your libtest1.a and test.o files? Maybe I can use these to reproduce the problem.

Cheers
  Nick
Comment 9 maillist-gdb 2014-06-28 10:50:10 UTC
Created attachment 7663
generated object files

sorry for the delay; back from holidays.
here are the generated obj files. since you do not experience the same problem, i suspect it may be related to the 2 external symbols that get pulled in:

0000026c  00000e1c R_ARM_CALL        00000000   abort
000004d8  00000c1c R_ARM_CALL        00000000   __aeabi_uidiv

one is of libc and the other of libgcc.

$ arm-linux-musleabi-readelf -a `arm-linux-musleabi-gcc -print-libgcc-file-name` | grep __aeabi_uidiv | head
    17: 00000000     0 FUNC    GLOBAL HIDDEN     1 __aeabi_uidiv

$ arm-linux-musleabi-readelf -a $HOME/musl-cross-4.8.3/arm-linux-musleabi/arm-linux-musleabi/lib/libc.a | grep abort
File: $HOME/musl-cross-4.8.3/arm-linux-musleabi/arm-linux-musleabi/lib/libc.a(abort.o)
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS abort.c
    16: 00000000    24 FUNC    GLOBAL DEFAULT    1 abort
00000034  0000161c R_ARM_CALL        00000000   abort
    22: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND abort
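
The grep output above mixes symbol-table rows with relocation rows. A minimal sketch for keeping only the symbol rows of one name, so that binding and visibility (for example the HIDDEN visibility of __aeabi_uidiv) are easier to compare between toolchains. The function name symbol_rows is hypothetical; the sketch assumes the eight-column symbol layout shown in the readelf output above (Num, Value, Size, Type, Bind, Vis, Ndx, Name).

```shell
# symbol_rows NAME
# Reads `readelf -a` or `readelf --syms` output on stdin and prints
# "TYPE BIND VISIBILITY NDX" for every symbol-table row naming NAME.
# Symbol rows have eight whitespace-separated fields; relocation rows that
# mention NAME have fewer fields and are skipped.
symbol_rows() {
    awk -v name="$1" 'NF == 8 && $8 == name { print $4, $5, $6, $7 }'
}
```

A possible use would be `arm-linux-musleabi-readelf -a libc.a | symbol_rows abort`, with the toolchain prefix and archive path adjusted to the setup at hand.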
Comment 10 Nick Clifton 2014-06-30 13:59:13 UTC
Sorry - I still cannot reproduce this problem, even with the object file and library. :-(  The external symbols that I am pulling in look the same as the ones you use:

 % readelf --syms ../../arm-eabi/libgcc/libgcc.a | grep uidiv
    15: 00000000     0 FUNC    GLOBAL HIDDEN     1 __aeabi_uidiv

  % readelf --syms ../../arm-eabi/newlib/libc.a | grep abort
    15: 00000000    20 FUNC    GLOBAL DEFAULT    1 abort

So I guess it must be a host-specific problem.  Maybe you could run the link under GDB and find out some more about why the assert is being triggered ?

Cheers
  Nick
Comment 11 maillist-gdb 2014-06-30 20:32:33 UTC
according to the contents of the "abfd" variable when the assert is raised,
it's caused by the stdin.o and fflush.o object files in libc.a, which both do some weak symbol magic to pull in specific functions or data only when they're actually used.
i don't fully understand yet what's happening there...

the code in question is
http://git.etalabs.net/cgit/musl/tree/src/stdio/fflush.c#n23
http://git.etalabs.net/cgit/musl/tree/src/stdio/stdin.c#n15 (variable is defined here http://git.etalabs.net/cgit/musl/tree/src/stdio/__stdio_exit.c#n4 )

it seems they're getting pulled in via crt1.o -> __libc_start_main -> exit

if i can find a way to get ld to list all the object files it pulls in from libc.a, i could extract those and attach them here.
Comment 12 Rich Felker 2014-07-01 02:14:00 UTC
For musl libc.a, neither stdin.o nor fflush.o should be pulled in unless they're actually used. For stdin.o, that means referencing stdin itself or a function (like scanf or getchar) that explicitly uses stdin. For fflush.o, the users are assert, getpass, fclose, freopen, and the stdio_ext.h functions. So this seems wrong:

> it seems they're getting pulled in via crt1.o -> __libc_start_main -> exit

As for:

> if i can find a way to get ld to list all the object files it pulls in from
> libc.a, i could extract those and attach them here.

Won't -Wl,-M do this? Or you could just look at a non-stripped output binary with debug symbols, which should show the object file filenames that were linked.
Comment 13 maillist-gdb 2014-07-01 12:45:01 UTC
Created attachment 7670
testcase including all object files and stripped down libc.a

savefile.c in the testcase uses both stdin and fflush - i wonder how i could miss that.

using -Wl,-M i hunted down all referenced object files and put them into a mini libc.a, and added musl's crt files.

Nick, please see attached tarball, it contains everything needed to reproduce the issue - the only external thing getting pulled in is libgcc.

the linker command used by gcc was extracted via strace, simplified and put into link.sh.

the file test.elf.wlm contains the output of -Wl,-M.
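
A sketch of how the list of pulled-in objects could be extracted from a map file such as test.elf.wlm. It assumes the usual GNU ld map layout, in which the "Archive member included to satisfy reference by file (symbol)" section names each member as archive(member.o) at the start of a line, while the referencing file is either indented on the next line or follows on the same line. The function name list_archive_members is hypothetical, and paths containing spaces are not handled.

```shell
# list_archive_members MAPFILE
# Prints each archive member (e.g. "./libc.a(stdin.o)") that the map file
# lists at the start of an unindented line, sorted and without duplicates.
list_archive_members() {
    # Indented lines (referencing files, memory-map rows) are ignored;
    # only a first field of the form "<archive>.a(<member>.o)" is kept.
    awk '/^[^ \t]/ && $1 ~ /\.a\(.+\.o\)$/ { print $1 }' "$1" | sort -u
}
```

For example, `list_archive_members test.elf.wlm` would print one line per object that was taken from an archive.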
Comment 14 maillist-gdb 2014-07-01 18:57:16 UTC
Created attachment 7671
standalone testcase including reduced libc sources

while trying to reduce the libc sources that got linked in, i found that the assertion is only triggered if libc is compiled with "-g".

here is another testcase including delta-reduced libc sources (+ makefile changes).

this one still triggers in stdin.o, while the "real" object files in my previous comment raise a total of 3 assertions in stdin.o and fflush.o.
Comment 15 maillist-gdb 2014-07-01 21:11:09 UTC
Created attachment 7672
manually reduced testcase to the bare minimum

i manually reduced libc/stdin.c to the following

typedef struct _IO_FILE FILE;
struct _IO_FILE { int lock; };
static FILE f = {  .lock = -1};
FILE *const (stdin) = &f;

this construct, compiled with -g, triggers the bug in another object that references it in an unused function when linked with -s --gc-sections...

for example:

typedef struct _IO_FILE FILE;
extern FILE *const stdin;

typedef struct pcap pcap_t;
struct pcap_sf {  FILE *rfile;  };
struct pcap {       struct pcap_sf sf;  };

static void unused_func_referencing_stdin(pcap_t *p) {
   if (p->sf.rfile != (stdin))   (void)fclose(p->sf.rfile);
}

int pcap_loop(pcap_t *p, int cnt, void* callback, char *user) {   }


i reduced the testcase again to the bare minimum: it's just 2 files and 2 libc files now, with a total of about 20 lines.
Comment 16 Nick Clifton 2014-07-02 14:23:21 UTC
... and yet the testcase compiles and links without any problems:

 % make
 arm-linux-gnueabi-gcc -std=c99 -nostdinc -ffreestanding -g -fno-stack-protector -c -o libc/__libc_start_main.o libc/__libc_start_main.c
 arm-linux-gnueabi-gcc -std=c99 -nostdinc -ffreestanding -g -fno-stack-protector -c -o libc/exit.o libc/exit.c
 arm-linux-gnueabi-gcc -std=c99 -nostdinc -ffreestanding -g -fno-stack-protector -c -o libc/stdin.o libc/stdin.c
 rm -f libc.a
 arm-linux-gnueabi-ar rc libc.a libc/__libc_start_main.o libc/exit.o libc/stdin.o
 arm-linux-gnueabi-ranlib libc.a
 arm-linux-gnueabi-gcc -ffunction-sections -fdata-sections -s -g0 -c -o test.o test.c
 arm-linux-gnueabi-gcc -ffunction-sections -fdata-sections -s -g0 -c -o pcap.o pcap.c
 rm -f libtest1.a
 arm-linux-gnueabi-ar rc libtest1.a pcap.o
 arm-linux-gnueabi-ld -Bstatic -X -m armelf_linux_eabi -o test.elf -s crt/crt1.o crt/crti.o /arm-linux-gnueabi/libgcc/crtbeginT.o \
-L . -L /arm-linux-gnueabi/libgcc test.o --gc-sections -ltest1 --start-group -lgcc_eh -lgcc -lc --end-group \
--start-group -lgcc -lgcc_eh -lc --end-group /arm-linux-gnueabi/libgcc/crtend.o crt/crtn.o
  %


What type of host machine are you using to build this test case ?
Comment 17 maillist-gdb 2014-07-02 14:52:19 UTC
the "host" (i.e. the machine the compiler runs on) is x86_64-unknown-linux-gnu (sabotage linux using musl libc).

binutils 2.24 was built with these flags:
--target=arm-linux-musleabi

and the entire toolchain using these scripts:
https://github.com/GregorR/musl-cross

does your /arm-linux-gnueabi/libgcc directory contain a libc.a as well ?
this could prevent the built libc.a from getting used.
using the testcase here, it doesn't (checked with strace -f make 2>&1 | grep open | grep -v ENOENT)
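
A small sketch of the strace check above, narrowed to the question of which libc.a was actually opened. The function name opened_libc_archives is hypothetical. It assumes strace prints calls in the form open("path", ...) = N or openat(AT_FDCWD, "path", ...) = N and marks failed calls with "= -1".

```shell
# opened_libc_archives
# Reads strace output on stdin and prints, once each, the paths ending in
# libc.a that were passed to open()/openat() calls which did not fail.
opened_libc_archives() {
    sed -n -e '/= -1 /d' \
           -e 's/.*open[a-z]*(.*"\([^"]*libc\.a\)".*/\1/p' | sort -u
}
```

It could be used as `strace -f make 2>&1 | opened_libc_archives`; more than one line of output would suggest that a libc.a other than the one built by the testcase is being picked up.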
Comment 18 maillist-gdb 2014-07-02 20:14:31 UTC
i was able to reproduce the issue on a glibc box (opensuse 11.2) with the latest testcase and pre-compiled musl-cross toolchain from https://e82b27f594c813a5a4ea5b07b06f16c3777c3b8c.googledrive.com/host/0BwnS5DMB0YQ6bDhPZkpOYVFhbk0/musl-1.1.1/crossx86-arm-linux-musleabi-1.1.1.tar.xz

but not with the precompiled linaro arm-linux-gnueabihf *hardfloat* toolchain from
http://releases.linaro.org/14.04/components/toolchain/binaries/gcc-linaro-arm-linux-gnueabihf-4.8-2014.04_linux.tar.xz

i was, however, able to reproduce the issue with the latest codesourcery toolchain:
https://sourcery.mentor.com/GNUToolchain/package12774/public/arm-none-eabi/arm-2014.05-28-arm-none-eabi-i686-pc-linux-gnu.tar.bz2

supplying a stub memcpy.c
int memcpy(void *a, void *b, long n) { return 0; }
and a stub abort.c
void abort(void) { for(;;); }
in libc/ to satisfy libgcc dependencies of the codesourcery toolchain,
i get this:

arm-none-linux-gnueabi-ld -Bstatic -X -m armelf_linux_eabi -o test.elf -s crt/crt1.o crt/crti.o ~/musl-cross/arm-2014.05/lib/gcc/arm-none-linux-gnueabi/4.8.3//crtbeginT.o \
-L . -L ~/musl-cross/arm-2014.05/lib/gcc/arm-none-linux-gnueabi/4.8.3/ test.o --gc-sections -ltest1 --start-group -lgcc_eh -lgcc -lc --end-group \
--start-group -lgcc -lgcc_eh -lc --end-group ~/musl-cross/arm-2014.05/lib/gcc/arm-none-linux-gnueabi/4.8.3//crtend.o crt/crtn.o
arm-none-linux-gnueabi-ld: BFD (Sourcery CodeBench Lite 2014.05-29) 2.24.51.20140217 assertion fail /scratch/maciej/arm-linux-2014.05-rel/obj/binutils-src-2014.05-29-arm-none-linux-gnueabi-i686-pc-linux-gnu/bfd/elf32-arm.c:12478
make: *** [test.elf] Error 1
Comment 19 maillist-gdb 2014-07-02 20:17:17 UTC
oops wrong link, the codesourcery toolchain which raised the assertion was

https://sourcery.mentor.com/GNUToolchain/package12813/public/arm-none-linux-gnueabi/arm-2014.05-29-arm-none-linux-gnueabi-i686-pc-linux-gnu.tar.bz2
Comment 20 maillist-gdb 2014-08-06 20:32:48 UTC
(In reply to maillist-gdb from comment #18)
> i was able to reproduce the issue on a glibc box (opensuse 11.2) with the
> latest testcase 
> i was, however able to reproduce the issue with the latest codesourcery
> toolchain:
> https://sourcery.mentor.com/GNUToolchain/package12774/public/arm-none-eabi/
> arm-2014.05-28-arm-none-eabi-i686-pc-linux-gnu.tar.bz2

to clarify, i used the (latest) codesourcery toolchain on the above mentioned openSuSE box, and could reproduce the issue.

the toolchain however is the one from the link of comment #19

Nick, would you care to take another look with the above mentioned toolchain?
(and after you've seen the bug, maybe updating the binutils version it uses)

Since i spent nearly 2 days of effort hunting down the issue, it would be a pity if this PR died without the issue getting solved.
Comment 21 Nick Clifton 2014-08-19 13:57:35 UTC
I am sorry but this bug is just not reproducible with the FSF mainline binutils sources. :-( I can only conclude that the bug must be something to do with whatever patches CodeSourcery have applied to their toolchain.

Nick
Comment 22 maillist-gdb 2014-08-19 15:07:15 UTC
are you sure you're not using a hardfloat toolchain ? those seem to be immune to the bug. anything else i tested is affected.
Comment 23 maillist-gdb 2014-10-25 11:39:29 UTC
Created attachment 7847
updated testcase

updated testcase: less code, better makefile (libgcc dir is automatically found)
Comment 24 maillist-gdb 2014-10-25 11:51:15 UTC
(In reply to Nick Clifton from comment #21)
> I am sorry but this bug is just not reproducible with the FSF mainline
> binutils sources. :-( I can only conclude that the bug must be something to
> do with whatever patches CodeSourcery have applied to their toolchain.

the bug happens with vanilla unpatched binutils 2.24, as well as with the codesourcery toolchain, so it's definitely not due to custom codesourcery patches

(In reply to maillist-gdb from comment #22)
> are you sure you're not using a hardfloat toolchain ? those seem to be
> immune to the bug. anything else i tested is affected.

i just tested with a musl-cross arm-eabihf toolchain, the bug exists there as well.
so it's interesting that it doesn't happen with the linaro hardfloat toolchain.
Comment 25 maillist-gdb 2014-10-25 16:43:10 UTC
according to tests done with latest binutils snapshot (2.24.90) the issue is now fixed. at least my testcase doesn't trigger the assert anymore.