Bug 24074 - ld fails silently when linking MinGW 64-bit app with BOINC libs
Summary: ld fails silently when linking MinGW 64-bit app with BOINC libs
Status: UNCONFIRMED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.25
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-07 23:43 UTC by Daniel
Modified: 2019-12-29 16:38 UTC (History)

See Also:
Host:
Target:
Build:
Last reconfirmed:
Project(s) to access:
ssh public key:


Attachments
Captured logs (88.21 KB, application/x-zip-compressed)
2019-01-07 23:43 UTC, Daniel
Broken BOINC libs (1.03 MB, application/x-xz)
2019-01-13 13:29 UTC, Daniel

Description Daniel 2019-01-07 23:43:59 UTC
Created attachment 11520 [details]
Captured logs

I was trying to link a 64-bit MinGW app using a cross-compiler on CentOS Linux. Unfortunately it failed, and ld did not produce any error message; I only got "collect2: error: ld returned 1 exit status". This does not happen when linking with the 32-bit MinGW toolchain.

Here are the steps to reproduce it, and the results of my attempts to investigate:

- System: CentOS Linux release 7.6.1810 (Core)

- MinGW Binutils version:
/usr/bin/x86_64-w64-mingw32-ld --version
GNU ld (GNU Binutils) 2.25
Copyright (C) 2014 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

- Installed MinGW 64bit toolchain from EPEL repo (mingw64-gcc-c++.x86_64, version 4.9.3-1.el7)

- Compiled gcc 8.2.0 for MinGW, configured as follows:
/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/root/gcc-8.2.0-mingw64/libexec/gcc/x86_64-w64-mingw32/8.2.0/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: ../gcc-8.2.0-src/configure --prefix=/root/gcc-8.2.0-mingw64 --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --with-gnu-as --with-gnu-ld --verbose --without-newlib --disable-multilib --disable-plugin --with-system-zlib --disable-nls --without-included-gettext --disable-win32-registry --enable-languages=c,c++ --enable-threads=posix --enable-libgomp --target=x86_64-w64-mingw32 --with-sysroot=/usr/x86_64-w64-mingw32/sys-root --with-gxx-include-dir=/usr/x86_64-w64-mingw32/sys-root/mingw/include/c++ --with-as=/usr/bin/x86_64-w64-mingw32-as --with-ld=/usr/bin/x86_64-w64-mingw32-ld
Thread model: posix
gcc version 8.2.0 (GCC)

- Cloned the latest BOINC version from https://github.com/BOINC/boinc

- Compiled BOINC libs for MinGW 64-bit target:

cd <repo>/lib

mkdir -p /root/boinc82/mingw64/

make -f Makefile.mingw CC=/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-gcc CXX=/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-g++ AR=x86_64-w64-mingw32-ar RANLIB=x86_64-w64-mingw32-ranlib BOINC_PREFIX=/root/boinc82/mingw64/

make -f Makefile.mingw CC=/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-gcc CXX=/root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-g++ AR=x86_64-w64-mingw32-ar RANLIB=x86_64-w64-mingw32-ranlib BOINC_PREFIX=/root/boinc82/mingw64/ install

- created file a.cpp with:
[code]
extern "C" {
  int boinc_init();
}

int main()
{
  boinc_init();
  return 0;
}
[/code]

- compiled it:
x86_64-w64-mingw32-g++ -c -o a.o a.cpp -pthread

- linked:
x86_64-w64-mingw32-g++ -static a.o -o test -pthread /root/boinc82/mingw64//lib/libboinc_api.a /root/boinc82/mingw64//lib/libboinc.a; echo $?
collect2: error: ld returned 1 exit status
1

- I tried adding -Wl,--verbose to see more details. The output looks valid to me; it ends as follows. The full output is attached.

attempt to open /root/gcc-8.2.0-mingw64/lib/gcc/x86_64-w64-mingw32/8.2.0/crtend.o succeeded
/root/gcc-8.2.0-mingw64/lib/gcc/x86_64-w64-mingw32/8.2.0/crtend.ocollect2: error: ld returned 1 exit status

- I also tried strace -f <cmd>; the output is attached. It seems that the read operation at line 22467 returned some unexpected data, and ld started its cleanup sequence (a series of munmap and close calls follows):

[pid 18361] lseek(8, 167936, SEEK_SET)  = 167936
[pid 18361] read(8, "NSs6assignEPKc\0_ZNKSt6vectorISsS"..., 4096) = 4096
[pid 18361] read(8, "\0\0\0\0\0\0\0\0\0\0\275\0\0\0\0\0\t\0Z\1\0\0\0\0\0\0\0\0\0\0\0\0"..., 122880) = 122880
[pid 18361] read(8, "\0\0\2\0\0\0\2\0\0\0 \24\0\0\0\0\0\0\304\1\0\0\0\0\0\0\2\0\0\0\2\0"..., 4096) = 4096
[pid 18361] munmap(0x7ff9ccca6000, 23789568) = 0
[pid 18361] munmap(0x7ff9d5d69000, 270336) = 0
[pid 18361] munmap(0x7ff9ceae6000, 622592) = 0
[pid 18361] close(8)                    = 0

- I also tried to run this under gdb:

gdb --args x86_64-w64-mingw32-g++ -static a.o -o test -pthread /root/boinc82/mingw64//lib/libboinc_api.a /root/boinc82/mingw64//lib/libboinc.a

b exit
b _exit
set follow-fork-mode child
set detach-on-fork on
set follow-exec-mode same
r

Starting program: /root/gcc-8.2.0-mingw64/bin/x86_64-w64-mingw32-g++ -static a.o -o test -pthread /root/boinc82/mingw64//lib/libboinc_api.a /root/boinc82/mingw64//lib/libboinc.a
[New process 18471]
process 18471 is executing new program: /root/gcc-8.2.0-mingw64/libexec/gcc/x86_64-w64-mingw32/8.2.0/collect2
[New process 18472]
process 18472 is executing new program: /usr/bin/x86_64-w64-mingw32-ld
[Switching to process 18472]

Breakpoint 1, __GI_exit (status=1) at exit.c:98
98      {
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-36.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  __GI_exit (status=1) at exit.c:98
#1  0x00000000004cb12f in xexit (code=code@entry=1) at ../../libiberty/xexit.c:51
#2  0x0000000000418160 in ldwrite () at ../../ld/ldwrite.c:590
#3  0x0000000000403bed in main (argc=57, argv=0x7fffffffd938) at ../../ld/ldmain.c:427
(gdb)
Comment 1 Daniel 2019-01-08 00:12:42 UTC
I also tried copying the .o/.a files to my Windows machine and using the MinGW toolchain from the latest Cygwin, with binutils 2.29.1.20171006. This time ld crashed with the messages below. I tried to run it under gdb, but for some reason gdb did not catch the signal.

collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
/usr/lib/gcc/x86_64-w64-mingw32/7.4.0/../../../../x86_64-w64-mingw32/bin/ld: BFD (GNU Binutils) 2.29.1.20171006 assertion fail /cygdrive/i/szsz/tmpp/cygwin64/mingw64-x86_64/mingw64-x86_64-binutils-2.29.1.787c9873-1.x86_64/src/binutils-gdb/bfd/cofflink.c:265

Here are the contents of the ld.exe.stackdump file:

Exception: STATUS_ACCESS_VIOLATION at rip=0000000003E
rax=00000001004EFC80 rbx=0000000600137010 rcx=0000000600137010
rdx=0000000600137E18 rsi=00000001004ECF38 rdi=0000000000000001
r8 =00000000FFFFC650 r9 =0000000180144F30 r10=0000000100000000
r11=000000010043747B r12=00000000FFFFC650 r13=0000000600137E18
r14=000000060013DA80 r15=0000000100574C00
rbp=00000006001295A0 rsp=00000000FFFFC588
program=C:\cygwin64\usr\x86_64-w64-mingw32\bin\ld.exe, pid 140, thread main
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
006001295A0  0000000003E (00100524F48, 0010053F5E0, 00000000109, 00000000001)
006001295A0  00100453FA7 (00100574C00, 006001295A0, 0060012DC90, 00100574C00)
006001295A0  0010045496E (001004ECF38, 00000000000, 00100428B2B, 000FFFFC77C)
006001295A0  0010043C85C (00100574C00, 00100454910, 001800BF54C, 00000000000)
00000000000  0010040F1E8 (0010050499C, 00600074510, 0010050499C, 00000000000)
00000000001  0010040FD20 (000FFFFCAB0, 0000000003A, 001800BF7C2, 000FFFFCAB0)
0010050FB40  00100411E16 (003EBDC91C0, 006000003A0, 000FFFFCBC0, 00180275184)
0010050FB40  001004EA3C6 (000FFFFCAB0, 006000003B8, 00000000000, 006000003C4)
000FFFFCCD0  00180049E16 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  00180047973 (00000000000, 00000000000, 00000000000, 00000000000)
000FFFFFFF0  00180047A24 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace
Comment 2 Nick Clifton 2019-01-09 17:02:01 UTC
(In reply to Daniel from comment #0)

Hi Daniel,

> I was trying to link 64-bit MinGW app using crosscompiler on CentOS Linux.

Is the bionic library very big ?  Mysterious failures like this are often 
caused by the system running out of resources.  Usually memory or disk space.

> GNU ld (GNU Binutils) 2.25

2.25 is an old version of the binutils.  We are currently on release 2.31...

  
> /root/gcc-8.2.0-mingw64/lib/gcc/x86_64-w64-mingw32/8.2.0/crtend.o succeeded
> /root/gcc-8.2.0-mingw64/lib/gcc/x86_64-w64-mingw32/8.2.0/crtend.ocollect2:
> error: ld returned 1 exit status


> #2  0x0000000000418160 in ldwrite () at ../../ld/ldwrite.c:590

Sadly that bit of code is very unhelpful:

 if (!bfd_final_link (link_info.output_bfd, &link_info))
    {
      /* If there was an error recorded, print it out.  Otherwise assume
	 an appropriate error message like unknown symbol was printed
	 out.  */

      if (bfd_get_error () != bfd_error_no_error)
	einfo (_("%F%P: final link failed: %E\n"));
      else
	xexit (1);   <==== this is line 590
    }

So the linker is terminating, and hoping that an error message has already
been displayed. :-(


> /usr/lib/gcc/x86_64-w64-mingw32/7.4.0/../../../../x86_64-w64-mingw32/bin/ld: 
> BFD (GNU Binutils) 2.29.1.20171006 assertion fail /cygdrive/i/szsz
> /tmpp/cygwin64/mingw64-x86_64/mingw64-x86_64-binutils-2.29.1.787c9873-1.x86_64
> /src/binutils-gdb/bfd/cofflink.c:265

OK, well that assertion is checking that the size of ordinary symbols and
auxiliary symbols is the same:

   BFD_ASSERT (symesz == bfd_coff_auxesz (abfd));

Is it possible for you to find these two values ?  I have a theory that
the problem is that one or maybe both of them has not been initialised.
On Linux this probably goes undetected because allocated memory is usually
zeroed, even if not explicitly requested by the program.  But under mingw32
the memory could contain any random value.  That is just a guess however.

Cheers
  Nick
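
To make the uninitialised-memory theory above concrete, here is a minimal C sketch. It is not BFD code: `struct coff_sizes`, `coff_sizes_new` and `coff_sizes_consistent` are hypothetical names invented for illustration, and only the two field names echo the values compared by the assertion. The idea is simply that zero-filled allocation makes a forgotten initialisation show up the same way on every host, instead of depending on what the heap happened to contain.

```c
#include <stdlib.h>

/* Hypothetical record (assumption, not BFD's real layout): holds the two
   sizes that the assertion in cofflink.c compares. */
struct coff_sizes {
    unsigned int symesz;   /* size of an ordinary symbol entry  */
    unsigned int auxesz;   /* size of an auxiliary symbol entry */
};

/* calloc returns zero-filled memory on every platform, so a field the
   caller forgets to set reads as 0 rather than as leftover heap data. */
struct coff_sizes *coff_sizes_new(void)
{
    return calloc(1, sizeof(struct coff_sizes));
}

/* Returns 1 when both sizes have been set and agree, 0 otherwise.
   A zero size is treated as "never initialised". */
int coff_sizes_consistent(const struct coff_sizes *s)
{
    return s != NULL && s->symesz != 0 && s->symesz == s->auxesz;
}
```

With a check like this, a record that was never filled in fails deterministically, and one where both fields are set to the same value (for example 18, the size of a classic COFF symbol table entry) passes.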

Comment 3 Daniel 2019-01-13 13:28:18 UTC
(In reply to Nick Clifton from comment #2)
> (In reply to Daniel from comment #0)
> 
> Hi Daniel,
> 
> > I was trying to link 64-bit MinGW app using crosscompiler on CentOS Linux.
> 
> Is the bionic library very big ?  Mysterious failures like this are often 
> caused by the system running out of resources.  Usually memory or disk space.

BOINC, not bionic :) BTW, the sources are at https://github.com/BOINC/boinc . The lib is not big. The machine where I compiled it had 17GB of free HDD space and 64GB RAM. I also did not see any "killed" errors caused by OOM from make when building. I have seen them in the past on an Odroid, so I am familiar with them.

> Sadly that bit of code is very unhelpful:
> 
>  if (!bfd_final_link (link_info.output_bfd, &link_info))
> [cut]
> So the linker is terminating, and hoping that an error message has already
> been displayed. :-(

Yes, definitely. It would be good to improve this.
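
As a rough illustration of the kind of improvement meant here, the sketch below always produces some message when the final link fails, even if no error code was recorded. Everything in it is hypothetical: `enum link_error`, `link_error_text` and `describe_link_failure` are stand-ins invented for this example and are not part of ld or BFD.

```c
#include <stdio.h>

/* Hypothetical stand-in for a recorded error state (not the BFD API). */
enum link_error {
    LINK_ERROR_NONE = 0,
    LINK_ERROR_BAD_VALUE,
    LINK_ERROR_NO_MEMORY
};

const char *link_error_text(enum link_error e)
{
    switch (e) {
    case LINK_ERROR_BAD_VALUE: return "bad value";
    case LINK_ERROR_NO_MEMORY: return "memory exhausted";
    default:                   return "no error";
    }
}

/* Format the message to print when the final link fails.
   messages_printed is the number of diagnostics already shown to the user.
   Unlike a bare exit(1), this never leaves the user with no output.
   Returns the value reported by snprintf. */
int describe_link_failure(enum link_error e, int messages_printed,
                          char *buf, size_t len)
{
    if (e != LINK_ERROR_NONE)
        return snprintf(buf, len, "final link failed: %s",
                        link_error_text(e));
    if (messages_printed > 0)
        return snprintf(buf, len, "final link failed (see messages above)");
    return snprintf(buf, len,
                    "final link failed: no error was recorded and "
                    "no diagnostic was printed");
}
```

The last branch is the case hit in this report: the failure path assumed a message had already been printed, but none had.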

> > /usr/lib/gcc/x86_64-w64-mingw32/7.4.0/../../../../x86_64-w64-mingw32/bin/ld: 
> > BFD (GNU Binutils) 2.29.1.20171006 assertion fail /cygdrive/i/szsz
> > /tmpp/cygwin64/mingw64-x86_64/mingw64-x86_64-binutils-2.29.1.787c9873-1.x86_64
> > /src/binutils-gdb/bfd/cofflink.c:265
> 
> OK, well that assertion is checking that the size of ordinary symbols and
> auxiliary symbols is the same:
> 
>    BFD_ASSERT (symesz == bfd_coff_auxesz (abfd));

Unfortunately gdb under Cygwin is for some reason not catching this crash; I tried that before I logged this issue. I will try to build a newer binutils on Linux and see what I find there.

In the meantime I found another thing. It turned out that the configure script generates/modifies files which are then used during the MinGW build. In fact I had to edit config.h to disable HAVE_STRCASESTR, but I assumed that someone else had broken the MinGW build and that this had not been fixed yet. These generated files came from some older BOINC version.

When I cloned the BOINC repo into a new directory and started the build there, I got working libs. Here are their sizes:

[root@Wojslawice mingw64]# ls -lh /boinc2/820/mingw64/lib/
total 488K
-rw-r--r--. 1 root root 413K Jan 12 23:46 libboinc.a
-rw-r--r--. 1 root root  47K Jan 12 23:46 libboinc_api.a
-rw-r--r--. 1 root root  13K Jan 12 23:46 libboinc_graphics2.a
-rw-r--r--. 1 root root 6.9K Jan 12 23:46 libboinc_opencl.a

And this is for the broken build:

[root@Wojslawice mingw64]# ls -lh /root/boinc82/mingw64/lib/
total 6.7M
-rw-r--r--. 1 root root 6.7M Jan  5 19:00 libboinc.a
-rw-r--r--. 1 root root  47K Jan  5 19:00 libboinc_api.a
-rw-r--r--. 1 root root  13K Jan  5 19:00 libboinc_graphics2.a
-rw-r--r--. 1 root root 6.9K Jan  5 19:00 libboinc_opencl.a

At this moment I do not know whether the bad code was created by gcc, as or ld. I am going to find out, and create a small testcase so that you can reproduce this. I will either update this issue or log a new one for gcc.

I am attaching the broken libs to this issue, so you can check what exactly is wrong with them.
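
Since the broken libboinc.a is 6.7M against 413K for the working one, a per-member size listing of both archives is one way to narrow down which object files differ. The following is a minimal sketch of a walker for the common ar layout (8-byte "!<arch>\n" magic, then 60-byte member headers with the name in bytes 0-15 and the decimal size in bytes 48-57). The function names `ar_walk` and `print_member` are made up for this example, and names are reported raw: GNU-style long names (stored as "/offset" into the "//" table) are not resolved.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8
#define AR_HDR_LEN   60

typedef void (*ar_visit_fn)(const char *name, unsigned long size, void *ctx);

/* Walk an ar archive held in memory and call visit (if non-NULL) for each
   member. Returns the number of members, or -1 if the data is malformed. */
int ar_walk(const unsigned char *data, size_t len, ar_visit_fn visit, void *ctx)
{
    size_t pos = AR_MAGIC_LEN;
    int count = 0;

    if (len < AR_MAGIC_LEN || memcmp(data, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    while (pos + AR_HDR_LEN <= len) {
        const unsigned char *hdr = data + pos;
        char name[17], sizebuf[11];
        unsigned long size;

        /* Every member header ends with the two bytes "`\n". */
        if (hdr[58] != '`' || hdr[59] != '\n')
            return -1;

        memcpy(name, hdr, 16);          /* space-padded member name */
        name[16] = '\0';
        memcpy(sizebuf, hdr + 48, 10);  /* space-padded decimal size */
        sizebuf[10] = '\0';
        size = strtoul(sizebuf, NULL, 10);

        /* Reject a member that claims to extend past the end of the data. */
        if (size > len - pos - AR_HDR_LEN)
            return -1;

        if (visit)
            visit(name, size, ctx);
        count++;

        /* Member data is padded to an even offset. */
        pos += AR_HDR_LEN + size + (size & 1);
    }
    return count;
}

/* Example visitor: print one line per member. */
void print_member(const char *name, unsigned long size, void *ctx)
{
    (void)ctx;
    printf("%-16s %lu bytes\n", name, size);
}
```

A caller would read each archive into a buffer and run `ar_walk(buf, len, print_member, NULL)` on it, then compare the two listings. The special members named "/" (symbol index) and "//" (long-name table) appear in the output like any other entry.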
Comment 4 Daniel 2019-01-13 13:29:42 UTC
Created attachment 11538 [details]
Broken BOINC libs
Comment 5 Hannes Domani 2019-12-29 16:38:47 UTC
(In reply to Daniel from comment #4)
> Created attachment 11538 [details]
> Broken BOINC libs

I've just tried to compile & link the example a.cpp from the description, with the attached BOINC libs, and it works for me:

$ g++ -c a.cpp
$ g++ -static -oa.exe a.o -Lmingw64/lib -lboinc_api -lboinc

And the resulting a.exe starts without problem.

So I'm guessing the bug was already fixed.
I used gcc 9.2.0 and binutils 2.33.1 for my tests.