This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Strange ld.gold segmentation error issues.


Hi list,

I have seen a few segmentation faults from ld.gold in the last few
weeks while linking the Mozilla Thunderbird mail client.
This had not happened in the roughly 12 months since I began using
ld.gold for linking last year.

My system:
Linux vm-debian-amd64 3.12-1-amd64 #1 SMP Debian 3.12.9-1 (2014-02-01)
x86_64 GNU/Linux

Debian GNU/Linux 64-bit: I am using the testing repository as well as
the main repository.

/usr/bin/ld.gold has been upgraded within the last couple of months.

/usr/bin/ld.gold --version
GNU gold (GNU Binutils for Debian 2.24.51.20140425) 1.11
Copyright (C) 2014 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later
version.
This program has absolutely no warranty.

I think the cause of the segmentation fault may be related to Debian's
binutils 2.24.51.20140425, which was installed recently.  (It seems
to be based on a git snapshot of binutils with a few tweaks for
multi-architecture support under Debian. Nothing stood out that might
affect x86-64 linking when I looked at it, but I could be wrong.)

Here is the information about the segfaults, as recorded by the Linux
kernel for the few errors observed so far.
========================================

messages:May 12 22:40:38 vm-debian-amd64 kernel: [121787.862407]
ld.gold[16371]: segfault at 2b2df5ec1000 ip 00000000004505c0 sp
00007fff2bc98ca0 error 4 in ld.gold[400000+25e000]
messages:May 13 01:22:29 vm-debian-amd64 kernel: [131503.909365]
ld.gold[8245]: segfault at 2b19e40d9000 ip 00000000004505c0 sp
00007fffa53dcc70 error 4 in ld.gold[400000+25e000]
messages:May 17 06:31:38 vm-debian-amd64 kernel: [488306.624773]
ld.gold[10382]: segfault at 2b2a7d625000 ip 00000000004505c0 sp
00007fff33a80d70 error 4 in ld.gold[400000+25e000]
root@vm-debian-amd64:/var/log#
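
The kernel lines above can be picked apart mechanically. The following
Python sketch is not part of the original report. It assumes the x86
page-fault error-code layout (bit 0: protection violation vs. page not
present, bit 1: write vs. read, bit 2: user vs. kernel mode) and that
every numeric field in the message is hexadecimal. The helper names
parse_segfault and decode_error are made up for illustration.

```python
import re

# Matches the x86 user-space segfault message in the form shown in the logs.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)"
    r"\s+ip (?P<ip>[0-9a-f]+)\s+sp (?P<sp>[0-9a-f]+)\s+error (?P<err>[0-9a-f]+)"
    r"\s+in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(text):
    """Return the fields of one segfault message, or None if none is found.

    Whitespace is normalised first because the mail client wrapped each
    message over several lines.
    """
    m = SEGFAULT_RE.search(" ".join(text.split()))
    if m is None:
        return None
    fields = m.groupdict()
    rec = {"comm": fields["comm"], "pid": int(fields["pid"]), "obj": fields["obj"]}
    for key in ("addr", "ip", "sp", "err", "base", "size"):
        rec[key] = int(fields[key], 16)
    return rec


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code (assumed layout)."""
    return (
        "protection violation" if code & 1 else "page not present",
        "write" if code & 2 else "read",
        "user mode" if code & 4 else "kernel mode",
    )


# The ld.new message from this report, wrapped as it appears in the mail.
SAMPLE = """May 19 21:20:47 vm-debian-amd64 kernel: [714565.973445] ld.new[26462]:
segfault at 2b929377c000 ip 0000000000517240 sp 00007fff8162c0d0 error 4
in ld.new[400000+358000]"""

rec = parse_segfault(SAMPLE)
print(rec["comm"], rec["pid"], decode_error(rec["err"]))
print("ip offset into mapping:", hex(rec["ip"] - rec["base"]))
print("fault address page-aligned:", rec["addr"] % 4096 == 0)
```

Under those assumptions, "error 4" decodes to a user-mode read of a
page that is not mapped. All of the faulting addresses in the logs end
in 000, i.e. they sit on a 4 KiB page boundary, which would be
consistent with a read running off the end of a mapped region, though
that is only a guess.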

Unfortunately, the /usr/bin/ld.gold installed by the Debian packaging
system is stripped, so I cannot correlate ip 00000000004505c0 to a
function symbol.

So I installed a binary compiled from the Debian package source to
figure out where the error occurs.

ls -ltr /usr/bin/ld*
-rwxr-xr-x 1 root root  2506064 Apr 28 07:11 /usr/bin/ld.gold*
-rwxr-xr-x 1 root root  1047648 Apr 28 07:11 /usr/bin/ld.bfd*
lrwxrwxrwx 1 root root        6 Apr 28 07:11 /usr/bin/ld -> ld.bfd*
-rwxr-xr-x 1 root root     5280 Apr 28 13:01 /usr/bin/ldd*
-rwxr-xr-x 1 root root 68814160 May 18 06:10 /usr/bin/ld-new*
-rwxr-xr-x 1 root root 68814104 May 19 21:15 /usr/bin/ld.new*  <-- This!
ishikawa@vm-debian-amd64:/tmp$

ld.new above is my own version of ld.gold compiled from
source with debug symbols.
I invoke it from my ~/bin/ld script.

After installing the non-stripped ld.new and recompiling
everything for Mozilla TB, I got another segmentation fault.

Under /var/log:
grep segfault messages
May 19 21:20:47 vm-debian-amd64 kernel: [714565.973445] ld.new[26462]:
segfault at 2b929377c000 ip 0000000000517240 sp 00007fff8162c0d0 error 4
in ld.new[400000+358000]
root@vm-debian-amd64:/var/log#

From the output of "nm ld.new | sort -n -k 1":
0000000000516980 T _ZN4gold9Gdb_indexD2Ev
0000000000517220 T _ZN4gold9Gdb_index10add_symbolEiPKch
0000000000517620 T _ZN4gold21Gdb_index_info_reader26read_pubnames_and_pubtypesEPNS_9Dwarf_dieE

From an excerpt of "nm ld.new | sort -n -k 1 | c++filt":
0000000000516940 T gold::Gdb_index::print_stats()
0000000000516980 T gold::Gdb_index::~Gdb_index()
0000000000516980 T gold::Gdb_index::~Gdb_index()
0000000000517220 T gold::Gdb_index::add_symbol(int, char const*, unsigned char)
0000000000517620 T gold::Gdb_index_info_reader::read_pubnames_and_pubtypes(gold::Dwarf_die*)

Since ip 0000000000517240 falls between 0000000000517220 and
0000000000517620, I think the error (segfault) happened in
gold::Gdb_index::add_symbol(int, char const*, unsigned char).
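
The lookup described above (find the last symbol whose address is not
greater than the faulting ip) can be sketched in Python as follows.
This is an illustration only: the function names load_symbols and
lookup are invented here, and the input is just the demangled nm
excerpt quoted in this message, not the full symbol table.

```python
import bisect

# Demangled nm excerpt from this report (wrapped lines joined).
NM_OUTPUT = """\
0000000000516940 T gold::Gdb_index::print_stats()
0000000000516980 T gold::Gdb_index::~Gdb_index()
0000000000517220 T gold::Gdb_index::add_symbol(int, char const*, unsigned char)
0000000000517620 T gold::Gdb_index_info_reader::read_pubnames_and_pubtypes(gold::Dwarf_die*)
"""


def load_symbols(nm_text):
    """Parse 'address type name' lines into a list of (address, name), sorted by address."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # e.g. undefined symbols, which carry no address
        addr, _kind, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue  # first field was not a hex address
    symbols.sort()
    return symbols


def lookup(symbols, ip):
    """Return (name, offset) for the nearest symbol at or below ip, or None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, ip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, ip - addr


symbols = load_symbols(NM_OUTPUT)
name, offset = lookup(symbols, 0x517240)
print(f"{name} + {offset:#x}")
```

With the excerpt above this resolves 0x517240 to add_symbol plus 0x20.
The nearest preceding symbol is only a best guess, because symbol sizes
are not taken into account; "nm -S" prints sizes, and since ld.new has
debug symbols, "addr2line -e ld.new 0x517240" should be able to give a
file and line directly. Note also that "sort -n" stops parsing at the
first hex letter, so plain "sort" (the addresses are fixed-width) or
"nm -n" is the safer way to order the table.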

But the funny thing is that the libmozalloc.so that was created seems
to be usable by a subsequent build (via the top-level make) when I
invoke it again. It did not seem to be necessary to re-create it!
Very strange.
Also, running the build again after deleting libmozalloc.so completes
successfully without a segmentation fault (!).
So is it history-sensitive?!

So maybe it is related to an OOM condition (or maybe not).

Here is the error log from the Mozilla Thunderbird build when the above
segmentation fault occurred:

    [...]

mozalloc.o       <--- these are the compilation targets.
mozalloc_abort.o
mozalloc_oom.o
/REF-COMM-CENTRAL/comm-central/mozilla/memory/mozalloc/VolatileBufferFallback.cpp:
In member function ‘bool mozilla::VolatileBuffer::Init(size_t, size_t)’:
/REF-COMM-CENTRAL/comm-central/mozilla/memory/mozalloc/VolatileBufferFallback.cpp:33:53:
warning: ignoring return value of ‘int moz_posix_memalign(void**,
size_t, size_t)’, declared with attribute warn_unused_result
[-Wunused-result]
   (void)moz_posix_memalign(&mBuf, aAlignment, aSize);
                                                     ^
libmozalloc.so        <==== this is the target that caused the ld
                            segmentation fault; the command arguments
                            are listed later.
Segmentation fault
collect2: error: ld returned 139 exit status
/REF-COMM-CENTRAL/comm-central/mozilla/nsprpub/config/rules.mk:298:
recipe for target 'libnspr4.so' failed

That error was triggered by my non-stripped ld.new.
The logger message in my ~/bin/ld script, quoted below, incorrectly
refers to it as ld.gold. It was run with the following arguments when
the segmentation fault occurred.

---- ~/bin/ld
:
#
logger "my version of non-stripped ld.gold called with: $*"
: /usr/bin/ld.gold $*
/usr/bin/ld.new $*
--------

May 19 21:20:47 vm-debian-amd64 ishikawa: my version of non-stripped
ld.gold called with: --sysroot=/ --build-id --eh-frame-hdr -m elf_x86_64
--hash-style=gnu -shared -o libnspr4.so
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/crtbeginS.o
-L/usr/lib/gcc/x86_64-linux-gnu/4.8
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../lib
-L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu
-L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.8/../../..
--gdb-index -soname libnspr4.so ./prvrsion.o io/./prfdcach.o
io/./prmwait.o io/./prmapopt.o io/./priometh.o io/./pripv6.o
io/./prlayer.o io/./prlog.o io/./prmmap.o io/./prpolevt.o io/./prprf.o
io/./prscanf.o io/./prstdio.o threads/./prcmon.o threads/./prrwlock.o
threads/./prtpd.o linking/./prlink.o malloc/./prmalloc.o
malloc/./prmem.o md/./prosdep.o memory/./prshm.o memory/./prshma.o
memory/./prseg.o misc/./pralarm.o misc/./pratom.o misc/./prcountr.o
misc/./prdtoa.o misc/./prenv.o misc/./prerr.o misc/./prerror.o
misc/./prerrortable.o misc/./prinit.o misc/./prinrval.o misc/./pripc.o
misc/./prlog2.o misc/./prlong.o misc/./prnetdb.o misc/./praton.o
misc/./prolock.o misc/./prrng.o misc/./prsystem.o misc/./prthinfo.o
misc/./prtpool.o misc/./prtrace.o misc/./prtime.o pthreads/./ptsynch.o
pthreads/./ptio.o pthreads/./ptthread.o pthreads/./ptmisc.o
md/unix/./unix.o md/unix/./unix_errors.o md/unix/./uxproces.o
md/unix/./uxrng.o md/unix/./uxshm.o md/unix/./uxwrap.o md/unix/./linux.o
md/unix/./os_Linux_x86_64.o -z text --build-id -lpthread -ldl -lrt -lgcc
--as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.8/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crtn.o
May 19 21:20:47 vm-debian-amd64 kernel: [714565.973445] ld.new[26462]:
segfault at 2b929377c000 ip 0000000000517240 sp 00007fff8162c0d0 error 4
in ld.new[400000+358000]
root@vm-debian-amd64:/var/log# exit

Any ideas on where to look?

Yes, the Mozilla Thunderbird build invokes make with "-j4" or
similar, so there are processes running in parallel.

I have 9 GiB of memory. (I am running this linux image in vmplayer and
so the amount of memory can be tweaked.)
From dmesg:
[    0.000000] Memory: 9002568K/9215480K available (4723K kernel code,
679K rwdata, 1596K rodata, 972K init, 944K bss, 212912K reserved)

I have enough swap space, so running out of virtual memory is hard to
believe, though it cannot be ruled out. However, I had been building
Mozilla Thunderbird with ld.gold for the last 12 months or so until the
binutils upgrade in the last couple of months.

My compiler is gcc-4.8 (I have kept it fixed, although Debian seems
to offer 4.9 now).

I have looked through the binutils mail archive, and the only thing in
the last couple of months of postings that I could even vaguely
associate with my segmentation fault was got_index being incorrectly
handled as an unsigned int. I am not sure whether that could be related
to this intermittent error.

It is hard to understand why the error could not be reproduced by
running make again after the target libmozalloc.so, which caused the
segmentation fault, was removed :-(

TIA






