The gold linker segfaults during linking when it uses multiple threads.

The scenario:
1. I have multiple source files, compiled with gcc 8.2 with some optimization flags.
2. I use cmake to generate the makefile.
3. The linker crashes with a segfault:
```
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
```
I have reproduced the crash on centos 7.5, ubuntu and debian 9.5.

Test case here: https://github.com/alexandrudsc/gold-linker-threads-segfault

There is a docker file from which you can re-create the environment I used. For a simple test just run
```
bash build-docker-image.sh
```
This will create the docker image, compile the code and try to link. You can also start your own container if you want to run this multiple times.
I forgot to mention that on centos and ubuntu I compiled binutils from source.
Hi guys, I have reproduced the same bug using the gold linker from binutils 2.32. I don't know if there is any news on this, but this is a gentle reminder.
I see a segmentation fault trying to link a very simple program using `--threads --thread-count 2`. The full command is:
```
"/usr/bin/ld.gold" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o test_1.exe /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crt1.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtbegin.o -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/lib/x86_64-linux-gnu/../../lib64 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../.. -L/usr/lib/llvm-10/bin/../lib -L/lib -L/usr/lib -plugin /usr/lib/llvm-10/bin/../lib/LLVMgold.so -plugin-opt=mcpu=x86-64 -plugin-opt=thinlto hilbert.o --threads --thread-count 2 -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtend.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o
```
This is with `GNU gold (GNU Binutils for Ubuntu 2.34) 1.16` on Lubuntu 20.04.
A traceback with `gdb` gives:
```
#0  0x00007ffff3879f40 in llvm::PMTopLevelManager::addImmutablePass(llvm::ImmutablePass*) () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#1  0x00007ffff387991f in llvm::PMTopLevelManager::schedulePass(llvm::Pass*) () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#2  0x00007ffff469a58e in ?? () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#3  0x00007ffff469989c in llvm::lto::backend(llvm::lto::Config const&, std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>, unsigned int, std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, llvm::ModuleSummaryIndex&) () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#4  0x00007ffff4693434 in llvm::lto::LTO::runRegularLTO(std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>) () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#5  0x00007ffff4692f22 in llvm::lto::LTO::run(std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>, std::function<std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)> (unsigned int, llvm::StringRef)>) () from /usr/lib/llvm-10/bin/../lib/../lib/libLLVM-10.so.1
#6  0x00007ffff749f4ac in ?? () from /usr/lib/llvm-10/bin/../lib/LLVMgold.so
#7  0x000055555567c27f in ?? ()
#8  0x000055555567c3d0 in ?? ()
#9  0x00005555556c1b58 in ?? ()
#10 0x00005555556c1daa in ?? ()
#11 0x0000555555591ee5 in ?? ()
#12 0x00007ffff7ba20b3 in __libc_start_main (main=0x5555555918d0, argc=46, argv=0x7fffffffd458, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd448) at ../csu/libc-start.c:308
#13 0x0000555555592ace in ?? ()
```
I've made a repo with a Makefile that reproduces the bug here: https://github.com/r-barnes/gold_segfault_reproducer

The program I'm trying to compile is a simple "Hello, World!". The compilation string is:

    clang main.c -flto -fuse-ld=gold -Wl,--threads -Wl,--thread-count,4 -v --save-temps

I observe the segfault with both clang-8 and clang-10 using `GNU gold (GNU Binutils for Ubuntu 2.34) 1.16` running on Lubuntu 20.04.

The segfault is intermittent: not every run will trigger it. Removing `-flto` seems to stop the segfault. Using `-Wl,--thread-count,1` stops the segfault.
For me the simplest reproducer is the following one-liner:
"""
$ echo 'int main() {}' | x86_64-pc-linux-gnu-gcc -flto -fuse-ld=gold -Wl,--threads -Wl,--thread-count,32 -x c -
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
"""
(gcc-master, binutils-2.35.1, x86_64-pc-linux-gnu target)

binutils backtrace:
"""
(gdb) bt
#0  gold::Pluginobj::get_symbol_resolution_info (this=0x7fdc10001010, symtab=0x7ffe9622ef50, nsyms=<optimized out>, syms=<optimized out>, version=<optimized out>) at ../../binutils-2.35.1/gold/plugin.cc:1293
#1  0x00007fdc94747c7a in write_resolution () at /usr/src/debug/sys-devel/gcc-11.0.0_pre9999/gcc-11.0.0_pre9999/lto-plugin/lto-plugin.c:569
#2  all_symbols_read_handler () at /usr/src/debug/sys-devel/gcc-11.0.0_pre9999/gcc-11.0.0_pre9999/lto-plugin/lto-plugin.c:749
#3  0x000055e7fdf1004f in gold::Plugin::all_symbols_read (this=<optimized out>) at ../../binutils-2.35.1/gold/plugin.cc:403
#4  gold::Plugin_manager::all_symbols_read (this=0x55e7fe561360, workqueue=workqueue@entry=0x7ffe9622ec50, task=task@entry=0x55e7fe5bacc0, input_objects=<optimized out>, symtab=<optimized out>, dirpath=<optimized out>, mapfile=0x0, last_blocker=0x55e7fe5bad20) at ../../binutils-2.35.1/gold/plugin.cc:856
#5  0x000055e7fdf1018c in gold::Plugin_hook::run (this=0x55e7fe5bacc0, workqueue=0x7ffe9622ec50) at ../../binutils-2.35.1/gold/plugin.cc:1770
#6  0x000055e7fdf6ba70 in gold::Workqueue::find_and_run_task (this=0x7ffe9622ec50, thread_number=23) at ../../binutils-2.35.1/gold/workqueue.cc:319
#7  0x000055e7fdf6bcca in gold::Workqueue::process (this=0x7ffe9622ec50, thread_number=23) at ../../binutils-2.35.1/gold/workqueue.cc:495
#8  0x000055e7fdf6be23 in gold::Workqueue_threader_threadpool::process (thread_number=<optimized out>, this=<optimized out>) at ../../binutils-2.35.1/gold/workqueue-internal.h:92
#9  gold::Workqueue_thread::thread_body (arg=0x55e7fe5b97d0) at ../../binutils-2.35.1/gold/workqueue-threads.cc:117
#10 0x00007fdc9444be6e in start_thread (arg=0x7fdc3c132640) at pthread_create.c:463
#11 0x00007fdc94381a5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info threads
  Id   Target Id                           Frame
* 1    Thread 0x7fdc3c132640 (LWP 1087079) gold::Pluginobj::get_symbol_resolution_info (this=0x7fdc10001010, symtab=0x7ffe9622ef50, nsyms=<optimized out>, syms=<optimized out>, version=<optimized out>) at ../../binutils-2.35.1/gold/plugin.cc:1293
  2    Thread 0x7fdc90147640 (LWP 1087058) futex_wait_cancelable (private=0, expected=0, futex_word=0x55e7fe567d24) at ../sysdeps/nptl/futex-internal.h:183
...
  32   Thread 0x7fdc2812d640 (LWP 1087084) futex_wait_cancelable (private=0, expected=0, futex_word=0x55e7fe567d20) at ../sysdeps/nptl/futex-internal.h:183
"""

valgrind says the invalid access happens at the same location:
"""
==1087267== Thread 30:
==1087267== Invalid read of size 1
==1087267==    at 0x458800: gold::Pluginobj::get_symbol_resolution_info(gold::Symbol_table*, int, ld_plugin_symbol*, int) const (plugin.cc:1295)
==1087267==    by 0x484BC79: write_resolution (lto-plugin.c:569)
==1087267==    by 0x484BC79: all_symbols_read_handler (lto-plugin.c:749)
==1087267==    by 0x45704E: all_symbols_read (plugin.cc:403)
==1087267==    by 0x45704E: gold::Plugin_manager::all_symbols_read(gold::Workqueue*, gold::Task*, gold::Input_objects*, gold::Symbol_table*, gold::Dirsearch*, gold::Mapfile*, gold::Task_token**) (plugin.cc:856)
==1087267==    by 0x45718B: gold::Plugin_hook::run(gold::Workqueue*) (plugin.cc:1770)
==1087267==    by 0x4B2A6F: gold::Workqueue::find_and_run_task(int) (workqueue.cc:319)
==1087267==    by 0x4B2CC9: gold::Workqueue::process(int) (workqueue.cc:495)
==1087267==    by 0x4B2E22: process (workqueue-internal.h:92)
==1087267==    by 0x4B2E22: gold::Workqueue_thread::thread_body(void*) (workqueue-threads.cc:117)
==1087267==    by 0x4B42E6D: start_thread (pthread_create.c:463)
==1087267==    by 0x4C55A5E: clone (clone.S:95)
==1087267==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==1087267==
==1087267==
==1087267== Process terminating with default action of signal 11 (SIGSEGV): dumping core
"""
binutils from master does not seem to crash on the simple test from #comment7. Bisected the fix down to https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=d4820dac5e7608e24fba6d08cde9248b4c4b2928
"""
$ git bisect bad
d4820dac5e7608e24fba6d08cde9248b4c4b2928 is the first bad commit
commit d4820dac5e7608e24fba6d08cde9248b4c4b2928
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Nov 8 04:10:01 2020 -0800

    gold: Avoid sharing Plugin_list::iterator

    class Plugin_manager has

      // A pointer to the current plugin. Used while loading plugins.
      Plugin_list::iterator current_;

    The same iterator is shared by all threads. It is OK to use it to load
    plugins since only one thread loads plugins. Avoid sharing Plugin_list
    iterator in all other cases.

            PR gold/26200
            * plugin.cc (Plugin_manager::claim_file): Don't share Plugin_list
            iterator.
            (Plugin_manager::all_symbols_read): Likewise.
            (Plugin_manager::cleanup): Likewise.

 gold/ChangeLog |  8 ++++++++
 gold/plugin.cc | 34 +++++++++++++++++-----------------
 2 files changed, 25 insertions(+), 17 deletions(-)
"""
Looks related. Dupe of bug #26200?
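For anyone wondering why a shared iterator would produce an intermittent crash like this, here is a minimal sketch of the pattern the commit message describes. It is not gold's code: `ToyPlugin`, `ToyPluginManager`, `visit_all_shared`, `visit_all_local` and `count_complete_walks` are hypothetical names made up for illustration, and the list type is assumed. The point is only that a loop driven by a member iterator is unsafe once several threads run it, while a loop driven by a local iterator is not.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a linker plugin; not gold's Plugin class.
struct ToyPlugin {
  std::string name;
};

// Hypothetical stand-in for a plugin manager; not gold's Plugin_manager.
class ToyPluginManager {
 public:
  typedef std::list<ToyPlugin> Plugin_list;

  explicit ToyPluginManager(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
      plugins_.push_back(ToyPlugin{"plugin" + std::to_string(i)});
    current_ = plugins_.end();
  }

  // Racy pattern: the loop variable is a member shared by every caller.
  // Fine from a single thread, but if two threads run this at once, one
  // can advance current_ to end() while the other is about to dereference it.
  std::size_t visit_all_shared() {
    std::size_t visited = 0;
    for (current_ = plugins_.begin(); current_ != plugins_.end(); ++current_)
      visited += current_->name.empty() ? 0 : 1;
    return visited;
  }

  // Safe for concurrent readers: each call owns its iterator.
  std::size_t visit_all_local() const {
    std::size_t visited = 0;
    for (Plugin_list::const_iterator it = plugins_.begin();
         it != plugins_.end(); ++it)
      visited += it->name.empty() ? 0 : 1;
    return visited;
  }

 private:
  Plugin_list plugins_;
  Plugin_list::iterator current_;  // shared state, only safe single-threaded
};

// Runs visit_all_local() from `threads` threads at once and returns how
// many of them saw every plugin exactly once.
std::size_t count_complete_walks(std::size_t plugins, std::size_t threads) {
  ToyPluginManager mgr(plugins);
  std::atomic<std::size_t> complete(0);
  std::vector<std::thread> pool;
  for (std::size_t t = 0; t < threads; ++t)
    pool.emplace_back([&] {
      if (mgr.visit_all_local() == plugins) ++complete;
    });
  for (auto& th : pool) th.join();
  return complete.load();
}
```

Calling `count_complete_walks(3, 8)` walks the list from eight threads using the local-iterator version only; the shared-iterator version is deliberately never run concurrently here, since that would be a data race with undefined behaviour. If the real bug has the same shape, that would be consistent with the crash being intermittent and disappearing at `--thread-count 1`.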
Dup. *** This bug has been marked as a duplicate of bug 26200 ***