Created attachment 10924 [details] tar file containing C sources, makefile The code in the gold linker that handles relocatable links seems to be having problems when there are a mix of input object files from GCC and Clang (x86_64), in some cases triggering an assert. Given two C source files (contents unimportant, but each should have at least one function), bug can be triggered by running the following commands: gcc -fPIC -g -O2 -c this.c clang -fPIC -g -O2 -c other.c gcc -fuse-ld=gold -fPIC -g -O2 -Wl,-r -nostdlib -no-pie this.o other.o -o rel.o resulting in this assert: ld.gold: internal error in set_info_section, at ../../binutils/gold/output.h:3386 I've reproduced this on linux x86_64 with several versions of clang, gcc, and ld.gold (including top of trunk for each). GCC 7 + clang 3.9 + gold 1.15 is one starting point. Stack trace for the assert looks like: #0 gold::Output_section::set_info_section (this=0x<...> os=0x<...> at ../../binutils/gold/output.h:3382 #1 0x<...> in gold::Layout::layout_reloc<64, false> (this=0x<...> object=0x<...> shdr=..., data_section=0x<...> rr=0x<...> at ../../binutils/gold/layout.cc:1354 #2 0x<...> in gold::Sized_relobj_file<64, false>::do_layout (this=0x<...> symtab=0x<...> layout=0x<...> sd=0x<...> at ../../binutils/gold/object.cc:1856 #3 0x<...> in gold::Object::layout (this=0x<...> symtab=0x<...> layout=0x<...> sd=0x<...> at ../../binutils/gold/object.h:651 #4 0x<...> in gold::Add_symbols::run (this=0x<...> at ../../binutils/gold/readsyms.cc:634 #5 0x<...> in gold::Workqueue::find_and_run_task (this=0x<...> thread_number=0) at ../../binutils/gold/workqueue.cc:319 #6 0x<...> in gold::Workqueue::process (this=0x<...> thread_number=0) at ../../binutils/gold/workqueue.cc:495 #7 0x<...> in main (argc=6, argv=0x<...> at ../../binutils/gold/main.cc:252 The relocation section being laid out at the point of the assert is one of the .rela.eh_frame sections. When I look at the gcc-compiled objects feeding into the link, I see .eh_frame sections that look like: [ 7] .eh_frame PROGBITS 0000000000000000 000088 000038 00 0 0 8 [0000000000000002]: ALLOC [ 8] .rela.eh_frame RELA 0000000000000000 0001e0 000018 18 9 7 8 [0000000000000040]: INFO LINK The corresponding sections from LLVM-backend compiled objects look like [16] .eh_frame X86_64_UNWIND 0000000000000000 0001e8 000030 00 0 0 8 [0000000000000002]: ALLOC [17] .rela.eh_frame RELA 0000000000000000 000470 000018 18 20 16 8 [0000000000000000]: Note the section types -- this seems to be the crux of the problem. LLVM (as of 2013 or so) sets the section type to X86_64_UNWIND, not PROGBITS. The X86_64 ps-ABI recommends using this section type, but not all producers do (gcc does not). Googling assert in question turns up a few other instances, including: https://stackoverflow.com/questions/47797817/ld-gold-internal-error-in-set-info-section-at-output-h3209 https://sourceware.org/bugzilla/show_bug.cgi?id=15861 however it's not clear whether this is the same bug (some of the instances out in the wild seem to involve the use of a linker script). I spent a while in the debugger looking at what is happening. When gold processes the two different .eh_frame sections, it effectively treats them as distinct sections in the output file (since output section lookup is based on name, type, and flags). In the case of the .rela.eh_frame sections, however, we get only a single output section (since both input ".rela.eh_frame" sections have the same name/type/flags). The assert is happening when Layout::layout_reloc is called on the second relocation section. The code looks like: gold_assert((this->info_section_ == NULL || (this->info_section_ == os && this->info_uses_section_index_)) && this->info_symndx_ == NULL && this->info_ == 0); Here "info_section_" is already pointing to the first output section for .eh_frame, however we're trying to set it to the second output section. I have been experimenting with fixes. I note that for non-relocatable links, gold is already effectively merging PROGBITS + X86_64_UNWIND .eh_frame sections into a single merged PROGBITS section (see the "FIXME" comment in Layout::make_eh_frame_section). With this in mind, a similar way to solve the problem would be to do the same merging for relocatable links. A patch to do this appears below; this fixes the assert. A second possibility would be to change things so that the two separate "flavors" of .eh_frame section are preserved in the output object for relocatable links, meaning that there would have to be two separate relocation sections as well. This would mean special handling of some sort when creating output sections for .rela.eh_frame sections (e.g. not just calling Layout::choose_output_section). I am attaching a reproducer; unpack the tar file and run "make" (may need to edit the makefile so that correct gcc and clang paths are picked up). Ex: $ make THISCC=gcc-7 OTHERCC=clang-3.9 all gcc-7 -fPIC -g -O2 -c this.c clang-3.9 -fPIC -g -O2 -c other.c gcc-7 -fuse-ld=gold -fPIC -g -O2 -Wl,-r -nostdlib -no-pie this.o other.o -o rel.o /usr/bin/ld.gold: internal error in set_info_section, at ../../gold/output.h:3386 collect2: error: ld returned 1 exit status makefile:19: recipe for target 'rel.o' failed make: *** [rel.o] Error 1 $ ----- Tentative patch: diff --git a/gold/layout.cc b/gold/layout.cc index f5fe805ea..987e12092 100644 --- a/gold/layout.cc +++ b/gold/layout.cc @@ -829,6 +829,13 @@ Layout::get_output_section(const char* name, Stringpool::Key name_key, || lookup_type == elfcpp::SHT_PREINIT_ARRAY) lookup_type = elfcpp::SHT_PROGBITS; + // Some compilers assign a type of SHT_PROGBITS to the .eh_frame + // section, while others use the psABI-recommended SHT_X86_64_UNWIND. + // Insure that we combine the two in the resulting output file (mainly + // an issue for relocatable links). + if (lookup_type == elfcpp::SHT_X86_64_UNWIND) + lookup_type = elfcpp::SHT_PROGBITS; + elfcpp::Elf_Xword lookup_flags = flags; // Ignoring SHF_WRITE and SHF_EXECINSTR here means that we combine
After discussion with colleagues, I'm now opting in favor of the second proposal (being more selective about merging relocation sections). Sent a patch to the mailing list: https://sourceware.org/ml/binutils/2018-03/msg00416.html.
The master branch has been updated by Cary Coutant <ccoutant@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bce5a025d2ed7eda2c5bbb85bd9b33333ca5d556 commit bce5a025d2ed7eda2c5bbb85bd9b33333ca5d556 Author: Cary Coutant <ccoutant@gmail.com> Date: Mon Apr 2 16:12:10 2018 -0700 Fix problem where mixed section types can cause internal error during a -r link. During a -r (or --emit-relocs) link, if two sections had the same name but different section types, gold would put relocations for both sections into the same relocation section even though the data sections remained separate. For .eh_frame sections, when one section is PROGBITS and another is X86_64_UNWIND, we really should be using the UNWIND section type and combining the sections anyway. For other sections, we should be creating one relocation section for each output data section. gold/ PR gold/23016 * incremental.cc (can_incremental_update): Check for unwind section type. * layout.h (Layout::layout): Add sh_type parameter. * layout.cc (Layout::layout): Likewise. (Layout::layout_reloc): Create new output reloc section if data section does not already have one. (Layout::layout_eh_frame): Check for unwind section type. (Layout::make_eh_frame_section): Use unwind section type for .eh_frame and .eh_frame_hdr. * object.h (Sized_relobj_file::Shdr_write): New typedef. (Sized_relobj_file::layout_section): Add sh_type parameter. (Sized_relobj_file::Deferred_layout::Deferred_layout): Add sh_type parameter. * object.cc (Sized_relobj_file::check_eh_frame_flags): Check for unwind section type. (Sized_relobj_file::layout_section): Add sh_type parameter; pass it to Layout::layout. (Sized_relobj_file::do_layout): Make local copy of sh_type. Force .eh_frame sections to unwind section type. Pass sh_type to layout_section. (Sized_relobj_file<size, big_endian>::do_layout_deferred_sections): Pass sh_type to layout_section. * output.cc (Output_section::Output_section): Initialize reloc_section_. * output.h (Output_section::reloc_section): New method. (Output_section::set_reloc_section): New method. (Output_section::reloc_section_): New data member. * target.h (Target::unwind_section_type): New method. (Target::Target_info::unwind_section_type): New data member. * aarch64.cc (aarch64_info): Add unwind_section_type. * arm.cc (arm_info, arm_nacl_info): Likewise. * i386.cc (i386_info, i386_nacl_info, iamcu_info): Likewise. * mips.cc (mips_info, mips_nacl_info): Likewise. * powerpc.cc (powerpc_info): Likewise. * s390.cc (s390_info): Likewise. * sparc.cc (sparc_info): Likewise. * tilegx.cc (tilegx_info): Likewise. * x86_64.cc (x86_64_info, x86_64_nacl_info): Likewise. * testsuite/Makefile.am (pr23016_1, pr23016_2): New test cases. * testsuite/Makefile.in: Regenerate. * testsuite/testfile.cc: Add unwind_section_type. * testsuite/pr23016_1.sh: New test script. * testsuite/pr23016_1a.s: New source file. * testsuite/pr23016_1b.s: New source file. * testsuite/pr23016_2.sh: New test script. * testsuite/pr23016_2a.s: New source file. * testsuite/pr23016_2b.s: New source file.
Fixed on trunk.
*** Bug 23155 has been marked as a duplicate of this bug. ***