Bug 20070

Summary: LLVM gold plugin (LLVMgold.so) reports "Unexpected resolution" failure with ld during LTO, but passes with gold
Product: binutils
Reporter: Steven Shi <steven.shi>
Component: ld
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: critical
CC: hjl.tools, igor.venevtsev
Priority: P2
Version: 2.26
Target Milestone: 2.27
Host:
Target:
Build:
Last reconfirmed:
Attachments: testcase to reproduce LLVMgold.so Unexpected resolution failure on ld
A patch
A patch

Description Steven Shi 2016-05-10 15:23:42 UTC
Created attachment 9249 [details]
testcase to reproduce LLVMgold.so Unexpected resolution failure on ld

I'm enabling the clang LTO feature to improve the code size of UEFI (http://www.uefi.org/) firmware (https://github.com/tianocore/edk2). My project is on the llvm branch of https://github.com/shijunjing/edk2: https://github.com/shijunjing/edk2/tree/llvm.

I find that ld in binutils 2.26 does not work correctly with the LLVM gold plugin (LLVMgold.so): the plugin reports the "Unexpected resolution" failure shown below when ld performs the LTO link. This failure does not happen with gold.

Unexpected resolution
UNREACHABLE executed at /home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:679!


Below are the steps to set up the clang compiler and reproduce the clang LTO link failure:

1. Download the llvm 3.8.0 Pre-Built Binaries from http://www.llvm.org/releases/ (e.g. http://www.llvm.org/releases/3.8.0/clang+llvm-3.8.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz) and extract them (e.g. as ~/clang38).
2. Decompress the debug version of the LLVM gold plugin from https://github.com/shijunjing/edk2/blob/llvm/BaseTools/Bin/LLVMgold-debug.tar.gz and rename it to LLVMgold.so. Copy it to the clang lib folder above (e.g. ~/clang38/lib/LLVMgold.so).
3. Copy the GNU Binutils 2.26 linker ld to /usr/bin/ld.
4. Run the clang LTO link command below:

~/clang38/bin/clang -o Hello.dll -flto -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group

ld 2.26 fails during gold plugin optimization with the output below:
steven: getModuleForFile pass
Unexpected resolution
UNREACHABLE executed at /home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:679!
clang-3.8: error: unable to execute command: Aborted (core dumped)
clang-3.8: error: linker command failed due to signal (use -v to see invocation)

5. Copy the GNU Binutils 2.26 gold linker ld-new to /usr/bin/ld.
6. Re-run the clang LTO link command from step 4:

~/clang38/bin/clang -o Hello.dll -flto -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group


gold passes gold plugin optimization with the output below:
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
steven: getModuleForFile pass
/usr/bin/ld: internal error in do_layout, at ../../binutils-2.26/gold/object.cc:1819
clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)
The gold do_layout error above is a known bug that is not related to LLVM LTO and is tracked in Bug 20062: https://sourceware.org/bugzilla/show_bug.cgi?id=20062

The "steven: getModuleForFile pass" line is a debug message I added to the debug version of LLVMgold.so, at line 905 of llvm-3.8.0.src\tools\gold\gold-plugin.cpp. A copy of my gold-plugin.cpp is included in this folder.



The example libraries in this folder are extracted from the UEFI Hello module. You can build them yourself from https://github.com/shijunjing/edk2/tree/llvm:

0. Download the llvm 3.8.0 Pre-Built Binaries from http://www.llvm.org/releases/ (e.g. http://www.llvm.org/releases/3.8.0/clang+llvm-3.8.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz) and extract them (e.g. as ~/clang38).
0. Decompress the debug version of the LLVM gold plugin from https://github.com/shijunjing/edk2/blob/llvm/BaseTools/Bin/LLVMgold-debug.tar.gz and rename it to LLVMgold.so. Copy it to the clang lib folder above (e.g. ~/clang38/lib/LLVMgold.so).
1. Set up the EDK2 build environment by following the steps at: https://github.com/tianocore/tianocore.github.io/wiki/Using-EDK-II-with-Native-GCC
2. git clone https://github.com/shijunjing/edk2 (e.g. ~/edk2)
3. $ cd edk2
4. $ git checkout llvm
5. $ export CLANG38_BIN=path/to/your/clang38/ (e.g. export CLANG38_BIN=~/clang38/bin/)
6. $ source edksetup.sh
7. $ make -C BaseTools/Source/C
8. Comment out lines 33 and 34 in ~/edk2/AppPkg/Applications/Hello/Hello.c, which will trigger significant LTO optimization. An updated copy of Hello.c is included in this folder.
9. $ build -t CLANGLTO38 -a X64 -p AppPkg/AppPkg.dsc -m AppPkg/Applications/Hello/Hello.inf

After the build, you can find all intermediate files in the folder below:
~/edk2/Build/AppPkg/DEBUG_CLANGLTO38/X64/AppPkg/Applications/Hello/Hello
Comment 1 H.J. Lu 2016-05-11 22:50:34 UTC
Created attachment 9253 [details]
A patch

Please try this.
Comment 2 Steven Shi 2016-05-12 04:45:22 UTC
Hi H.J.
Your patch works for me. Thank you!
Comment 3 Sourceware Commits 2016-05-12 23:52:06 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3355cb3b643bd50aafae768e7cf990d4bec40fe1

commit 3355cb3b643bd50aafae768e7cf990d4bec40fe1
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu May 12 16:50:34 2016 -0700

    Handle symbols defined/referenced only within IR
    
    The plugin is called to claim symbols in an archive element from
    plugin_object_p.  But those symbols aren't needed to create output.
    They are defined and referenced only within IR.  get_symbols should
    return resolution based on IR symbol kinds.
    
    	PR ld/20070
    	* Makefile.am (noinst_LTLIBRARIES): Add libldtestplug4.la.
    	(libldtestplug4_la_SOURCES): New.
    	(libldtestplug4_la_CFLAGS): Likewise.
    	(libldtestplug4_la_LDFLAGS): Likewise.
    	* Makefile.in: Regenerated.
    	* plugin.c (get_symbols): Return resolution based on IR symbol
    	kinds for symbols defined/referenced only within IR.
    	* testplug4.c: New file.
    	* ld/testsuite/ld-plugin/pr20070.d: Likewise.
    	* ld/testsuite/ld-plugin/pr20070a.c: Likewise.
    	* ld/testsuite/ld-plugin/pr20070b.c: Likewise.
    	* testsuite/ld-plugin/plugin.exp (plugin4_name): New.
    	(plugin4_path): Likewise.
    	Add a test for ld/20070.
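The rule described in the commit message above can be illustrated with a small model. This is a minimal Python sketch, not the code in ld/plugin.c: the enum members are named after the LDPK_* and LDPR_* constants of the linker plugin API (plugin-api.h), while the function name and the exact kind-to-resolution mapping are assumptions made for illustration.

```python
from enum import Enum


class SymbolKind(Enum):
    """IR symbol kinds, named after the LDPK_* values in plugin-api.h."""
    DEF = "LDPK_DEF"
    WEAKDEF = "LDPK_WEAKDEF"
    UNDEF = "LDPK_UNDEF"
    WEAKUNDEF = "LDPK_WEAKUNDEF"
    COMMON = "LDPK_COMMON"


class Resolution(Enum):
    """A subset of the LDPR_* resolutions in plugin-api.h."""
    UNKNOWN = "LDPR_UNKNOWN"
    UNDEF = "LDPR_UNDEF"
    PREVAILING_DEF_IRONLY = "LDPR_PREVAILING_DEF_IRONLY"


def resolve_ir_only_symbol(kind: SymbolKind) -> Resolution:
    """Hypothetical helper: choose a resolution for a symbol that is
    defined and referenced only within IR.

    Assumed mapping: undefined kinds stay undefined; every kind of
    definition prevails and is visible only from IR.
    """
    if kind in (SymbolKind.UNDEF, SymbolKind.WEAKUNDEF):
        return Resolution.UNDEF
    if kind in (SymbolKind.DEF, SymbolKind.WEAKDEF, SymbolKind.COMMON):
        return Resolution.PREVAILING_DEF_IRONLY
    raise ValueError(f"unexpected symbol kind: {kind!r}")


if __name__ == "__main__":
    # Show the resolution chosen for each IR symbol kind.
    for kind in SymbolKind:
        print(kind.value, "->", resolve_ir_only_symbol(kind).value)
```

The point of the sketch is that every IR symbol kind maps to a definite resolution, so the plugin is never handed an unknown one for such symbols.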
Comment 4 H.J. Lu 2016-05-12 23:52:50 UTC
Fixed for 2.27.
Comment 5 H.J. Lu 2016-05-13 12:02:56 UTC
*** Bug 20090 has been marked as a duplicate of this bug. ***
Comment 6 Steven Shi 2016-05-16 23:53:11 UTC
Hello,
As I'm debugging, I'd like to first disable all optimizations in LTO and then enable them one by one. But when I try to disable optimization by forcing -O0 in the LTO build, ld fails to recognize some clang bitcode libraries and fails to link.

For example, use the Clang_LTO_Fails_On_LD example from the attachment to this bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=20070

If I force -O0 to disable optimization in LTO, ld fails to link:
~/clang38/bin/clang -o Hello.dll -flto -O0 -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group
BaseLib.lib: error adding symbols: File format not recognized
clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)

But if I use -O1, the ld link passes:
~/clang38/bin/clang -o Hello.dll -flto -O1 -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group

So I cannot disable optimization in ld LTO. Please help fix it. Thanks!
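The two invocations in this comment differ only in the optimization level, which is easy to lose in such long command lines. A minimal Python sketch that assembles the argument list so that only the level varies; the helper name is hypothetical, the flags are copied from the commands in this comment, and nothing is executed.

```python
from typing import List

# Flags copied verbatim from the link commands in this comment.
LINK_FLAGS = [
    "-nostdlib",
    "-Wl,-n",
    "-Wl,-q",
    "-Wl,--gc-sections",
    "-Wl,-z,common-page-size=0x40",
    "-Wl,--entry,_ModuleEntryPoint",
    "-Wl,-u,_ModuleEntryPoint",
    "-Wl,-Map,Hello.map",
    "-Wl,-melf_x86_64",
    "-Wl,--oformat=elf64-x86-64",
    "-Wl,--start-group,,@static_library_files.lst",
    "-Wl,--end-group",
]


def lto_link_command(opt_level: int,
                     clang: str = "~/clang38/bin/clang") -> List[str]:
    """Hypothetical helper: build the argument list for one LTO link.

    Only the -O level changes between invocations. Note that "~" is
    expanded by a shell, not by Python.
    """
    if opt_level not in (0, 1, 2, 3):
        raise ValueError(f"unsupported optimization level: {opt_level}")
    return [clang, "-o", "Hello.dll", "-flto", f"-O{opt_level}", *LINK_FLAGS]


if __name__ == "__main__":
    # Print both command lines so they can be diffed or pasted into a shell.
    for level in (0, 1):
        print(" ".join(lto_link_command(level)))
```

Printing the commands rather than running them keeps the sketch independent of where clang and the library list file actually live.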
Comment 7 H.J. Lu 2016-05-17 00:22:31 UTC
Created attachment 9264 [details]
A patch

Try this. But this doesn't fix the LLVM LTO bug. I got:

/tmp/lto-llvm-2a2442.o: In function `UefiDevicePathLibDuplicateDevicePath':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:394: undefined reference to `GetDevicePathSize'
/tmp/lto-llvm-2a2442.o: In function `UefiDevicePathLibAppendDevicePath':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:447: undefined reference to `DuplicateDevicePath'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:451: undefined reference to `DuplicateDevicePath'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:462: undefined reference to `GetDevicePathSize'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:463: undefined reference to `GetDevicePathSize'
/tmp/lto-llvm-2a2442.o: In function `UefiDevicePathLibAppendDevicePathNode':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:522: undefined reference to `DuplicateDevicePath'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:542: undefined reference to `AppendDevicePath'
/tmp/lto-llvm-2a2442.o: In function `UefiDevicePathLibAppendDevicePathInstance':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:585: undefined reference to `DuplicateDevicePath'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:596: undefined reference to `GetDevicePathSize'
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:597: undefined reference to `GetDevicePathSize'
/tmp/lto-llvm-2a2442.o: In function `UefiDevicePathLibGetNextDevicePathInstance':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:684: undefined reference to `DuplicateDevicePath'
/tmp/lto-llvm-2a2442.o: In function `FileDevicePath':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:872: undefined reference to `AppendDevicePath'
/tmp/lto-llvm-2a2442.o: In function `CatVSPrint':
/home/jshi19/edk2-fork/MdePkg/Library/UefiDevicePathLib/DevicePathUtilities.c:876: undefined reference to `StrCpyS'
BasePrintLib.lib(PrintLib.obj): In function `UnicodeVSPrint':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:71: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:71: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:72: undefined reference to `DebugAssert'
BasePrintLib.lib(PrintLib.obj): In function `UnicodeVSPrintAsciiFormat':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:218: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:218: undefined reference to `DebugAssert'
BasePrintLib.lib(PrintLib.obj): In function `SPrintLength':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:730: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:730: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLib.c:731: undefined reference to `DebugAssert'
BasePrintLib.lib(PrintLibInternal.obj): In function `BasePrintLibSPrintMarker':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:366: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:366: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:397: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:405: undefined reference to `AsciiStrSize'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:405: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:397: undefined reference to `StrSize'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:397: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:701: undefined reference to `ReadUnaligned32'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:702: undefined reference to `ReadUnaligned16'
BasePrintLib.lib(PrintLibInternal.obj): In function `BasePrintLibValueToString':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:129: undefined reference to `DivU64x32Remainder'
BasePrintLib.lib(PrintLibInternal.obj): In function `BasePrintLibSPrintMarker':
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:397: undefined reference to `DebugAssertEnabled'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:973: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:982: undefined reference to `StrSize'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:982: undefined reference to `DebugAssert'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:987: undefined reference to `AsciiStrSize'
/home/jshi19/edk2-fork/MdePkg/Library/BasePrintLib/PrintLibInternal.c:987: undefined reference to `DebugAssert'
clang-3.9: error: linker command failed with exit code 1 (use -v to see invocation)
Comment 8 Steven Shi 2016-05-17 01:25:12 UTC
Hi H.J.
Thank you for the quick fix. How can I find out which optimizations ld applies differently between -O0 and -O1? Is it the compiler-domain optimization (e.g. clang/llvm) or the linker-domain optimization (e.g. ld) that makes the difference? How can I enable or disable these specific optimizations other than by using -O0 or -O1?
Comment 9 H.J. Lu 2016-05-17 12:47:56 UTC
(In reply to Steven Shi from comment #8)
> Hi H.J.
> Thank you for the quick fix. How can I find out which optimizations ld
> applies differently between -O0 and -O1? Is it the compiler-domain
> optimization (e.g. clang/llvm) or the linker-domain optimization (e.g. ld)
> that makes the difference? How can I enable or disable these specific
> optimizations other than by using -O0 or -O1?

-O0 and -O1 are for llvm, not for ld.  This LTO bug is in llvm.
Comment 10 Steven Shi 2016-05-18 07:22:52 UTC
The LLVM guys say "LTO is linker specific, clang is only forwarding the option to the linker here." See the discussion below, "How to debug if LTO generate wrong code?":
http://lists.llvm.org/pipermail/cfe-dev/2016-May/048906.html

So it looks like, if I compile my LLVM bitcode without optimization (-O0) and pass it to the linker to finish the LTO build, all the optimizations are done only by the linker, not by LLVM.

Is that right?
Comment 11 H.J. Lu 2016-05-18 12:12:10 UTC
(In reply to Steven Shi from comment #10)
> The LLVM guys say "LTO is linker specific, clang is only forwarding the
> option to the linker here." See the discussion below, "How to debug if LTO
> generate wrong code?":
> http://lists.llvm.org/pipermail/cfe-dev/2016-May/048906.html
> 
> So it looks like, if I compile my LLVM bitcode without optimization (-O0)
> and pass it to the linker to finish the LTO build, all the optimizations are
> done only by the linker, not by LLVM.
> 
> Is that right?

LTO stands for link-time optimization.  -On is passed back to the LLVM LTO
plugin.
Comment 12 Sourceware Commits 2020-04-07 14:53:54 UTC
The master branch has been updated by Rainer Orth <ro@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3e97ba7d583055bdd5439dd300c59a2f5bc02476

commit 3e97ba7d583055bdd5439dd300c59a2f5bc02476
Author: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
Date:   Tue Apr 7 16:52:03 2020 +0200

    ld: Fix several 32-bit SPARC plugin tests
    
    Several ld plugin tests currently FAIL on 32-bit Solaris/SPARC:
    
    FAIL: load plugin with source
    FAIL: plugin claimfile lost symbol with source
    FAIL: plugin claimfile replace symbol with source
    FAIL: plugin claimfile resolve symbol with source
    FAIL: plugin claimfile replace file with source
    FAIL: plugin set symbol visibility with source
    FAIL: plugin ignore lib with source
    FAIL: plugin claimfile replace lib with source
    FAIL: plugin 2 with source lib
    FAIL: load plugin 2 with source
    FAIL: load plugin 2 with source and -r
    FAIL: plugin 3 with source lib
    FAIL: load plugin 3 with source
    FAIL: load plugin 3 with source and -r
    FAIL: PR ld/20070
    
    all of them in the same way:
    
    ./ld-new: BFD (GNU Binutils) 2.34.50.20200328 internal error, aborting at /vol/src/gnu/binutils/hg/master/git/bfd/elf32-sparc.c:154 in sparc_final_write_processing
    
    This happens when bfd_get_mach returns 0 when abfd refers to a source
    file:
    
    $11 = {
      filename = 0x28c358 "/vol/src/gnu/binutils/hg/master/local/ld/testsuite/ld-plugin/func.c (symbol from plugin)", xvec = 0x24ed6c <sparc_elf32_sol2_vec>,
    [...]
    
    While I could find no specification what abfd's are allowed/expected in
    *_final_write_processing, I could find no other target that behaved the
    same.  And indeed ignoring the 0 case fixes the failures.  The code now
    errors for other values.  64-bit SPARC is not affected because it doesn't
    have a specific implementation of elf_backend_final_write_processing.
    
    Tested on sparc-sun-solaris2.11.
    
    2020-04-07  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
                Nick Clifton  <nickc@redhat.com>
    
            * elf32-sparc.c (sparc_final_write_processing): Fix whitespace.
            <0>: Ignore.
            <default>: Error rather than abort.
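The behavior change described in this commit message can be summarized with a small model. This is a minimal Python sketch, not the BFD code: the machine values and the function name are hypothetical stand-ins, and only the three outcomes (a known machine is handled, 0 is ignored, anything else is reported as an error instead of aborting) are taken from the message above.

```python
import sys

# Hypothetical stand-ins for the machine values a back end knows how to handle.
KNOWN_MACHINES = {1, 2, 3}


def final_write_processing(mach: int) -> bool:
    """Return True on success and False on error; never abort the process."""
    if mach in KNOWN_MACHINES:
        # A real back end would adjust target-specific header fields here.
        return True
    if mach == 0:
        # No machine recorded, e.g. for a source file claimed by a plugin:
        # there is nothing to do, so the case is ignored.
        return True
    # Any other value is reported as an ordinary error rather than an abort.
    print(f"unhandled machine value {mach}", file=sys.stderr)
    return False


if __name__ == "__main__":
    for mach in (1, 0, 99):
        print(mach, final_write_processing(mach))
```

Returning a failure status instead of aborting lets the caller report the problem and exit cleanly, which is the difference between a failed link and an internal error.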
Comment 13 Alan Modra 2024-02-29 23:09:39 UTC
As per comment #4.  The patch from comment #7 was committed too.