Bug 28879 - [2.38 Regression] ld.bfd: possibly incorrect "undefined reference" errors
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.39
Importance: P2 normal
Target Milestone: ---
Assignee: H.J. Lu
URL:
Keywords:
Depends on:
Blocks: 28264
Reported: 2022-02-10 16:27 UTC by Evangelos Foutras
Modified: 2022-02-16 13:16 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2022-02-11 00:00:00


Attachments
Small example with pre-built objects (2.21 MB, application/x-xz)
2022-02-10 16:27 UTC, Evangelos Foutras
Reproducer with source build (1.41 MB, application/x-xz)
2022-02-11 16:50 UTC, Evangelos Foutras
Build log from the source build of libheif (5.18 KB, text/x-log)
2022-02-11 16:51 UTC, Evangelos Foutras
A testcase (763 bytes, application/octet-stream)
2022-02-11 22:12 UTC, H.J. Lu

Description Evangelos Foutras 2022-02-10 16:27:44 UTC
Created attachment 13969
Small example with pre-built objects

The attached tarball contains a small example of an issue I'm seeing on Arch Linux after upgrading to binutils 2.38. It's a trimmed down version of the linking errors I get when trying to build libheif [1] with CXXFLAGS="-flto -Wp,-D_GLIBCXX_ASSERTIONS" LDFLAGS="-Wl,--as-needed". I was unable to repro with a few minimal examples I tried, so please excuse the binary test case.

Bisect pointed to commit 7de7786bb7db [2] as the first revision that exhibits linking failures when building libheif with LTO. Adding "libx265.so.199" to the end of the g++ line in repro.sh allows it to link, so there appears to be some issue with looking up some libstdc++ symbols used by libx265 when it's linked indirectly through libheif.so.

(Please note that libheif and libx265 are used as an example here; they were just the first packages that showed this issue after upgrading the toolchain to gcc 11.2, glibc 2.35 and binutils 2.38.)

FWIW, searching for the first symbol in the attached object files I see:

-----------------------------------
heif_info-heif_info.o (readelf --lto-syms):

  _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm    WEAKDEF     DEFAULT

libheif.so (readelf -Ws):

  FUNC    LOCAL  DEFAULT   13 _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm

libx265.so.199 (readelf -Ws):

  GLOBAL DEFAULT  UND _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm@GLIBCXX_3.4.21
-----------------------------------

My linker knowledge is limited, but perhaps you might be able to make sense of all this. :)

[1] https://github.com/strukturag/libheif
[2] https://github.com/bminor/binutils-gdb/commit/7de7786bb7db

==================================
$ cat repro.sh
#!/bin/bash
# objects below built w/ CXXFLAGS="-flto -Wp,-D_GLIBCXX_ASSERTIONS" LDFLAGS="-Wl,--as-needed"
LD_LIBRARY_PATH=$PWD g++ heif_info-heif_info.o libheif.so

$ ./repro.sh
/usr/bin/ld: /../heif-link-failure/libx265.so.199: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)@GLIBCXX_3.4.21'
/usr/bin/ld: /../heif-link-failure/libx265.so.199: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long)@GLIBCXX_3.4.21'
/usr/bin/ld: /../heif-link-failure/libx265.so.199: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()@GLIBCXX_3.4.21'
/usr/bin/ld: /../heif-link-failure/libx265.so.199: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)@GLIBCXX_3.4.21'
==================================
Comment 1 H.J. Lu 2022-02-11 13:17:14 UTC
I am using GCC 11.2 and got

g++ -o x heif_info-heif_info.o libheif.so -Wl,-R,.
lto1: fatal error: bytecode stream in file ‘heif_info-heif_info.o’ generated with LTO version 11.0 instead of the expected 11.2
compilation terminated.
lto-wrapper: fatal error: g++ returned 1 exit status

Please compile heif_info-heif_info.o with GCC 11.2.
Comment 2 H.J. Lu 2022-02-11 13:49:32 UTC
This is caused by

7de7786bb7db5159fc8a7bfa3df72381ff16a38c is the first bad commit
commit 7de7786bb7db5159fc8a7bfa3df72381ff16a38c
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Aug 26 07:43:23 2021 -0700

    ld: Change indirect symbol from IR to undefined
Comment 3 H.J. Lu 2022-02-11 14:09:14 UTC
I am testing this:

diff --git a/bfd/elflink.c b/bfd/elflink.c
index 6fa18d92007..a231bdabd28 100644
--- a/bfd/elflink.c
+++ b/bfd/elflink.c
@@ -1295,7 +1295,9 @@ _bfd_elf_merge_symbol (bfd *abfd,
           hi->root.non_ir_ref_dynamic = true;
         }
 
-      if ((oldbfd->flags & BFD_PLUGIN) != 0
+      if (!h->root.non_ir_ref_dynamic
+          && !h->root.non_ir_ref_regular
+          && (oldbfd->flags & BFD_PLUGIN) != 0
           && hi->root.type == bfd_link_hash_indirect)
         {
           /* Change indirect symbol from IR to undefined.  */
Comment 4 Evangelos Foutras 2022-02-11 15:00:41 UTC
Thank you for looking into this. :)

I applied your diff on top of binutils 2.38 and was able to successfully build libheif and nextcloud-client with it. Previously, these two Arch Linux packages (and probably a lot more) would fail to link to several system libraries (x265, de265, snappy).

PS: Not sure if this is of any importance, but I also noticed the following linker error when building nextcloud-client with unpatched binutils 2.38 (likely has the same root cause and fix as the undefined references seen earlier, and this error is gone as well after applying your patch):

  /usr/bin/ld: /usr/lib/libQt5Gui.so.5.15.2: unexpected redefinition of indirect versioned symbol `_ZTI11QSharedData@Qt_5'
Comment 5 H.J. Lu 2022-02-11 15:45:40 UTC
It may be a GCC 11.1 bug.  Please upload the preprocessed heif_info-heif_info.cc
with the command-line options used to compile heif_info-heif_info.o.
Comment 6 Evangelos Foutras 2022-02-11 16:03:25 UTC
(In reply to H.J. Lu from comment #5)
> It may be a GCC 11.1 bug.

Are you referring to the error about _ZTI11QSharedData@Qt_5 in my last comment, or the original issue? The system libQt5Gui was indeed built with GCC 11.1 but the error came from building nextcloud-client with GCC 11.2 and binutils 2.38. It also went away after applying your earlier diff.

If you meant the original issue:

I'm using GCC 11.2. I grabbed heif_info-heif_info.o from the failed libheif build so generating its preprocessed source counterpart isn't very straightforward for me.

However, I can provide a repro.sh that builds libheif from source and reproduces the undefined references on GCC 11.2 + binutils 2.38; would that help?
Comment 7 H.J. Lu 2022-02-11 16:06:08 UTC
(In reply to Evangelos Foutras from comment #6)
> 
> However, I can provide a repro.sh that builds libheif from source and
> reproduces the undefined references on GCC 11.2 + binutils 2.38; would that
> help?

Yes.
Comment 8 Evangelos Foutras 2022-02-11 16:50:14 UTC
Created attachment 13972
Reproducer with source build

Sorry this took a while; I tried to make it use the bundled x265 library, which I know reproduces the issue. Hopefully it repros on your system too. :)
Comment 9 Evangelos Foutras 2022-02-11 16:51:12 UTC
Created attachment 13973
Build log from the source build of libheif
Comment 10 H.J. Lu 2022-02-11 22:12:14 UTC
Created attachment 13974
A testcase

[hjl@gnu-tgl-3 pr28879]$ make
g++ -D_GLIBCXX_ASSERTIONS -flto   -c -o pr28879c.o pr28879c.cc
g++ -fPIC   -c -o pr28879b.o pr28879b.cc
g++ -fPIC   -c -o pr28879a.o pr28879a.cc
g++ -Wl,--no-demangle -shared -o libpr28879a.so pr28879a.o
g++ -Wl,--no-demangle -shared -o libpr28879b.so pr28879b.o libpr28879a.so
g++ -Wl,--no-demangle -o x pr28879c.o libpr28879b.so -Wl,-R,.
/usr/local/bin/ld: ./libpr28879a.so: undefined reference to `_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@GLIBCXX_3.4.21'
collect2: error: ld returned 1 exit status
make: *** [Makefile:25: x] Error 1
[hjl@gnu-tgl-3 pr28879]$
Comment 11 H.J. Lu 2022-02-11 23:20:09 UTC
A patch is posted at

https://sourceware.org/pipermail/binutils/2022-February/119740.html
Comment 12 Evangelos Foutras 2022-02-12 07:59:11 UTC
(In reply to H.J. Lu from comment #11)
> A patch is posted at
> 
> https://sourceware.org/pipermail/binutils/2022-February/119740.html

Works great, thanks! :)

(Gave it a quick test by rebuilding the libheif and nextcloud-client packages on Arch.)
Comment 13 Sourceware Commits 2022-02-14 04:37:03 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=20ea3acc727f3be6322dfbd881e506873535231d

commit 20ea3acc727f3be6322dfbd881e506873535231d
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Feb 11 15:13:19 2022 -0800

    ld: Keep indirect symbol from IR if referenced from shared object
    
    Don't change indirect symbol defined in IR to undefined if it is
    referenced from shared object.
    
    bfd/
    
            PR ld/28879
            * elflink.c (_bfd_elf_merge_symbol): Don't change indirect
            symbol defined in IR to undefined if it is referenced from
            shared object.
    
    ld/
    
            PR ld/28879
            * testsuite/ld-plugin/lto.exp: Run PR ld/28879 tests.
            * testsuite/ld-plugin/pr28879a.cc: New file.
            * testsuite/ld-plugin/pr28879b.cc: Likewise.
Comment 14 Sourceware Commits 2022-02-16 13:15:18 UTC
The binutils-2_38-branch branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6aa1b7df2fc435ba1b744f20db5c6d3013496249

commit 6aa1b7df2fc435ba1b744f20db5c6d3013496249
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Feb 11 15:13:19 2022 -0800

    ld: Keep indirect symbol from IR if referenced from shared object
    
    Don't change indirect symbol defined in IR to undefined if it is
    referenced from shared object.
    
    bfd/
    
            PR ld/28879
            * elflink.c (_bfd_elf_merge_symbol): Don't change indirect
            symbol defined in IR to undefined if it is referenced from
            shared object.
    
    ld/
    
            PR ld/28879
            * testsuite/ld-plugin/lto.exp: Run PR ld/28879 tests.
            * testsuite/ld-plugin/pr28879a.cc: New file.
            * testsuite/ld-plugin/pr28879b.cc: Likewise.
    
    (cherry picked from commit 20ea3acc727f3be6322dfbd881e506873535231d)
Comment 15 H.J. Lu 2022-02-16 13:16:02 UTC
Fixed for 2.39 and on 2.38 branch.