Bug 21074

Summary: [2.28 Regression] bfd ld stumbles over duplicated symbols generated by gold
Product: binutils Reporter: Matthias Klose <doko>
Component: goldAssignee: Cary Coutant <ccoutant>
Status: ASSIGNED ---    
Severity: normal CC: ian, kfunk, mitya57, nickc, tulipawn
Priority: P2    
Version: 2.28   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Matthias Klose 2017-01-23 14:06:34 UTC
[forwarded from https://bugs.debian.org/852035]

"""
When the gold linker is used, it sometimes generates duplicated symbols.
It's not something new, and can be seen for example in
qtbase-opensource-src 5.7.1+dfsg-2:

  1081: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT ABS __bss_start@Qt_5
  1082: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT ABS __bss_start@Qt_5

Note that bfd doesn't generate duplicated with the same command line
beside the -fuse-ld=gold. This used to not be a problem with both gold
and bfd. However the upstream commit eb3908448b , which first appeared
in binutils 2.27.51.20161231-1, causes this symbols to become
relocatable. This can be seen in qtbase-opensource-src 5.7.1+dfsg-3:

  1081: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT 19 __bss_start@Qt_5
  1082: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT 19 __bss_start@Qt_5

This in turn causes bfd to fail to link binaries against libQt5Core.so
or libQt5Gui.so, for example when building libqapt (see [1] for a full
build log):

  /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple definition of `__bss_start'
  /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple definition of `_edata'
  /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple definition of `_end'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `__bss_start'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `__bss_start'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `_edata'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `_edata'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `_end'
  /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.7.1:(*IND*+0x0): multiple definition of `_end'

Note however that having duplicated symbols doesn't cause a problem to
gold, only to bfd.

I don't really know how this should be fixed, either bfd should accept
duplicated symbols, or gold should stop generating them.
"""
Comment 1 Nick Clifton 2017-01-23 15:39:08 UTC
Hi Matthias,

  Would it be possible for you to upload a small test case that reproduces this problem.  Ideally involving just object files and test libraries so that no compilation or system libraries are involved ?

  As for what to do about the problem, I am still pondering.  The ELF spec says:

   "When the link editor combines several relocatable object files,
    it does not allow multiple definitions of STB_GLOBAL symbols with 
    the same name."

  But, in this particular case what we really have is a repeated definition of the same symbol.  It would be much cleaner, IMHO, if GOLD did not create them in the first place.  But this might be hard to achieve.  (I am not a GOLD expert).  So maybe the bfd linker would be better off just filtering out duplicate symbols.  Hence the desire for a test case to help examine the situation.

Cheers
  Nick
Comment 2 Matthias Klose 2017-01-23 16:50:26 UTC
test case from the package at
https://people.debian.org/~doko/tmp/tst-21074.tar.xz
Comment 3 Matthias Klose 2017-01-23 16:52:48 UTC
and tested that with a binutils with the problematic revision reverted, and rebuilt qt libraries lets the link succeed again.
Comment 4 Cary Coutant 2017-01-23 20:38:12 UTC
>   As for what to do about the problem, I am still pondering.  The ELF spec
> says:
> 
>    "When the link editor combines several relocatable object files,
>     it does not allow multiple definitions of STB_GLOBAL symbols with 
>     the same name."
> 
>   But, in this particular case what we really have is a repeated definition
> of the same symbol.  It would be much cleaner, IMHO, if GOLD did not create
> them in the first place.  But this might be hard to achieve.  (I am not a
> GOLD expert).  So maybe the bfd linker would be better off just filtering
> out duplicate symbols.  Hence the desire for a test case to help examine the
> situation.

Gold should clearly not be generating the duplicate symbols, so I don't think you should change ld to accept them.

However, note that the commit identified is not the one causing the duplicate symbols, but simply one that makes them section-relative rather than absolute. Apparently, ld is fine with duplicate absolute symbols, but it's not fine with duplicate relative symbols. Seems to me you should fix ld so it does diagnose the duplicate symbols in the absolute case. It should at least be consistent.
Comment 5 Cary Coutant 2017-01-23 23:46:09 UTC
(In reply to Matthias Klose from comment #2)
> test case from the package at
> https://people.debian.org/~doko/tmp/tst-21074.tar.xz

Something is missing from this test case, I think. It links fine with gold, and no duplicate symbols are generated. In the problem description, your __bss_start symbols have a Qt_5 version on them, but when I link your test case, I don't see any versions on those symbols. It's probable that versions are the root cause of the problem you're seeing, but I'll need to reproduce the problem to debug it. I'll try to come up with a small test case on my own, but if you can figure out what's missing from this one, it would help.
Comment 6 Nick Clifton 2017-01-24 10:04:34 UTC
Hi Cary,

(In reply to Cary Coutant from comment #5)
> Something is missing from this test case, I think. It links fine with gold,
> and no duplicate symbols are generated.

That's strange - the test case works for me.  (Note- you need to use the bfd
linker to link the test case not gold).

> In the problem description, your
> __bss_start symbols have a Qt_5 version on them, but when I link your test
> case, I don't see any versions on those symbols.

Try:

  $ readelf --syms libapt-pkg.so.5.0  | grep bss_start
  963: 00000000003b14a0     0 NOTYPE  GLOBAL DEFAULT 25 __bss_start@@APTPKG_5.0

I do not know how GOLD resolves these symbol version conflicts.  Maybe it converts them into absolute symbols first ?


> Apparently, ld is fine with duplicate absolute symbols, but it's not fine 
> with duplicate relative symbols. Seems to me you should fix ld so it does 
> diagnose the duplicate symbols in the absolute case. It should at least be 
> consistent.

Actually it is fine with duplicate absolute symbols that *have the same value*.
Which makes sense really.  Such symbols are just copies of each other.  Redundant perhaps, but not really a cause for failing to link. 

Cheers
  Nick
Comment 7 Dmitry Shachnev 2017-01-24 13:47:08 UTC
> In the problem description, your __bss_start symbols have a Qt_5 version on them, but when I link your test case, I don't see any versions on those symbols.

It is worth noting that Qt uses a version script (i.e. -Wl,--version-script,QtCore.version) which looks like this:

Qt_5_PRIVATE_API {
    qt_private_api_tag*;
    /* some more declarations here */
};
Qt_5 { *; };
Qt_5.0 {} Qt_5;
Qt_5.1 {} Qt_5.0;
Qt_5.2 {} Qt_5.1;
Qt_5.3 {} Qt_5.2;
Qt_5.4 {} Qt_5.3;
Qt_5.5 {} Qt_5.4;
Qt_5.6 {} Qt_5.5;
Qt_5.7 { qt_version_tag; } Qt_5.6
Comment 8 Cary Coutant 2017-01-24 20:39:13 UTC
> > Something is missing from this test case, I think. It links fine with gold,
> > and no duplicate symbols are generated.
> 
> That's strange - the test case works for me.  (Note- you need to use the bfd
> linker to link the test case not gold).
> 
> > In the problem description, your
> > __bss_start symbols have a Qt_5 version on them, but when I link your test
> > case, I don't see any versions on those symbols.

Well, in that case, I'm confused. I thought this was a test case to demonstrate the bug in gold, not the one in ld.

Can I get a test case that demonstrates the problem in gold?

> Try:
> 
>   $ readelf --syms libapt-pkg.so.5.0  | grep bss_start
>   963: 00000000003b14a0     0 NOTYPE  GLOBAL DEFAULT 25
> __bss_start@@APTPKG_5.0
> 
> I do not know how GOLD resolves these symbol version conflicts.  Maybe it
> converts them into absolute symbols first ?

Sorry, I don't really know what you're referring to as version conflicts. And I see only one copy of the symbol there. Where are the duplicate symbol messages coming from? This sounds like it may really be just an ld bug.

> Actually it is fine with duplicate absolute symbols that *have the same
> value*.
> Which makes sense really.  Such symbols are just copies of each other. 
> Redundant perhaps, but not really a cause for failing to link. 

Why should duplicate absolute symbols be treated any differently from duplicate relative symbols? The same argument you use that they're just copies of each other applies to relative symbols with the same section and value.

-cary
Comment 9 Cary Coutant 2017-01-24 20:43:24 UTC
> When the gold linker is used, it sometimes generates duplicated symbols.
> It's not something new, and can be seen for example in
> qtbase-opensource-src 5.7.1+dfsg-2:
> 
>   1081: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT ABS __bss_start@Qt_5
>   1082: 00000000004caf88     0 NOTYPE  GLOBAL DEFAULT ABS __bss_start@Qt_5

OK, I found these two symbols in libQt5Core.so.5.7.1 (sorry, I don't know what "qtbase-opensource-src 5.7.1+dfsg-2" refers to), but the test case only provides this file; it doesn't provide a repro for building it.


>   /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple
> definition of `__bss_start'
>   /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple
> definition of `_edata'
>   /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.7.1:(*IND*+0x0): multiple
> definition of `_end'

There's only one definition of each of these symbols in libQt5Gui.so.5.7.1, so I don't know why ld is complaining about duplicate symbols. These are shared libraries, and it shouldn't complain about duplicates across shared libraries.
Comment 10 Cary Coutant 2017-01-24 22:09:57 UTC
I had a theory about how to make this happen; it's similar to a problem I ran into with _GLOBAL_OFFSET_TABLE_ just last month. But no luck so far making it happen with __bss_start.

Is is possible that you're using a version of gold without the following commit? (This was backported to 2.28.)


commit 40d7d93ff412f4c34cde3daa04890d5cd2e0d9c9
Author: Cary Coutant <ccoutant@gmail.com>
Date:   Wed Dec 21 17:32:39 2016 -0800

    Fix problem where version script causes predefined hidden symbol to be defined twice.
    
    When creating a predefined hidden symbol like _GLOBAL_OFFSET_TABLE_, gold
    was incorrectly letting a version script add a version to the symbol,
    resulting in two copies of the symbol, both STB_LOCAL, but one of which
    was grouped in the globals part of the symbol table.
    
    gold/
            * symtab.cc (Symbol_table::define_special_symbol): Add is_forced_local
            parameter; if set, do not check version script.
            (Symbol_table::do_define_in_output_data): Pass is_forced_local for
            STB_LOCAL predefined symbols.
            (Symbol_table::do_define_in_output_segment): Likewise.
            (Symbol_table::do_define_in_output_segment): Likewise.
            (Symbol_table::do_define_as_constant): Likewise.
            * symtab.h (Symbol_table::define_special_symbol): Add is_forced_local
            parameter. Adjust all callers.
            * testsuite/Makefile.am (ver_test_8.sh): New test case.
            * testsuite/Makefile.in: Regenerate.
            * ver_test_8.sh: New test script.
Comment 11 Matthias Klose 2017-01-24 22:21:13 UTC
yes, this version of qt was built using 2.27.51.20161220. but the test case was done with a qt built using 2.27.90.20161231.