Bug 22266 - ld.gold produces invalid output when linking with --relocatable
Summary: ld.gold produces invalid output when linking with --relocatable
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gold (show other bugs)
Version: 2.30
: P2 normal
Target Milestone: ---
Assignee: Cary Coutant
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-05 22:25 UTC by Ben Gamari
Modified: 2017-11-28 01:45 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2017-11-14 00:00:00


Attachments
A testcase (1.80 KB, application/octet-stream)
2017-11-14 16:36 UTC, H.J. Lu
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ben Gamari 2017-10-05 22:25:49 UTC
I recently noticed that the Glasgow Haskell Compiler would break when built with split sections enabled. After a long process of debugging (https://ghc.haskell.org/trac/ghc/ticket/14291), I found that the cause is that ld.gold produces incorrect output when producing relocatable object files.

To demonstrate the issue I put together a small reproduction case fo build anr amd64 here:  https://github.com/bgamari/T14291-repro.

The testcase consists of three object files taken from a GHCd a linker script. When linked with `ld.bfd -r` the testcase produces the expected output. However, when `ld.gold -r` is used the resulting object file appears to be mangled. Specifically, the input object `Err.o` contains the symbol,

    $ objdump -D Err.o | grep -A6 '<r1Cy_closure>'
    0000000000000000 <r1Cy_closure>:
            ...
      10:   03 00                   add    (%rax),%eax
      12:   00 00                   add    %al,(%rax)
      14:   00 00                   add    %al,(%rax)
            ...

Whereas the output object produced by `ld.gold` contains,

    $ ld.gold --version
    GNU gold (GNU Binutils for Debian 2.28) 1.14
    Copyright (C) 2017 Free Software Foundation, Inc.
    This program is free software; you may redistribute it under the terms of
    the GNU General Public License version 3 or (at your option) a later version.
    This program has absolutely no warranty.

    $ ld.gold -r -T merge_sections.ld -o out.o Concurrent.o Base.o Err.o

    $ objdump -D out.o | grep -A6 '<r1Cy_closure>'
    0000000000001548 <r1Cy_closure>:
            ...

There is no other symbol named r1Cy_closure in any of the other linked objects.
Comment 1 Ben Gamari 2017-10-05 22:26:39 UTC
This has been observed with both binutils 2.28 packaged in Debian Stretch as well as the current `master` branch.
Comment 2 James Clarke 2017-10-14 23:49:17 UTC
Proposed patch: https://sourceware.org/ml/binutils/2017-10/msg00224.html
Comment 3 cvs-commit@gcc.gnu.org 2017-11-08 23:14:34 UTC
The master branch has been updated by Cary Coutant <ccoutant@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=333d0055f6f162c334c36f1946b6fcdb5c92b681

commit 333d0055f6f162c334c36f1946b6fcdb5c92b681
Author: James Clarke <jrtc27@jrtc27.com>
Date:   Wed Nov 8 15:13:53 2017 -0800

    Fix problems with -r.
    
    The fix committed for PR gold/19291 ended up breaking other cases. The
    commit added adjustment code to write_local_symbols, but in many cases
    compute_final_local_value_internal had already subtracted the output
    section's address. To fix this, all other adjustments are now removed, so
    only the one in write_local_symbols is left.
    
    gold/
    	PR gold/22266
    	* object.cc (Sized_relobj_file::compute_final_local_value_internal):
    	Drop relocatable parameter and stop adjusting output value based on
    	it.
    	(Sized_relobj_file::compute_final_local_value): Stop passing
    	relocatable to compute_final_local_value_internal.
    	(Sized_relobj_file::do_finalize_local_symbols): Ditto.
    	* object.h (Sized_relobj_file::compute_final_local_value_internal):
    	Drop relocatable parameter.
Comment 4 Cary Coutant 2017-11-08 23:16:54 UTC
Fixed on master branch.
Comment 5 H.J. Lu 2017-11-14 16:04:19 UTC
It fails with GCC 4.2 on i686:

FAIL: pr22266 (exit: 134)
Comment 6 H.J. Lu 2017-11-14 16:15:20 UTC
(gdb) r
Starting program: /export/build/gnu/binutils-gold/build-i686-linux/gold/testsuite/pr22266 

Program received signal SIGABRT, Aborted.
0x00164410 in __kernel_vsyscall ()
(gdb) bt
#0  0x00164410 in __kernel_vsyscall ()
#1  0x00b8adf0 in raise () from /lib/libc.so.6
#2  0x00b8c701 in abort () from /lib/libc.so.6
#3  0x0804841e in main ()
    at /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_main.c:7
(gdb) f 3
#3  0x0804841e in main ()
    at /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_main.c:7
7	    abort ();
(gdb) list
2	
3	extern int *p_int_from_a_2;
4	
5	int main (void) {
6	  if (*p_int_from_a_2 != 0x11223344)
7	    abort ();
8	  return 0;
9	}
(gdb) p *p_int_from_a_2
$1 = 0
(gdb)
Comment 7 H.J. Lu 2017-11-14 16:36:18 UTC
Created attachment 10589 [details]
A testcase

[hjl@gnu-6 tmp]$ ld.bfd -m elf_i386 -r -T /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_script.t -o pr22266_ar.o pr22266_a.o
[hjl@gnu-6 tmp]$ gcc -m32 pr22266_main.o pr22266_ar.o
[hjl@gnu-6 tmp]$ ./a.out
[hjl@gnu-6 tmp]$ /export/build/linux/binutils-x32/build-x86_64-linux-gnux32/gold/ld-new -m elf_i386 -r -T /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_script.t -o pr22266_ar.o pr22266_a.o
[hjl@gnu-6 tmp]$ gcc -m32 pr22266_main.o pr22266_ar.o
[hjl@gnu-6 tmp]$ ./a.out
Aborted
[hjl@gnu-6 tmp]$
Comment 8 H.J. Lu 2017-11-14 16:49:21 UTC
(In reply to H.J. Lu from comment #7)
> Created attachment 10589 [details]
> A testcase
> 
> [hjl@gnu-6 tmp]$ ld.bfd -m elf_i386 -r -T
> /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_script.t
> -o pr22266_ar.o pr22266_a.o
> [hjl@gnu-6 tmp]$ gcc -m32 pr22266_main.o pr22266_ar.o
> [hjl@gnu-6 tmp]$ ./a.out
> [hjl@gnu-6 tmp]$
> /export/build/linux/binutils-x32/build-x86_64-linux-gnux32/gold/ld-new -m
> elf_i386 -r -T
> /export/gnu/import/git/sources/binutils-gdb/gold/testsuite/pr22266_script.t
> -o pr22266_ar.o pr22266_a.o
> [hjl@gnu-6 tmp]$ gcc -m32 pr22266_main.o pr22266_ar.o
> [hjl@gnu-6 tmp]$ ./a.out
> Aborted
> [hjl@gnu-6 tmp]$

Revert

    Fix problems with -r.
    
    The fix committed for PR gold/19291 ended up breaking other cases. The
    commit added adjustment code to write_local_symbols, but in many cases
    compute_final_local_value_internal had already subtracted the output
    section's address. To fix this, all other adjustments are now removed, so
    only the one in write_local_symbols is left.
    
    gold/
            PR gold/22266
            * object.cc (Sized_relobj_file::compute_final_local_value_internal):
            Drop relocatable parameter and stop adjusting output value based on
            it.
            (Sized_relobj_file::compute_final_local_value): Stop passing
            relocatable to compute_final_local_value_internal.
            (Sized_relobj_file::do_finalize_local_symbols): Ditto.
            * object.h (Sized_relobj_file::compute_final_local_value_internal):
            Drop relocatable parameter.

fixed this testcase.
Comment 9 H.J. Lu 2017-11-14 16:55:51 UTC
Symbol table is wrong:

--- 2	2017-11-14 08:54:05.933477729 -0800
+++ 1	2017-11-14 08:54:23.381514511 -0800
@@ -84,7 +84,7 @@ Symbol table '.symtab' contains 10 entri
      5: 00000000     0 SECTION LOCAL  DEFAULT    7 
      6: 00000000     0 SECTION LOCAL  DEFAULT    9 
      7: 00000000     0 FILE    LOCAL  DEFAULT  ABS pr22266_a.c
-     8: fffffffc     4 OBJECT  LOCAL  DEFAULT    2 int_from_a_1
^^^^^^^^^^^^^^^^^^ Good

+     8: 00000000     4 OBJECT  LOCAL  DEFAULT    2 int_from_a_1
^^^^^^^^^^^^^^^^^^^ Bad
      9: 00000000     4 OBJECT  GLOBAL DEFAULT    4 p_int_from_a_2
Comment 10 H.J. Lu 2017-11-14 17:39:57 UTC
The content of .data.rel.ro section is also wrong.
Comment 11 H.J. Lu 2017-11-14 17:48:49 UTC
This isn't GCC version related.  It failed on all REL targets.
RELA targets are OK.
Comment 12 James Clarke 2017-11-14 18:01:40 UTC
(In reply to H.J. Lu from comment #9)
> Symbol table is wrong:
> 
> --- 2	2017-11-14 08:54:05.933477729 -0800
> +++ 1	2017-11-14 08:54:23.381514511 -0800
> @@ -84,7 +84,7 @@ Symbol table '.symtab' contains 10 entri
>       5: 00000000     0 SECTION LOCAL  DEFAULT    7 
>       6: 00000000     0 SECTION LOCAL  DEFAULT    9 
>       7: 00000000     0 FILE    LOCAL  DEFAULT  ABS pr22266_a.c
> -     8: fffffffc     4 OBJECT  LOCAL  DEFAULT    2 int_from_a_1
> ^^^^^^^^^^^^^^^^^^ Good
> 
> +     8: 00000000     4 OBJECT  LOCAL  DEFAULT    2 int_from_a_1
> ^^^^^^^^^^^^^^^^^^^ Bad
>       9: 00000000     4 OBJECT  GLOBAL DEFAULT    4 p_int_from_a_2

It's not that the symbol table was wrong. Your so-called "good" version is wrong; the symbol is at offset 0 from its section .data, so it really should have value 0 in the object file like it does now. The problem is that, it seems, somewhere, the REL was accounting for this (which is probably why the symbol value issue was not noticed earlier), and now that the symbol value is fixed, the REL case is broken. You can see for yourself that the problem is that the object file contents at "p_int_from_a_2" is 0x4, but this addend should be 0.
Comment 13 Ben Gamari 2017-11-27 14:08:12 UTC
Has this been re-applied?
Comment 14 cvs-commit@gcc.gnu.org 2017-11-28 01:34:39 UTC
The master branch has been updated by Cary Coutant <ccoutant@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=033bfb739b525703bfe23f151d09e9beee3a2afe

commit 033bfb739b525703bfe23f151d09e9beee3a2afe
Author: Cary Coutant <ccoutant@gmail.com>
Date:   Mon Nov 27 17:32:55 2017 -0800

    Fix symbol values and relocation addends for relocatable links.
    
    The fix for PR 19291 broke some other cases where -r is used with scripts,
    as reported in PR 22266. The original fix for PR 22266 ended up breaking
    many cases for REL targets, where the addends are stored in the section data,
    and are not being adjusted properly.
    
    The problem was basically that in a relocatable output file (ET_REL),
    symbol values are supposed to be relative to the start address of their
    section. Usually in a relocatable file, all sections start at 0, so the
    failure to get this right is often irrelevant, but with a linker script,
    we occasionally see an output section whose starting address is not 0,
    and gold would occasionally write a symbol with its relocated value instead
    of its section-relative value.
    
    This patch reverts the recent fix for PR 22266 as well as my original fix
    for PR 19291. The original fix moved the symbol value adjustment to
    write_local_symbols, but neglected to undo a few places where the adjustment
    was also being applied, resulting in an occasional double adjustment. The
    more recent fix removed those other adjustments, but then failed to
    re-account for the adjustment when rewriting the relocations on REL targets.
    
    With the old attempts reverted, we now apply the symbol value adjustment to
    the one case that had been missed (non-section symbols in merge sections).
    But now we also need to account for the adjustment when rewriting the addends
    for RELA relocations.
    
    gold/
    	PR gold/19291
    	PR gold/22266
    	* object.cc (Sized_relobj_file::compute_final_local_value_internal):
    	Revert changes from 2017-11-08 patch.  Adjust symbol value in
    	relocatable links for non-section symbols.
    	(Sized_relobj_file::compute_final_local_value): Revert changes from
    	2017-11-08 patch.
    	(Sized_relobj_file::do_finalize_local_symbols): Likewise.
    	(Sized_relobj_file::write_local_symbols): Revert changes from
    	2015-11-25 patch.
    	* object.h (Sized_relobj_file::compute_final_local_value_internal):
    	Revert changes from 2017-11-08 patch.
    	* powerpc.cc (Target_powerpc::relocate_relocs): Adjust addend for
    	relocatable links.
    	* target-reloc.h (relocate_relocs): Adjust addend for relocatable links.
    	* testsuite/pr22266_a.c (hello): New function.
    	* testsuite/pr22266_main.c (main): Add test for merge sections.
    	* testsuite/pr22266_script.t: Add rule for .rodata.
Comment 15 Cary Coutant 2017-11-28 01:45:25 UTC
Should be fixed on trunk now.