Bug 16794 - gold ignores R_386_GOTOFF addend
Summary: gold ignores R_386_GOTOFF addend
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gold (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: 2.34
Assignee: Alan Modra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-01 20:59 UTC by Rafael Ávila de Espíndola
Modified: 2019-09-28 07:20 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
testcase (1.73 KB, application/x-tar)
2014-04-01 20:59 UTC, Rafael Ávila de Espíndola
Details
testcase 2 (540 bytes, text/plain)
2019-09-10 03:51 UTC, Ryan Prichard
Details
possible fix (376 bytes, patch)
2019-09-11 03:04 UTC, Alan Modra
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Rafael Ávila de Espíndola 2014-04-01 20:59:12 UTC
Created attachment 7516 [details]
testcase

The attached testcase has both 32 and 64 bit versions of a test. The file test.o contains relocations to a mergeable section. In the 32 bit case it has:


00000012  00000509 R_386_GOTOFF      00000000   .rodata.str1.1
0000001c  00000509 R_386_GOTOFF      00000000   .rodata.str1.1

The "implicit addend" are in the two lea instructions:

objdump  -d test.o

  10:	8d 83 07 00 00 00    	lea    0x7(%ebx),%eax
  16:	89 44 24 04          	mov    %eax,0x4(%esp)
  1a:	8d 83 00 00 00 00    	lea    0x0(%ebx),%eax

On the gold produced output, the distance between the two is still 7 (0x11ac- 0x11a5)

 80484e0:       8d 83 5b ee ff ff       lea    -0x11a5(%ebx),%eax
 80484e6:       89 44 24 04             mov    %eax,0x4(%esp)
 80484ea:       8d 83 54 ee ff ff       lea    -0x11ac(%ebx),%eax

The the actual section has been modified to merge the strings, so that is no longer valid.

Using bfd ld, the offset is updated:

 8048460:       8d 83 4d ee ff ff       lea    -0x11b3(%ebx),%eax
 8048466:       89 44 24 04             mov    %eax,0x4(%esp)
 804846a:       8d 83 4c ee ff ff       lea    -0x11b4(%ebx),%ea

Everything works on 64 bits. I assume that is because it uses RELA relocations. In 64 bits the test.o file has

000000000003  000500000002 R_X86_64_PC32     0000000000000000 .rodata.str1.1 + 0
00000000000a  000500000002 R_X86_64_PC32     0000000000000000 .rodata.str1.1 + 7


   0:	48 8d 3d 00 00 00 00 	lea    0x0(%rip),%rdi
   7:	48 8d 35 00 00 00 00 	lea    0x0(%rip),%rsi

and the final binary is update correctly

  400530:       48 8d 3d ad 00 00 00    lea    0xad(%rip),%rdi
  400537:       48 8d 35 a7 00 00 00    lea    0xa7(%rip),%rsi
Comment 1 Yin Ma 2016-06-03 18:56:37 UTC
This patch made all DWARF DW_AT_name generation are wrong for ARM. Please take a look again. I would like to know how LLVM need to work with gold linker. Is this patch still needed?

The detail is here 
https://llvm.org/bugs/show_bug.cgi?id=27985
Comment 2 Rafael Ávila de Espíndola 2016-06-04 00:19:28 UTC
(In reply to Yin Ma from comment #1)
> This patch made all DWARF DW_AT_name generation are wrong for ARM. Please
> take a look again. I would like to know how LLVM need to work with gold
> linker. Is this patch still needed?

Sorry, I don' this patch was ever committed. You man that this bug causes problems?
 
> The detail is here 
> https://llvm.org/bugs/show_bug.cgi?id=27985
Comment 3 Fangrui Song 2019-07-09 07:31:49 UTC
This has been fixed, though I'm not sure which commit fixed the bug. binutils 2.26 is known to work.
Comment 4 Ryan Prichard 2019-09-10 03:50:04 UTC
I looked at the testcase briefly, and it looked like test.o contained two strings, "xabcde" and "abcde". ld.bfd merged the two, but ld.gold didn't. I used "GNU gold (GNU Binutils for Debian 2.31.1) 1.16".

However, if I write a test case where the strings/constants are identical, they *are* merged, and gold doesn't reliably update the GOT offset on the relocations. Gold seems to handle a relocation to a local symbol within a mergeable section, but not a relocation to the section directly.

I'll attach a test case that writes a couple of x86-32 assembly files and runs them on a Linux machine. I'm using the syntax, "(.rodata.str1.1+4)@GOTOFF(%ebx)" on x86-32 to reference the section rather than a symbol.

LLVM has a workaround for this bug. It was briefly reverted, then restored: https://reviews.llvm.org/D64327.
Comment 5 Ryan Prichard 2019-09-10 03:51:51 UTC
Created attachment 11983 [details]
testcase 2
Comment 6 Ryan Prichard 2019-09-10 04:36:28 UTC
I attached another test case to the Android NDK issue, https://github.com/android/ndk/issues/1076.
Comment 7 Alan Modra 2019-09-10 05:25:46 UTC
Your assembly is broken.  In a string merge section, sym+offset is valid only for offset within the string at sym.

I assume that the original testcase had the same sort of error as there too the relocs are against the section symbol.
Comment 8 Ryan Prichard 2019-09-10 20:14:39 UTC
> In a string merge section, sym+offset is valid only for offset within the string at sym.

The GNU assembler is apparently willing to use section+offset to identify a mergeable string for some relocations (maybe non-PC-relative / non-GOT relocations?).

e.g.:

extern int puts(const char*);
void func() { puts("foo1"); }
void func2() { puts("foo2"); }

If I compile with "gcc -m32 -fno-pie -O2 -c", I see these relocations in the object file:

00000004  00000501 R_386_32               00000000   .rodata.str1.1
00000024  00000501 R_386_32               00000000   .rodata.str1.1

The assembly references local symbols (.LC0 and .LC1) within .rodata.str1.1, but the assembler has omitted the symbols in favor of relocations to the section itself. The second REL relocation has an addend of 5 stored inside the .text content.

gold appears to handle merging section+offset relocations fine when they're R_386_32, but not when they're R_386_GOTOFF. Both relocation types are documented in the psABI as allowing an addend.

Also:

> I looked at the testcase briefly, and it looked like test.o contained two strings, "xabcde" and "abcde". ld.bfd merged the two, but ld.gold didn't. I used "GNU gold (GNU Binutils for Debian 2.31.1) 1.16".

gold's overlapping string merging happens when -O2 is passed to ld.gold, and I hadn't passed -O2. When I do, I see the reported/unwanted ld.gold behavior.
Comment 9 Alan Modra 2019-09-11 02:21:46 UTC
Oops, my memory of the string merge restriction on symbol addends was wrong.

Yes, you can use section_sym+addend to point at strings in a string merge section.
Comment 10 Alan Modra 2019-09-11 03:04:40 UTC
Created attachment 11986 [details]
possible fix

It's a long time since I maintained x86 binutils, but this untested patch will likely fix the problem.
Comment 11 Alan Modra 2019-09-26 02:21:55 UTC
OK, so my fix wasn't correct since Relocate_functions<32, false>::rel32(view, value) applies the addend again.  This would have worked:

-       elfcpp::Elf_types<32>::Elf_Addr value;
-       value = (psymval->value(object, 0)
-                - target->got_plt_section()->address());
-       Relocate_functions<32, false>::rel32(view, value);
+       typedef typename elfcpp::Swap<32, false>::Valtype Valtype;
+       Valtype* wv = reinterpret_cast<Valtype*>(view);
+       Valtype addend = elfcpp::Swap<32, false>::readval(wv);
+       Valtype value = (psymval->value(object, addend)
+                        - target->got_plt_section()->address());
+       elfcpp::Swap<32, false>::writeval(wv, value);

However, that looks very much like a pcrel32 using object/psymval, thus

	elfcpp::Elf_types<32>::Elf_Addr reladdr;
	reladdr = target->got_plt_section()->address();
	Relocate_functions<32, false>::pcrel32(view, object, psymval, reladdr);
Comment 12 cvs-commit@gcc.gnu.org 2019-09-28 07:19:24 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ea8e302e12bd155a3acd79290ec87d7dda2cce61

commit ea8e302e12bd155a3acd79290ec87d7dda2cce61
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Sep 11 13:22:42 2019 +0930

    PR16794, gold ignores R_386_GOTOFF addend
    
    An R_386_GOTOFF relocation has an addend, typically used when a
    symbol can be replaced by its section symbol plus an offset.
    psymval->value(object,0) is quite wrong then, fix it.
    
    	PR 16794
    	* i386.cc (Target_i386::Relocate::relocate <R_386_GOTOFF>): Don't
    	ignore addend, apply using pcrel32.
    	* x86_64.cc (Target_x86_64::Relocate::relocate <R_X86_64_GOTOFF64>):
    	Similarly use pcrel64.
Comment 13 Alan Modra 2019-09-28 07:20:25 UTC
Fixed for 2.34.