Bug 20475 - Linking PIC executables with a linker script that does not place _GLOBAL_OFFSET_TABLE_ results in incorrect relocation
Summary: Linking PIC executables with a linker script that does not place _GLOBAL_OFFS...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.25
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-17 05:42 UTC by whitequark
Modified: 2024-02-29 23:44 UTC (History)
2 users (show)

See Also:
Host:
Target: or1k-*-*
Build:
Last reconfirmed:


Attachments
Patch (637 bytes, patch)
2016-08-17 06:05 UTC, whitequark
Details | Diff
OR1K target patch (897 bytes, patch)
2016-08-17 11:02 UTC, whitequark
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description whitequark 2016-08-17 05:42:22 UTC
I am using ld 2.25.1 built for or1k-elf target, and linking code built with -fPIC. On OR1K in PIC mode, addresses of global data are gathered with the following code (generated by the OR1K clang port):

.Ltmp2:
        l.movhi r3, gotpchi(_GLOBAL_OFFSET_TABLE_+(.Ltmp2-.L0$pb))
.Ltmp3:
        l.ori   r3, r3, gotpclo(_GLOBAL_OFFSET_TABLE_+(.Ltmp3-.L0$pb))
        l.add   r3, r9, r3
        l.movhi r4, gotoffhi(x)
        l.ori   r4, r4, gotofflo(x)
        l.add   r3, r4, r3

If I am using a linker script that does not explicitly do anything to the .got* sections, then it lays out the output ELF file as follows (without emitting any diagnostic message):

40047258 l    d  .got   00000000 .got
40047264 l    d  .got.plt       00000000 .got.plt
40047264 l     O .got.plt       00000000 _GLOBAL_OFFSET_TABLE_

This, of course, breaks the code above, because the gotoffhi/lo relocations calculate the offset from the start of the .got section, but the _GLOBAL_OFFSET_TABLE_ symbol points at the end of the .got.plt section.

The following addition to the linker script rectifies the problem:

	.got : {
		_GLOBAL_OFFSET_TABLE_ = .;
		*(.got)
	} > runtime

	.got.plt : {
		*(.got.plt)
	} > runtime

However, (1) I have never seen linker scripts that explicitly place _GLOBAL_OFFSET_TABLE_, and (2) above all else, this behavior seems clearly wrong. From what I can see, I think this is a target-specific OR1K bug, but I am not sure.
Comment 1 whitequark 2016-08-17 06:04:38 UTC
The previous comment is not quite right.

1) The _GLOBAL_OFFSET_TABLE_ did not point at the end of .got.plt but at the start.
2) This is a bug in the architecture-independent ELF backend.

The nature of the bug is very simple--it is an incorrect reuse of a global variable. I have attached a patch. I've looked into writing a test but I am not sure how to do it for this case. It would be enough to check for

        .got       00000000 _GLOBAL_OFFSET_TABLE_

instead of

        .got.plt       00000000 _GLOBAL_OFFSET_TABLE_

in objdump -t output.
Comment 2 whitequark 2016-08-17 06:05:09 UTC
Created attachment 9439 [details]
Patch
Comment 3 whitequark 2016-08-17 11:01:48 UTC
Actually, I might be wrong about this, since the comment in elflink.c implies that the behavior is intentional. I'll attach another patch, which fixes the logic of relocating R_OR1K_GOTOFF_* instead, making it follow the logic in elflink.c.
Comment 4 whitequark 2016-08-17 11:02:27 UTC
Created attachment 9442 [details]
OR1K target patch
Comment 5 Alan Modra 2016-08-17 23:31:05 UTC
Yes, your second patch is much better.  The first one probably would break some other target.  Even better would be to use the value of _GLOBAL_OFFSET_TABLE_

	  relocation
	    -= (htab->root.hgot->root.u.def.value
		+ htab->root.hgot->root.u.def.section->output_offset
		+ htab->root.hgot->root.u.def.section->output_section->vma);

Assuming the or1k backend doesn't have other bugs in this area, that would allow a script to define _GLOBAL_OFFSET_TABLE_ at somewhere other than the start of .got.plt.  If the above tests out OK, please post the patch to binutils@sourceware.org noting my preapproval, and commit.
Comment 6 whitequark 2016-08-20 11:09:58 UTC
I've verified that your suggested fix works. However, I don't have commit bit for binutils.
Comment 7 Sourceware Commits 2016-08-23 02:54:23 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=eacfca90f1ff457d3a7be9d593040218b6208d2b

commit eacfca90f1ff457d3a7be9d593040218b6208d2b
Author: Alan Modra <amodra@gmail.com>
Date:   Tue Aug 23 12:20:59 2016 +0930

    R_OR1K_GOTOFF_* relocations
    
    	PR 20475
    	* elf32-or1k.c (or1k_elf_relocate_section): Offset from
    	_GLOBAL_OFFSET_TABLE_, not start of .got section.
Comment 8 Alan Modra 2016-08-23 02:56:23 UTC
fixed.
Comment 9 whitequark 2016-08-30 08:09:48 UTC
Reopening since further testing indicates that Alan's suggestion only worked partially. My minimal testcase, as described in the first comment, works as expected now, but the non-minimized one crashes, in a different way. Specifically, it contains the rough equivalent of the following C code in Rust (built as -fPIC as before):

    struct fmt {
      void (*fn)(void *arg);
      void *arg;
    }

    void f() {
      struct fmt x = {
        .fn  = h,
        .arg = ...,
      };
      g(&x);
    }

    void g(struct fmt *x) {
      x->fn(x->arg);
    }

    void h(void *arg);

The way it crashes is when calling x->fn(); it jumps into a garbage address.

It is not yet clear to me what is the root cause. However, I suspect that R_OR1K_GOT16 is at fault, because pinning _GLOBAL_OFFSET_TABLE_ to the beginning of .got as before makes the code work as expected.
Comment 10 whitequark 2016-08-30 08:33:19 UTC
Unfortunately the problem resists my attempts to write a minimal testcase. I do not have any more time to spend debugging this, although I will gladly test any suggestions.

As a fix, I suggest forcing _GLOBAL_OFFSET_TABLE_ to the beginning of .got and this is the workaround I will be using locally.
Comment 11 Alan Modra 2024-02-29 23:44:41 UTC
I think this has been fixed, if not a report against 2.25 is way too old.