Bug 12131 - ARM ABI violations in both assembler and linker
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.20
Importance: P2 normal
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-18 14:30 UTC by Stephen Clarke
Modified: 2022-08-25 23:17 UTC

See Also:
Host:
Target: arm*-*-*
Build:
Last reconfirmed:


Attachments
Test example (533 bytes, application/octet-stream)
2010-10-18 14:32 UTC, Stephen Clarke

Description Stephen Clarke 2010-10-18 14:30:38 UTC
I'm seeing a problem with ARM GOT relocations, specifically
when the .got.plt section is not placed at the start of its output
section.  I have reproduced the problem with the binutils.weekly.bz2
tarball snapshot dated Oct 12 2010.

Here's my test example (also in the attachment):

$ cat gottest.s
        .text
_start:
        .global _start
        movw r0, :lower16:_GLOBAL_OFFSET_TABLE_-(here+8)
        movt r0, :upper16:_GLOBAL_OFFSET_TABLE_-(here+8)
        ldr  r1, littab
here:
littab:
        .word _GLOBAL_OFFSET_TABLE_-(here+8)

        .section .jcr, "aw"
        .space 1000

$ cat gottest.ld
OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm",
              "elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SECTIONS
{
 .text           :
 {
    *(.text)
    *(.glue_7t) *(.glue_7) *(.vfp11_veneer)
 }  =0
  .data           : ALIGN (8)
  {
    KEEP (*(.jcr))
    *(.got.plt) *(.got)
    *(.data)
  }
}

I build like this:
$ arm-none-eabi-as -o gottest.o gottest.s
$ arm-none-eabi-ld --script gottest.ld gottest.o

My expectation is that the code will load the same value
into r0 and r1, but here is the disassembly:
$ arm-none-eabi-objdump -d a.out

a.out:     file format elf32-littlearm


Disassembly of section .text:

00000000 <_start>:
   0:   e30003e4        movw    r0, #996        ; 0x3e4
   4:   e3400000        movt    r0, #0
   8:   e51f1004        ldr     r1, [pc, #-4]   ; c <here>

0000000c <here>:
   c:   fffffffc        .word   0xfffffffc


It seems that r0 is being loaded with 996, and r1 is loaded
with -4.  The difference (1000) is exactly the size of the .jcr
section I placed at the start of .data.
Looking at the resolved symbol values:
$ arm-none-eabi-nm a.out 
000003f8 d _GLOBAL_OFFSET_TABLE_
00000000 T _start
0000000c t here
0000000c t littab

it seems the correct value should be 0x3f8-(0xc+0x8) = 0x3e4 = 996, i.e. the
movw/movt sequence is consistent with the symbol values, while the word
loaded by the ldr instruction (-4) is not.
However, the movw/movt sequence does not work when I use it in my
compiler in combination with R_ARM_GOT32 to get the address of a GOT
entry, whereas the ldr instruction does work.
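
The arithmetic above can be checked with a short Python sketch. The constants are copied from the nm and objdump output in this report. The helper name to_signed32 and the inferred start of .data are illustrative only, and the last step assumes that _GLOBAL_OFFSET_TABLE_ sits directly after the 1000-byte .jcr contents with no padding.

```python
# Values copied from the nm and objdump output in this report.
GOT_SYMBOL = 0x3F8   # _GLOBAL_OFFSET_TABLE_
HERE = 0x00C         # here / littab
JCR_SIZE = 1000      # ".space 1000" in .jcr


def to_signed32(word):
    """Interpret a 32-bit word as a two's-complement signed integer."""
    return word - (1 << 32) if word & 0x80000000 else word


# Value both instruction sequences are expected to load.
expected = GOT_SYMBOL - (HERE + 8)

# movt supplies the upper 16 bits (#0), movw the lower 16 bits (#996).
movw_movt = (0x0000 << 16) | 0x03E4

# Literal-pool word loaded by the ldr instruction.
literal = to_signed32(0xFFFFFFFC)

print(hex(expected), expected)        # value derived from the symbols
print(movw_movt == expected)          # movw/movt matches the symbols
print(literal)                        # value loaded into r1
print(movw_movt - literal == JCR_SIZE)

# Assumption: the GOT symbol directly follows the .jcr contents, so this
# is the address where the .data output section would start.
inferred_data_start = GOT_SYMBOL - JCR_SIZE
print(hex(inferred_data_start))
```

If that assumption holds, the literal-pool word looks as if it had been computed from the start of the .data output section rather than from the address of _GLOBAL_OFFSET_TABLE_.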

Everything is fine and consistent iff the .got sections are
placed right at the start of the .data output section (e.g. if the
.jcr section has size zero).  But as soon as there is something before
them, I get problems.

Steve.
Comment 1 Stephen Clarke 2010-10-18 14:32:31 UTC
Created attachment 5066
Test example
Comment 2 Alan Modra 2022-08-25 23:17:10 UTC
Yes, the GNU ARM assembler and linker make assumptions about the layout of the GOT section.  So if you write your own linker scripts you are likely to trip over those bugs.  Worse, linker relocation processing does not follow the ARM ABI.

The testcase ".word _GLOBAL_OFFSET_TABLE_-(here+8)" is going wrong firstly in the assembler.  This dodgy code, which I think should be deleted:

#ifdef OBJ_ELF
  if ((code == BFD_RELOC_32_PCREL || code == BFD_RELOC_32)
      && GOT_symbol
      && fixp->fx_addsy == GOT_symbol)
    {
      code = BFD_RELOC_ARM_GOTPC;
      reloc->addend = fixp->fx_offset = reloc->address;
    }
#endif

converts what would be R_ARM_REL32 or R_ARM_ABS32 relocations against _GLOBAL_OFFSET_TABLE_ into an R_ARM_BASE_PREL against _GLOBAL_OFFSET_TABLE_.  This is wrong for two reasons:
1) It cannot be correct to do this for both R_ARM_REL32 and R_ARM_ABS32 regardless of how the linker treats the final relocation.
2) If the linker correctly implements R_ARM_BASE_PREL then the assembler is assuming that _GLOBAL_OFFSET_TABLE_ is defined at the start of the GOT output section.

Then the GNU ARM linker violates the ABI by ignoring the symbol on R_ARM_BASE_PREL, assuming the reloc is only used with symbols defined in the GOT output section.  It also assumes _GLOBAL_OFFSET_TABLE_ is defined at the start of the GOT output section in relocations like R_ARM_GOTOFF32.
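
As a rough illustration of the arithmetic only (this is not the binutils code), the Python sketch below models simplified forms of the relocation expressions as I understand them from the ELF for the Arm Architecture tables, with the Thumb bit left out. The function names, the addend of -8, the inferred start of .data at 0x10, and the symbol address 0x500 are all assumptions made for this example. The origin passed to base_prel was chosen because it reproduces the word seen in the testcase; what the ABI's B(S) would actually be for this link depends on the segment layout and is not modelled here.

```python
def rel32(S, A, P):
    """R_ARM_REL32, simplified: S + A - P (Thumb bit omitted)."""
    return S + A - P


def base_prel(origin, A, P):
    """R_ARM_BASE_PREL, simplified: B(S) + A - P, with B(S) passed in as origin."""
    return origin + A - P


def gotoff32(S, A, got_org):
    """R_ARM_GOTOFF32, simplified: S + A - GOT_ORG (Thumb bit omitted)."""
    return S + A - got_org


def as_word(value):
    """Truncate to the 32-bit word that would be stored in the output file."""
    return value & 0xFFFFFFFF


GOT_SYMBOL = 0x3F8               # _GLOBAL_OFFSET_TABLE_, from nm
PLACE = 0x00C                    # address of the .word (here / littab)
ADDEND = -8                      # assumption: from "-(here+8)" with here == PLACE
DATA_START = GOT_SYMBOL - 1000   # assumption: GOT directly follows the 1000-byte .jcr

# Symbol-based result: what ".word _GLOBAL_OFFSET_TABLE_-(here+8)" asks for.
symbol_based = as_word(rel32(GOT_SYMBOL, ADDEND, PLACE))

# Origin-based result: the symbol is ignored and a section start is used instead.
origin_based = as_word(base_prel(DATA_START, ADDEND, PLACE))

print(hex(symbol_based))
print(hex(origin_based))

# The same mismatch for a GOT-relative offset of a hypothetical symbol at 0x500.
HYPOTHETICAL_SYM = 0x500
skew = gotoff32(HYPOTHETICAL_SYM, 0, DATA_START) - gotoff32(HYPOTHETICAL_SYM, 0, GOT_SYMBOL)
print(skew)
```

Under these assumptions the symbol-based value matches the movw/movt result (0x3e4), the origin-based value matches the word in the literal pool (0xfffffffc), and the two choices of GOT origin differ by the size of whatever precedes the GOT in its output section.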

Am I going to do anything about this?  No.  Not being an ARM maintainer, I don't know enough history to judge the likelihood of breaking people's code that might rely on these bugs being present.