Bug 22764 - [2.30 Regression] ld fails to link 4.13 and 4.15 kernels on aarch64-linux-gnu
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.30
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-31 12:16 UTC by Matthias Klose
Modified: 2018-02-22 12:54 UTC
CC List: 8 users

See Also:
Host:
Target: aarch64-linux-gnu
Build:
Last reconfirmed:


Description Matthias Klose 2018-01-31 12:16:02 UTC
Seen with 4.13 and 4.15 kernel builds on aarch64-linux-gnu; linking works with the 2.29 branch.
Build log at
https://launchpadlibrarian.net/355195922/buildlog_ubuntu-bionic-arm64.linux_4.15.0-6.7_BUILDING.txt.gz

  LD      vmlinux.o
  MODPOST vmlinux.o
ld: arch/arm64/kernel/head.o: relocation R_AARCH64_ABS32 against `_kernel_offset_le_lo32' can not be used when making a shared object
ld: arch/arm64/kernel/efi-entry.stub.o: relocation R_AARCH64_ABS32 against `__efistub_stext_offset' can not be used when making a shared object
arch/arm64/kernel/head.o: In function `kimage_vaddr':
(.idmap.text+0x0): dangerous relocation: unsupported relocation
arch/arm64/kernel/head.o: In function `__primary_switch':
/<<PKGBUILDDIR>>/arch/arm64/kernel/head.S:772:(.idmap.text+0x340): dangerous relocation: unsupported relocation
/<<PKGBUILDDIR>>/arch/arm64/kernel/head.S:772:(.idmap.text+0x348): dangerous relocation: unsupported relocation
/<<PKGBUILDDIR>>/Makefile:1026: recipe for target 'vmlinux' failed
make[2]: *** [vmlinux] Error 1
Comment 1 Ard Biesheuvel 2018-01-31 12:24:37 UTC
The arm64 Linux kernel uses absolute ELF symbols to expose to the program itself various build-time constants whose values are only known after linking, for example:

- The size of the loadable image in little-endian format (even on BE builds)
- The memory footprint of the image in LE
- The offset to and size of the RELA section, relative to the start of the image (on KASLR kernels)

0000000000000000 A _kernel_flags_le_hi32
000000000000000a A _kernel_flags_le_lo32
0000000000000000 A _kernel_offset_le_hi32
0000000000080000 A _kernel_offset_le_lo32
0000000000000000 A _kernel_size_le_hi32
00000000013b5000 A _kernel_size_le_lo32
00000000004afa00 A __pecoff_data_rawsize
000000000051d000 A __pecoff_data_size
0000000000000200 A PECOFF_FILE_ALIGNMENT
0000000000fa3898 A __rela_offset
00000000002e2ab0 A __rela_size
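
As an illustration of the `_le_lo32`/`_le_hi32` pairs in the listing above, the following minimal Python sketch splits a 64-bit build-time constant into two 32-bit halves and lays them out in little-endian byte order. The helper names `le_halves` and `header_field_bytes` are hypothetical, and the idea that each pair forms one 64-bit little-endian field is an assumption inferred from the symbol names, not taken from the kernel sources.

```python
import struct


def le_halves(value):
    """Split a 64-bit constant into (lo32, hi32)."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    return value & 0xFFFFFFFF, value >> 32


def header_field_bytes(value):
    """Bytes of a 64-bit field stored as two little-endian 32-bit words."""
    lo32, hi32 = le_halves(value)
    # "<" forces little-endian layout regardless of the host byte order.
    return struct.pack("<II", lo32, hi32)


# 0x13b5000 is the _kernel_size_le_lo32 value from the listing above.
lo, hi = le_halves(0x13B5000)
print(hex(lo), hex(hi))
print(header_field_bytes(0x80000).hex())
```

Because both halves are plain numbers rather than addresses, nothing about them depends on where the image is eventually loaded.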

The KASLR kernel is a PIE executable, and is no longer allowed to refer to these symbols via R_AARCH64_ABS32 relocations, resulting in the build error reported by Matthias.

So please explain how a PIE executable should refer to such absolute ELF symbols if not via R_AARCH64_ABS32 relocations.
Comment 2 Ard Biesheuvel 2018-01-31 12:58:17 UTC
From commit 79e741920446582bd0e09f3e2b9f899c258efa56

    R_AARCH64_ABS64 under LP64 is allowed in shared object and a dynamic relocation entry
    will be generated. This allows the dynamic linker to do further symbol resolution.
    R_AARCH64_ABS32 likewise is allowed in shared object, however under ILP32 abi.

    The original behavior for R_AARCH64_ABS32 under LP64 is that, it's allowed
    in shared object and silently resolved at static linking time.
    No dynamic relocation entry is generate for it.

One could argue that absolute relocations against *absolute* ELF symbols should always be resolved at static link time, but I am aware that, for historical reasons, symbols like _GLOBAL_OFFSET_TABLE_ are emitted as absolute, making this difficult to realise in practice.
Comment 3 H.J. Lu 2018-01-31 13:08:16 UTC
(In reply to Ard Biesheuvel from comment #2)
> 
> One could argue that absolute relocations against *absolute* ELF symbols
> should always be resolved at static link time, but I am aware that, for
> historical reasons, symbols like _GLOBAL_OFFSET_TABLE_ are emitted as
> absolute, making this difficult to realise in practice.

Not true on x86:

  3987: 00000000003dd000     0 OBJECT  LOCAL  DEFAULT   33 _GLOBAL_OFFSET_TABLE_
Comment 4 Ard Biesheuvel 2018-01-31 13:11:21 UTC
(In reply to H.J. Lu from comment #3)
> (In reply to Ard Biesheuvel from comment #2)
> > 
> > One could argue that absolute relocations against *absolute* ELF symbols
> > should always be resolved at static link time, but I am aware that, for
> > historical reasons, symbols like _GLOBAL_OFFSET_TABLE_ are emitted as
> > absolute, making this difficult to realise in practice.
> 
> Not true on x86:
> 
>   3987: 00000000003dd000     0 OBJECT  LOCAL  DEFAULT   33
> _GLOBAL_OFFSET_TABLE_

Oh right.

Well, in any case, please refer to this ticket

https://sourceware.org/bugzilla/show_bug.cgi?id=20402

and the link in the comments for more discussion on this topic.
Comment 5 Peter Smith 2018-01-31 18:28:35 UTC
I think that the new error message for R_AARCH64_ABS32 from the linker makes some sense if the destination symbol is section relative as there is no dynamic relocation supported and truncating a 64-bit address is most likely a mistake.

However if the destination symbol is absolute the linker shouldn't make the assumption that the symbol is an address so it should resolve the relocation at static link-time.
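
To make the truncation concern concrete, here is a small Python sketch of a range check in the style the AArch64 ELF ABI describes for the 16-bit and 32-bit absolute data relocations (the result must satisfy -2^(n-1) <= X < 2^n). The function name `fits_abs` is hypothetical and the 64-bit address is an illustrative value, not one taken from the build log.

```python
def fits_abs(value, bits):
    """Return True if value can be stored in a bits-wide absolute field.

    Mirrors the ABI-style check -2**(bits-1) <= value < 2**bits, which
    accepts both signed and unsigned interpretations of the field.
    """
    return -(1 << (bits - 1)) <= value < (1 << bits)


# Small absolute constant, e.g. _kernel_offset_le_lo32 from comment 1.
print(fits_abs(0x80000, 32))

# Illustrative 64-bit kernel virtual address (an assumption for this sketch).
print(fits_abs(0xFFFF000008080000, 32))
```

A small absolute constant passes the check, while a full 64-bit address does not, which is why treating every symbol as an address is too strict.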

I think the test:
	case BFD_RELOC_AARCH64_16:
#if ARCH_SIZE == 64
	case BFD_RELOC_AARCH64_32:
#endif
	  if (bfd_link_pic (info)
	      && (sec->flags & SEC_ALLOC) != 0
	      && (sec->flags & SEC_READONLY) != 0)
            ... Give error message
Should check that the symbol is not absolute as well.

Unfortunately, I can't think of a workaround for the case where the value of the symbol has to be in the RO segment. For some reason the check only applies to RO sections, which does not make a lot of sense to me: an R_AARCH64_ABS32 from a RW section to an address will truncate it in the same way as one from a RO section. No dynamic relocation is generated for either RO or RW, so I don't know why the distinction has been made.
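
The suggestion above can be sketched as a small decision function. This is a simplified Python model of the quoted condition, not the BFD code itself: the `Reloc` class and both function names are hypothetical, and the model ignores everything else `elfNN_aarch64_check_relocs` does.

```python
from dataclasses import dataclass


@dataclass
class Reloc:
    pic_link: bool      # linking a shared object or PIE (bfd_link_pic)
    sec_alloc: bool     # containing section has SEC_ALLOC
    sec_readonly: bool  # containing section has SEC_READONLY
    sym_absolute: bool  # target symbol is absolute (SHN_ABS)


def rejected_as_quoted(r):
    """Model of the quoted condition: the symbol kind is not considered."""
    return r.pic_link and r.sec_alloc and r.sec_readonly


def rejected_with_abs_check(r):
    """Same condition, but absolute symbols are let through."""
    return rejected_as_quoted(r) and not r.sym_absolute


# An ABS32 in read-only kernel text against an absolute symbol.
kernel_case = Reloc(pic_link=True, sec_alloc=True,
                    sec_readonly=True, sym_absolute=True)
print(rejected_as_quoted(kernel_case), rejected_with_abs_check(kernel_case))
```

In this model, section-relative symbols in read-only sections are still rejected; only the absolute-symbol case changes.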
Comment 6 Matthias Klose 2018-02-02 16:50:02 UTC
systemd on aarch64, configured with EFI support, fails with a similar relocation error:

ld -o src/boot/efi/systemd_boot.so -T /usr/lib/elf_aarch64_efi.lds -shared -Bsymbolic -nostdlib -znocombreloc -L /usr/lib /usr/lib/crt0-efi-aarch64.o --defsym=EFI_SUBSYSTEM=0xa src/boot/efi/disk.c.o src/boot/efi/graphics.c.o src/boot/efi/measure.c.o src/boot/efi/pe.c.o src/boot/efi/util.c.o src/boot/efi/boot.c.o src/boot/efi/console.c.o src/boot/efi/shim.c.o -lefi -lgnuefi /usr/lib/gcc/aarch64-linux-gnu/7/libgcc.a
ld: /usr/lib/crt0-efi-aarch64.o: relocation R_AARCH64_ABS16 against `EFI_SUBSYSTEM' can not be used when making a shared object

complete build log at
https://launchpadlibrarian.net/355386549/buildlog_ubuntu-bionic-arm64.systemd_237-1ubuntu1_BUILDING.txt.gz


Related to:

2017-12-13  Renlin Li  <renlin.li@arm.com>

        * elfnn-aarch64.c (elfNN_aarch64_check_relocs): Disallow
        BFD_RELOC_AARCH64_16 in shared object const section. Disallow
        BFD_RELOC_AARCH64_32 in shared object const section under LP64.
Comment 7 Andrew Pinski 2018-02-02 16:56:10 UTC
(In reply to Matthias Klose from comment #6)
> systemd on aarch64 configured with efi support fails with a similar
> relocation error:
> 
> ld -o src/boot/efi/systemd_boot.so -T /usr/lib/elf_aarch64_efi.lds -shared
> -Bsymbolic -nostdlib -znocombreloc -L /usr/lib /usr/lib/crt0-efi-aarch64.o
> --defsym=EFI_SUBSYSTEM=0xa src/boot/efi/disk.c.o src/boot/efi/graphics.c.o
> src/boot/efi/measure.c.o src/boot/efi/pe.c.o src/boot/efi/util.c.o
> src/boot/efi/boot.c.o src/boot/efi/console.c.o src/boot/efi/shim.c.o -lefi
> -lgnuefi /usr/lib/gcc/aarch64-linux-gnu/7/libgcc.a
> ld: /usr/lib/crt0-efi-aarch64.o: relocation R_AARCH64_ABS16 against
> `EFI_SUBSYSTEM' can not be used when making a shared object

This is a bug in either gnu-efi or systemd.  EFI_SUBSYSTEM is in the PE/COFF header, so we don't want any relocation there :).  Basically, EFI_SUBSYSTEM is not being defined.  Note that U-Boot has a similar bug too.
Comment 8 Ard Biesheuvel 2018-02-02 16:59:55 UTC
(In reply to Andrew Pinski from comment #7)
> (In reply to Matthias Klose from comment #6)
> > systemd on aarch64 configured with efi support fails with a similar
> > relocation error:
> > 
> > ld -o src/boot/efi/systemd_boot.so -T /usr/lib/elf_aarch64_efi.lds -shared
> > -Bsymbolic -nostdlib -znocombreloc -L /usr/lib /usr/lib/crt0-efi-aarch64.o
> > --defsym=EFI_SUBSYSTEM=0xa src/boot/efi/disk.c.o src/boot/efi/graphics.c.o
> > src/boot/efi/measure.c.o src/boot/efi/pe.c.o src/boot/efi/util.c.o
> > src/boot/efi/boot.c.o src/boot/efi/console.c.o src/boot/efi/shim.c.o -lefi
> > -lgnuefi /usr/lib/gcc/aarch64-linux-gnu/7/libgcc.a
> > ld: /usr/lib/crt0-efi-aarch64.o: relocation R_AARCH64_ABS16 against
> > `EFI_SUBSYSTEM' can not be used when making a shared object
> 
> This is a bug in either gnu-efi or systemd.  EFI_SUBSYSTEM is in the
> PE/COFF header, so we don't want any relocation there :).  Basically,
> EFI_SUBSYSTEM is not being defined.  Note that U-Boot has a similar bug too.

The PE/COFF header is part of the static GNU-EFI library, and uses a static relocation to populate the EFI subsystem field when incorporated into an EFI executable. The ELF spec allows this, so if there is a bug here, it is in ld.bfd, not in GNU-EFI, systemd or U-Boot.
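
As a rough illustration of what resolving such a relocation statically amounts to, the Python sketch below writes the `EFI_SUBSYSTEM` value from comment 6 into a 16-bit little-endian field of a byte buffer. The buffer size, the field offset `SUBSYSTEM_OFFSET` and the helper `resolve_abs16` are all hypothetical; only the value 0xa comes from the `--defsym` option in the failing command line.

```python
import struct

EFI_SUBSYSTEM = 0xA      # value given via --defsym in comment 6
SUBSYSTEM_OFFSET = 0x5C  # hypothetical offset of the 16-bit field


def resolve_abs16(image, offset, value):
    """Store value in a 16-bit little-endian field, as a static link would.

    struct.pack_into raises struct.error if value does not fit in 16 bits.
    """
    struct.pack_into("<H", image, offset, value)


header = bytearray(0x80)  # stand-in for the header bytes, all zero
resolve_abs16(header, SUBSYSTEM_OFFSET, EFI_SUBSYSTEM)
print(header[SUBSYSTEM_OFFSET:SUBSYSTEM_OFFSET + 2].hex())
```

The stored bytes are fixed once the link is done, so no run-time relocation is needed for the field.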
Comment 9 Renlin Li 2018-02-02 19:37:53 UTC
(In reply to Peter Smith from comment #5)
> I think that the new error message for R_AARCH64_ABS32 from the linker makes
> some sense if the destination symbol is section relative as there is no
> dynamic relocation supported and truncating a 64-bit address is most likely
> a mistake.
> 
> However if the destination symbol is absolute the linker shouldn't make the
> assumption that the symbol is an address so it should resolve the relocation
> at static link-time.
> 
> I think the test:
> 	case BFD_RELOC_AARCH64_16:
> #if ARCH_SIZE == 64
> 	case BFD_RELOC_AARCH64_32:
> #endif
> 	  if (bfd_link_pic (info)
> 	      && (sec->flags & SEC_ALLOC) != 0
> 	      && (sec->flags & SEC_READONLY) != 0)
>             ... Give error message
> Should check that the symbol is not absolute as well.
> 
> Unfortunately I can't think of a workaround for the case where the value of
> the symbols has to be in the RO-segment. For some reason the check only
> applies in RO sections, which does not make a lot of sense to me as a
> R_AARCH64_ABS32 from a RW section to an address will truncate it in the same
> way as if it were from a RO section. No dynamic relocation is generated for
> either RO or RW so I don't know why the distinction has been made.

Indeed, for an absolute symbol, the assumption that it represents an address is not correct.
A check should be added to allow absolute symbols with R_AARCH64_ABS relocations.

The condition here applies the constraint only in allocatable text or read-only data sections, which I thought were more likely to be a place to store a fixed address.

I will prepare a patch, trying to fix the absolute symbol case.
Comment 10 Ard Biesheuvel 2018-02-02 22:40:38 UTC
(In reply to Renlin Li from comment #9)
> (In reply to Peter Smith from comment #5)
> > I think that the new error message for R_AARCH64_ABS32 from the linker makes
> > some sense if the destination symbol is section relative as there is no
> > dynamic relocation supported and truncating a 64-bit address is most likely
> > a mistake.
> > 
> > However if the destination symbol is absolute the linker shouldn't make the
> > assumption that the symbol is an address so it should resolve the relocation
> > at static link-time.
> > 
> > I think the test:
> > 	case BFD_RELOC_AARCH64_16:
> > #if ARCH_SIZE == 64
> > 	case BFD_RELOC_AARCH64_32:
> > #endif
> > 	  if (bfd_link_pic (info)
> > 	      && (sec->flags & SEC_ALLOC) != 0
> > 	      && (sec->flags & SEC_READONLY) != 0)
> >             ... Give error message
> > Should check that the symbol is not absolute as well.
> > 
> > Unfortunately I can't think of a workaround for the case where the value of
> > the symbols has to be in the RO-segment. For some reason the check only
> > applies in RO sections, which does not make a lot of sense to me as a
> > R_AARCH64_ABS32 from a RW section to an address will truncate it in the same
> > way as if it were from a RO section. No dynamic relocation is generated for
> > either RO or RW so I don't know why the distinction has been made.
> 
> Indeed, for an absolute symbol, the assumption that it represents an address
> is not correct.
> A check should be added to allow absolute symbols with R_AARCH64_ABS
> relocations.
> 
> The condition here applies the constraint only in allocatable text or
> read-only data sections, which I thought were more likely to be a place to
> store a fixed address.
> 
> I will prepare a patch, trying to fix the absolute symbol case.

Thank you Renlin. May I kindly suggest that you also look at the other issue, which is related?

https://sourceware.org/bugzilla/show_bug.cgi?id=20402

In that case, a runtime relocation is emitted even for SHN_ABS symbols, which means the resulting value becomes dependent on the load address.
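
A minimal Python sketch of why that matters, assuming the run-time relocation adds the load offset to the link-time value the way a relative relocation does. The function names and the load offset are hypothetical; the constant is `__rela_size` from comment 1.

```python
MASK64 = (1 << 64) - 1


def static_resolve(value):
    """Value stored when the relocation is resolved at static link time."""
    return value & MASK64


def runtime_relative(value, load_bias):
    """Value after a run-time relocation that adds the load offset."""
    return (value + load_bias) & MASK64


size = 0x2E2AB0  # __rela_size, a size rather than an address
for bias in (0x0, 0x10000000):
    print(hex(static_resolve(size)), hex(runtime_relative(size, bias)))
```

With a non-zero load offset the two results differ, so a size or flag value would be silently corrupted.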
Comment 11 Arnd Bergmann 2018-02-05 11:16:36 UTC
I did some more testing with binutils-2.30 on arm64 randconfig kernels. One major issue I came across was CONFIG_MODVERSIONS, which causes an error for each exported symbol.

aarch64-linux-ld: init/main.o: relocation R_AARCH64_ABS32 against `__crc_system_state' can not be used when making a shared object
aarch64-linux-ld: init/version.o: relocation R_AARCH64_ABS32 against `__crc_init_uts_ns' can not be used when making a shared object
aarch64-linux-ld: init/do_mounts.o: relocation R_AARCH64_ABS32 against `__crc_name_to_dev_t' can not be used when making a shared object
aarch64-linux-ld: init/init_task.o: relocation R_AARCH64_ABS32 against `__crc_init_task' can not be used when making a shared object
aarch64-linux-ld: arch/arm64/kernel/fpsimd.o: relocation R_AARCH64_ABS32 against `__crc_kernel_neon_busy' can not be used when making a shared object
aarch64-linux-ld: arch/arm64/kernel/process.o: relocation R_AARCH64_ABS32 against `__crc___stack_chk_guard' can not be used when making a shared object
aarch64-linux-ld: arch/arm64/kernel/stacktrace.o: relocation R_AARCH64_ABS32 against `__crc_save_stack_trace_tsk' can not be used when making a shared object


While changing one of the .S files, I also ran into a linker crash:

 aarch64-linux-ld: arch/arm64/kernel/head.o: relocation R_AARCH64_ABS32 against `__rela_offset' can not be used when making a shared object
11:39 PM cross-gcc/bin/aarch64-linux-ld: BFD (GNU Binutils) 2.30 internal error, aborting at /home/arnd/git/binutils/bfd/elfnn-aarch64.c:5279 in elf64_aarch64_final_link_relocate
11:40 PM aarch64-linux-ld: Please report this bug.

This evidently was caused by the same two instructions that came up earlier:

       ldr     w9, =__rela_offset              // offset to reloc table
       ldr     w10, =__rela_size               // size of reloc table

After commenting these out, I got no further crashes.
Comment 12 Renlin Li 2018-02-05 12:39:50 UTC
Hi all,

Sorry for the breakage. I have sent a patch to the mailing list to fix this issue:
https://sourceware.org/ml/binutils/2018-02/msg00039.html

Could you help to check whether it fixes the problem?
Thanks!
Comment 13 Renlin Li 2018-02-05 12:41:18 UTC
(In reply to Arnd Bergmann from comment #11)
> I did some more testing with binutils-2.30 on arm64 randconfig kernels. One
> major issue I came across was CONFIG_MODVERSIONS, which causes an error for
> each exported symbol.
> 
...
> While changing one of the .S files, I also ran into a linker crash:
> 
>  aarch64-linux-ld: arch/arm64/kernel/head.o: relocation R_AARCH64_ABS32
> against `__rela_offset' can not be used when making a shared object
> 11:39 PM cross-gcc/bin/aarch64-linux-ld: BFD (GNU Binutils) 2.30 internal
> error, aborting at /home/arnd/git/binutils/bfd/elfnn-aarch64.c:5279 in
> elf64_aarch64_final_link_relocate
> 11:40 PM aarch64-linux-ld: Please report this bug.
> 
> This evidently was caused by the same two instructions that came up earlier:
> 
>        ldr     w9, =__rela_offset              // offset to reloc table
>        ldr     w10, =__rela_size               // size of reloc table
> 
> After commenting these out, I got no further crashes.

Hi Arnd,

I could not reproduce the linker crash you described here.
Do you have a minimal test case for this issue?

Thanks!
Comment 14 cvs-commit@gcc.gnu.org 2018-02-05 18:23:14 UTC
The master branch has been updated by Renlin Li <renlin@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=279b2f94168ee91e02ccd070d27c983fc001fe12

commit 279b2f94168ee91e02ccd070d27c983fc001fe12
Author: Renlin Li <renlin.li@arm.com>
Date:   Sat Feb 3 13:18:17 2018 +0000

    [PR22764][LD][AARCH64]Allow R_AARCH64_ABS16 and R_AARCH64_ABS32 against absolution symbol or undefine symbol in shared object.
    
    The assumption that R_AARCH64_ABS16 and R_AARCH64_ABS32 relocation in LP64 abi
    will be used to generate an address does not hold for absolute symbol.
    In this case, it is a value fixed at static linking time.
    
    The condition to check the relocations is relax to allow absolute symbol and
    undefined symbol case.
    
    bfd/
    
    2018-02-05  Renlin Li  <renlin.li@arm.com>
    
    	PR ld/22764
    	* elfnn-aarch64.c (elfNN_aarch64_check_relocs): Relax the
    	R_AARCH64_ABS32 and R_AARCH64_ABS16 for absolute symbol. Apply the
    	check for writeable section as well.
    
    ld/
    
    2018-02-05  Renlin Li  <renlin.li@arm.com>
    
    	PR ld/22764
    	* testsuite/ld-aarch64/emit-relocs-258.s: Define symbol as an address.
    	* testsuite/ld-aarch64/emit-relocs-259.s: Likewise.
    	* testsuite/ld-aarch64/aarch64-elf.exp: Run new test.
    	* testsuite/ld-aarch64/pr22764.s: New.
    	* testsuite/ld-aarch64/pr22764.d: New.
Comment 15 cvs-commit@gcc.gnu.org 2018-02-05 18:34:02 UTC
The binutils-2_30-branch branch has been updated by Renlin Li <renlin@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b01452b1d44a586f4ecf5cf02ffc0643e4962324

commit b01452b1d44a586f4ecf5cf02ffc0643e4962324
Author: Renlin Li <renlin.li@arm.com>
Date:   Sat Feb 3 13:18:17 2018 +0000

    [PR22764][LD][AARCH64]Allow R_AARCH64_ABS16 and R_AARCH64_ABS32 against absolution symbol or undefine symbol in shared object.
    
    backport from mainline
    
    bfd/
    
    2018-02-05  Renlin Li  <renlin.li@arm.com>
    
    	PR ld/22764
    	* elfnn-aarch64.c (elfNN_aarch64_check_relocs): Relax the
    	R_AARCH64_ABS32 and R_AARCH64_ABS16 for absolute symbol. Apply the
    	check for writeable section as well.
    
    ld/
    
    2018-02-05  Renlin Li  <renlin.li@arm.com>
    
    	PR ld/22764
    	* testsuite/ld-aarch64/emit-relocs-258.s: Define symbol as an address.
    	* testsuite/ld-aarch64/emit-relocs-259.s: Likewise.
    	* testsuite/ld-aarch64/aarch64-elf.exp: Run new test.
    	* testsuite/ld-aarch64/pr22764.s: New.
    	* testsuite/ld-aarch64/pr22764.d: New.