Bug 19807 - [2.26 regression] R_386_GOT32X optimization breaks linux kernel
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.27
Importance: P2 critical
Target Milestone: 2.27
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 19827
Blocks:
Reported: 2016-03-11 08:09 UTC by Fabian Vogt
Modified: 2016-03-17 03:02 UTC

Attachments
A kernel patch (1013 bytes, patch)
2016-03-11 21:35 UTC, H.J. Lu
An updated patch (1.11 KB, patch)
2016-03-11 21:38 UTC, H.J. Lu
An updated patch to pass --no-dynamic-linker to linker (1.12 KB, patch)
2016-03-11 22:11 UTC, H.J. Lu

Description Fabian Vogt 2016-03-11 08:09:39 UTC
Building the i386 Linux kernel with compression (tried the latest 4.5 rc7, but all versions are broken) causes booting to fail during the decompression stage:
"Failed to allocate space for phdrs".

The kernel decompressor stage is linked at a fixed address but runs at a different location.
This is normally not a problem, because it relocates itself by adding the difference to the GOT entries.
However, the optimization "mov $GOTOFF(%ecx), %eax" (R_386_GOT32X) -> "lea $address, %eax" (R_386_32) breaks this,
because the GOT is no longer referenced in the final ELF.
As a result, some global variables, such as free_mem_ptr_end, are overwritten during decompression, causing odd errors such as
malloc returning NULL.
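
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel's actual startup code: the names (LINK_BASE, LOAD_BASE, SYM_OFFSET, relocate_got) and every address are hypothetical, picked to show why an address stored in the instruction itself escapes a fix-up that only walks the GOT.

```python
# Toy model of a self-relocating image. All names and addresses are
# hypothetical; this is not the logic of the kernel's head_32.S.

LINK_BASE = 0x0100_0000   # address the image was linked at (assumed)
LOAD_BASE = 0x0240_0000   # address the image actually runs at (assumed)
SYM_OFFSET = 0x5000       # offset of a global such as free_mem_ptr_end (assumed)


def relocate_got(got, delta):
    """Add the load delta to every GOT entry, as the report describes."""
    return [(entry + delta) & 0xFFFFFFFF for entry in got]


delta = LOAD_BASE - LINK_BASE
link_time_addr = LINK_BASE + SYM_OFFSET

# Unrelaxed code: the instruction loads the address from a GOT slot, so the
# fix-up above makes it point into the running copy of the image.
got = relocate_got([link_time_addr], delta)
addr_via_got = got[0]

# Relaxed code: the load was rewritten to carry the link-time address
# directly. Nothing patches the instruction at run time.
addr_via_immediate = link_time_addr

runtime_addr = LOAD_BASE + SYM_OFFSET
print(hex(addr_via_got), hex(addr_via_immediate), hex(runtime_addr))
```

In this model the GOT path lands on the run-time address, while the relaxed path is off by exactly the load delta, which would match the symptom of globals being read and written at the wrong location.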

Workaround: specify "--disable-x86-relax-relocations" configure option or "-mrelax-relocations=no" to as.

See also:
https://bugzilla.opensuse.org/show_bug.cgi?id=970239
https://bugzilla.redhat.com/show_bug.cgi?id=1302071
Comment 1 H.J. Lu 2016-03-11 12:55:53 UTC
The kernel build process should pass -pie to the linker so that the image
can be loaded at a different address.
Comment 2 Fabian Vogt 2016-03-11 12:59:20 UTC
I tried that but it crashed even before the decompressor.
I'll have a closer look at that and report back.
Comment 3 H.J. Lu 2016-03-11 21:35:17 UTC
Created attachment 9087
A kernel patch

Try this.
Comment 4 H.J. Lu 2016-03-11 21:38:31 UTC
Created attachment 9088
An updated patch
Comment 5 H.J. Lu 2016-03-11 22:11:39 UTC
Created attachment 9089
An updated patch to pass --no-dynamic-linker to linker
Comment 6 Fabian Vogt 2016-03-11 22:50:46 UTC
(In reply to H.J. Lu from comment #5)
> Created attachment 9089
> An updated patch to pass --no-dynamic-linker to linker

Applied and tested with 4.5rc7. Result: Boots fine!
Comment 7 H.J. Lu 2016-03-11 22:54:54 UTC
[hjl@gnu-6 ld-x86-64]$ cat pr19807a.s
	.globl  _start
	.type	_start, @function
_start:
	movq	$foo, %rax
	.size	_start, .-_start
[hjl@gnu-6 ld-x86-64]$ cat pr19807b.s
	.globl	foo
	.type	foo, @object
	foo = 0x145e000
[hjl@gnu-6 ld-x86-64]$ 
...
[hjl@gnu-6 ld]$ ./ld-new -pie pr19807a.o pr19807b.o
./ld-new: pr19807a.o: relocation R_X86_64_32S against `foo' can not be used when making a shared object; recompile with -fPIC
pr19807a.o: error adding symbols: Bad value
[hjl@gnu-6 ld]$
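
For context on the relocation named in the error: R_X86_64_32S fills a 32-bit field that the CPU sign-extends to 64 bits, so the final value has to fit in that range. The sketch below models only that range constraint; it is not ld's implementation, and the load-bias values are made-up examples of an address the linker cannot know when producing a position-independent image.

```python
# Range constraint behind a 32-bit sign-extended field such as the one
# R_X86_64_32S fills in. The load-bias values are made-up examples.

def fits_signed_32(value):
    """True if value is representable as a sign-extended 32-bit immediate."""
    return -(1 << 31) <= value < (1 << 31)


FOO = 0x145E000  # value of `foo` in pr19807b.s

# At its link-time value, foo fits comfortably.
print(fits_signed_32(FOO))

# A position-independent image may be loaded at a bias the linker cannot
# know in advance; a small bias still fits, a large one does not.
small_bias = 0x0100_0000
large_bias = 0x1_0000_0000
print(fits_signed_32(FOO + small_bias), fits_signed_32(FOO + large_bias))
```

Whether a given image can ever be loaded at an out-of-range address is something only the person building it knows, which is the situation the linker cannot check at link time.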
Comment 8 H.J. Lu 2016-03-11 23:29:52 UTC
I will fix binutils to support -pie in 64-bit mode.
Comment 9 Fabian Vogt 2016-03-14 10:52:10 UTC
I just tested your patch again on a clean kernel tree; it seems that my previous test was somehow wrong.

Passing "-pie" as a flag to ld causes

         leal    (_bss-4)(%ebp), %esi
         leal    (_bss-4)(%ebx), %edi

in arch/x86/boot/compressed/head_32.S to be

         leal    -0x4(%ebp), %esi
         leal    -0x4(%ebx), %edi

instead of the expected

         leal    0x59a27c(%ebp),%esi
         leal    0x59a27c(%ebx),%edi

so the kernel does not copy itself correctly.
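
The displacements above can be checked with a little arithmetic. This is a sketch, assuming the value of _bss can be inferred from the expected disassembly (it is not read from the actual vmlinux); the helper names disp32 and as_signed are hypothetical. The -0x4 displacement is consistent with _bss contributing 0 to the instruction field, though why that happens is not established here.

```python
# Displacement arithmetic for `leal (_bss-4)(%ebp), %esi`.
# The _bss value is inferred from the expected disassembly, not taken
# from a real vmlinux.

def disp32(symbol_value, addend):
    """Return the 32-bit displacement field for symbol_value + addend."""
    return (symbol_value + addend) & 0xFFFFFFFF


def as_signed(field):
    """Interpret a 32-bit field as a signed displacement."""
    return field - (1 << 32) if field & 0x80000000 else field


EXPECTED_DISP = 0x59A27C
bss = EXPECTED_DISP + 4          # implies _bss == 0x59a280

good = disp32(bss, -4)           # matches the expected output
bad = disp32(0, -4)              # what results if _bss contributes 0

print(hex(bss), hex(good), as_signed(bad))
```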
Comment 10 Fabian Vogt 2016-03-14 11:28:33 UTC
(In reply to Fabian Vogt from comment #9)
> I just tested your patch again on a clean kernel tree, it seems that my
> previous test was somehow wrong.

Actually, I just tried with binutils master instead of 2.26 again and it works.
Is there anything that can be used to get this working with 2.26?
Comment 11 H.J. Lu 2016-03-14 11:48:14 UTC
(In reply to Fabian Vogt from comment #10)
> (In reply to Fabian Vogt from comment #9)
> > I just tested your patch again on a clean kernel tree, it seems that my
> > previous test was somehow wrong.
> 
> Actually, I just tried with binutils master instead of 2.26 again and it
> works.
> Is there anything that can be used to get this working with 2.26?

Did you mean binutils 2.26 branch or 2.26 release?
Comment 12 Fabian Vogt 2016-03-14 12:07:35 UTC
(In reply to H.J. Lu from comment #11)
> (In reply to Fabian Vogt from comment #10)
> > (In reply to Fabian Vogt from comment #9)
> > > I just tested your patch again on a clean kernel tree, it seems that my
> > > previous test was somehow wrong.
> > 
> > Actually, I just tried with binutils master instead of 2.26 again and it
> > works.
> > Is there anything that can be used to get this working with 2.26?
> 
> Did you mean binutils 2.26 branch or 2.26 release?

$ ld -v
GNU ld (GNU Binutils; openSUSE Factory) 2.26.0.20160229-1

which is the 2.26 branch up to 4eb4e2ad76eb4bf5a2a2c20b2a2ec382fdbb3e2f.
Comment 13 H.J. Lu 2016-03-14 13:14:00 UTC
(In reply to Fabian Vogt from comment #12)
> > ld -v
> GNU ld (GNU Binutils; openSUSE Factory) 2.26.0.20160229-1
> 
> which is the 2.26 branch up to 4eb4e2ad76eb4bf5a2a2c20b2a2ec382fdbb3e2f.

Please try the current binutils-2_26-branch branch. If it doesn't work,
please show me the difference in vmlinux between master and 2.26 branch.
Comment 14 Fabian Vogt 2016-03-14 13:53:22 UTC
I could reproduce the issue with

    GNU ld (GNU Binutils) 2.26.0.20160314

as well. For the difference, see comment 9.
Comment 15 H.J. Lu 2016-03-15 01:11:31 UTC
(In reply to Fabian Vogt from comment #14)
> I could reproduce the issue with
> 
>     GNU ld (GNU Binutils) 2.26.0.20160314
> 
> as well. For the difference, see comment 9.

Please try users/hjl/dynamic/binutils-2_26-branch branch.
Comment 16 Fabian Vogt 2016-03-15 08:05:22 UTC
(In reply to H.J. Lu from comment #15)
> (In reply to Fabian Vogt from comment #14)
> > I could reproduce the issue with
> > 
> >     GNU ld (GNU Binutils) 2.26.0.20160314
> > 
> > as well. For the difference, see comment 9.
> 
> Please try users/hjl/dynamic/binutils-2_26-branch branch.

Works fine.
Comment 17 H.J. Lu 2016-03-15 15:20:08 UTC
(In reply to Fabian Vogt from comment #9)
> I just tested your patch again on a clean kernel tree, it seems that my
> previous test was somehow wrong.
> 
> Using "-pie" as flag to LD causes
> 
>          leal    (_bss-4)(%ebp), %esi
>          leal    (_bss-4)(%ebx), %edi
> 
> in arch/x86/boot/compressed/head_32.S to be
> 
>          leal    -0x4(%ebp), %esi
>          leal    -0x4(%ebx), %edi
> 
> instead of the expected
> 
>          leal    0x59a27c(%ebp),%esi
>          leal    0x59a27c(%ebx),%edi
> 
> so the kernel does not copy itself correctly.

I opened:

https://sourceware.org/bugzilla/show_bug.cgi?id=19827
Comment 18 Sourceware Commits 2016-03-15 18:09:13 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4c10bbaa0912742322f10d9d5bb630ba4e15dfa7

commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 15 11:07:06 2016 -0700

    Add -z noreloc-overflow option to x86-64 ld
    
    Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
    disable relocation overflow check.  This can be used to avoid relocation
    overflow check if there will be no dynamic relocation overflow at
    run-time.
    
    bfd/
    
    	PR ld/19807
    	* elf64-x86-64.c (elf_x86_64_relocate_section): Check
    	no_reloc_overflow_check to disable R_X86_64_32/R_X86_64_32S
    	relocation overflow check.
    
    include/
    
    	PR ld/19807
    	* bfdlink.h (bfd_link_info): Add no_reloc_overflow_check.
    
    ld/
    
    	PR ld/19807
    	* Makefile.am (ELF_X86_DEPS): Add
    	$(srcdir)/emulparams/reloc_overflow.sh.
    	* Makefile.in: Regenerated.
    	* NEWS: Mention -z noreloc-overflow.
    	* ld.texinfo: Document -z noreloc-overflow.
    	* emulparams/elf32_x86_64.sh: Source
    	${srcdir}/emulparams/reloc_overflow.sh.
    	* emulparams/elf_x86_64.sh: Likewise.
    	* emulparams/reloc_overflow.sh: New file.
    	* testsuite/ld-x86-64/pr19807-1.s: New file.
    	* testsuite/ld-x86-64/pr19807-1a.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-1b.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-2.s: Likewise.
    	* testsuite/ld-x86-64/pr19807-2a.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-2b.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-2c.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-2d.d: Likewise.
    	* testsuite/ld-x86-64/pr19807-2e.d: Likewise.
    	* testsuite/ld-x86-64/x86-64.exp: Run PR ld/19807 tests.
Comment 19 H.J. Lu 2016-03-15 19:16:24 UTC
(In reply to Fabian Vogt from comment #16)
> (In reply to H.J. Lu from comment #15)
> > (In reply to Fabian Vogt from comment #14)
> > > I could reproduce the issue with
> > > 
> > >     GNU ld (GNU Binutils) 2.26.0.20160314
> > > 
> > > as well. For the difference, see comment 9.
> > 
> > Please try users/hjl/dynamic/binutils-2_26-branch branch.
> 
> Works fine.

Please try users/hjl/pr19827/binutils-2_26-branch branch.

I opened a kernel bug:

https://bugzilla.kernel.org/show_bug.cgi?id=114671

for the 32-bit x86 kernel in PIE.  The 64-bit x86-64 kernel in PIE is fixed by
the -z noreloc-overflow linker option.
Comment 20 Richard Biener 2016-03-16 08:17:48 UTC
Hum, so we leave 2.26 broken?
Comment 21 Fabian Vogt 2016-03-16 08:36:22 UTC
(In reply to Richard Guenther from comment #20)
> Hum, so we leave 2.26 broken?

The 2.26 release is not broken, "only" the binutils-2_26-branch.

Reopening to request clarification whether and how that can be solved.
Comment 22 rguenther 2016-03-16 09:07:03 UTC
On Wed, 16 Mar 2016, fvogt at suse dot com wrote:

> https://sourceware.org/bugzilla/show_bug.cgi?id=19807
> 
> Fabian Vogt <fvogt at suse dot com> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|RESOLVED                    |REOPENED
>          Resolution|FIXED                       |---
> 
> --- Comment #21 from Fabian Vogt <fvogt at suse dot com> ---
> (In reply to Richard Guenther from comment #20)
> > Hum, so we leave 2.26 broken?
> 
> The 2.26 release is not broken, "only" the binutils-2_26-branch.

True.

> Reopening to request clarification whether and how that can be solved.

One possibility is to revert the backport of the new relocation support.
Comment 23 H.J. Lu 2016-03-16 11:46:19 UTC
(In reply to Richard Guenther from comment #20)
> Hum, so we leave 2.26 broken?

Please try users/hjl/pr19827/binutils-2_26-branch branch.
Comment 24 Fabian Vogt 2016-03-16 13:13:29 UTC
(In reply to H.J. Lu from comment #23)
> (In reply to Richard Guenther from comment #20)
> > Hum, so we leave 2.26 broken?
> 
> Please try users/hjl/pr19827/binutils-2_26-branch branch.

Tested and boots.
Comment 25 H.J. Lu 2016-03-16 13:37:42 UTC
(In reply to Fabian Vogt from comment #24)
> (In reply to H.J. Lu from comment #23)
> > (In reply to Richard Guenther from comment #20)
> > > Hum, so we leave 2.26 broken?
> > 
> > Please try users/hjl/pr19827/binutils-2_26-branch branch.
> 
> Tested and boots.

I will backport it to 2.26 branch in the next few days.
Comment 26 H.J. Lu 2016-03-17 03:01:41 UTC
I uploaded a new kernel patch which should work with all linkers:

https://bugzilla.kernel.org/show_bug.cgi?id=114671