Bug 21491 - --fix-cortex-a53-843419 Errata workaround can produce broken images
Summary: --fix-cortex-a53-843419 Errata workaround can produce broken images
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gold (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Han Shen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-05-15 10:48 UTC by Peter Smith
Modified: 2017-07-12 14:34 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2017-05-15 00:00:00


Attachments
Test case to reproduce the --ffix-cortex-843419 missing immediate (1.24 KB, application/zip)
2017-05-15 10:48 UTC, Peter Smith
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Smith 2017-05-15 10:48:02 UTC
Created attachment 10051 [details]
Test case to reproduce the --ffix-cortex-843419 missing immediate

The --fix-cortex-a53-843419 errata workaround extracts a ldr or str instruction of the form ld/st xt, [xn, #uimm] into an erratum stub. In certain circumstances the erratum stub extracts the instruction before it has been relocated leading to a ldr/str instruction in the stub with a 0 immediate, whereas if the patch were not applied the immediate would be non-0. This is likely to incorrect program, in our case a segfault at runtime.

The root cause of the problem is when the erratum stubs for an output section are assigned to an input section from an object that is relocated before the object that contains the input section that needs the errata fix. In this case the stub table is relocated when the "owning" section is relocated, which can be before the section needing the patch is relocated (which updates the stub table) so the stub table contains the unrelocated instruction bit pattern.

This is most likely (and how we encountered it) with LTO as the ELF object that comes back from the plugin is added last and has its relocations processed last. If the .text section from the LTO object needs a patch and the stub table "owner" is not from that LTO object then we observe the problem.

I've attached a modification of the ld.bfd test erratum843419.s (add an immediate to extracted instruction) that can trigger this problem with a somewhat contrived linker script. Check the disassembly with the fix applied against the patch without the fix applied
Original no fix:
 2001004:	f9008008 	str	x8, [x0,#256]
With fix:
 2001004:	14000008 	b	2001024 <fn+0x4>
...
0000000002001020 <fn>:
 2001020:	d65f03c0 	ret
 2001024:	f9000008 	str	x8, [x0]
 2001028:	17fffff8 	b	2001008 <e843419_1+0x10>

Note that the #256 has been lost.

I think that this is important as the android NDK enables the --fix-cortex-a53-843419 by default so any use of LTO within android is at risk.

I suspect that this may be the root cause of https://sourceware.org/bugzilla/show_bug.cgi?id=21062   

A possible workaround for LTO builds is to use a linker script and put the .text section from LTO into its own output section to make sure the owner of the stub table is the LTO object.
Comment 1 Sourceware Commits 2017-07-11 18:19:52 UTC
The master branch has been updated by Han Shen <shenhan@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=df2f63a6a0fc19c54e58aab8afe262baf3cb1a3c

commit df2f63a6a0fc19c54e58aab8afe262baf3cb1a3c
Author: Han Shen <shenhan@google.com>
Date:   Mon Jul 10 15:23:05 2017 -0700

    Fixing for PR gold/21491 - Errata workaround can produce broken images.
    
    The problem is caused by the fact that gold is relocating the stubs
    for an entire output section when it processes the relocations for a
    particular input section that happened to be designated as the stub
    table "owner". The Relocate_task for that input section may or may not
    run before the Relocate_task for another input section that contains
    the code that needs the erratum fix, but doesn't "own" the stub
    table. If it runs before (or might even race with) that other task, it
    ends up with a copy of the unrelocated original instruction.
    
    In other words - when calling fix_errata() from
    do_relocate_sections(), gold is going through the list of errata stubs
    that are associated only with that object. This routine updates the
    stored original instruction and replaces it in the output view with a
    branch to the stub. Later, as gold is going through the object file's
    input sections, it then checks for stub tables "owned" by each input
    section, and writes out all the stubs from that stub table, regardless
    of what object file each stub is associated with.
    
    Fixed by relocating the erratum stub only after the corresponding
    errata spot is fixed. That is to have fix_errata() call
    Stub_table::relocate_erratum_stub() for each stub.
    
    gold/ChangeLog
    2017-07-06  Han Shen  <shenhan@google.com>
    
    	PR gold/21491
    
    	* aarch64.cc (Erratum_stub::invalidate_erratum_stub): New method.
    	(Erratum_stub::is_invalidated_erratum_stub): New method.
    	(Stub_table::relocate_reloc_stub): Renamed from "relocate_stub".
    	(Stub_table::relocate_reloc_stubs): Renamed from "relocate_stubs".
    	(Stub_table::relocate_erratum_stub): New method.
    	(AArch64_relobj::fix_errata_and_relocate_erratum_stubs): Renamed from
    	"fix_errata".
    	(Target_aarch64::relocate_reloc_stub): Renamed from "relocate_stub".
Comment 2 Cary Coutant 2017-07-11 19:05:32 UTC
Fixed on trunk. Does this need to be backported?
Comment 3 Han Shen 2017-07-11 19:45:35 UTC
(In reply to Cary Coutant from comment #2)
> Fixed on trunk. Does this need to be backported?

Yes, considering the severity of the bug.
Comment 4 Sourceware Commits 2017-07-12 14:29:12 UTC
The binutils-2_29-branch branch has been updated by Cary Coutant <ccoutant@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d3dc405d0d3e106ef75dd126b9c03b463af6f51b

commit d3dc405d0d3e106ef75dd126b9c03b463af6f51b
Author: Han Shen <shenhan@google.com>
Date:   Mon Jul 10 15:23:05 2017 -0700

    Fixing for PR gold/21491 - Errata workaround can produce broken images.
    
    The problem is caused by the fact that gold is relocating the stubs
    for an entire output section when it processes the relocations for a
    particular input section that happened to be designated as the stub
    table "owner". The Relocate_task for that input section may or may not
    run before the Relocate_task for another input section that contains
    the code that needs the erratum fix, but doesn't "own" the stub
    table. If it runs before (or might even race with) that other task, it
    ends up with a copy of the unrelocated original instruction.
    
    In other words - when calling fix_errata() from
    do_relocate_sections(), gold is going through the list of errata stubs
    that are associated only with that object. This routine updates the
    stored original instruction and replaces it in the output view with a
    branch to the stub. Later, as gold is going through the object file's
    input sections, it then checks for stub tables "owned" by each input
    section, and writes out all the stubs from that stub table, regardless
    of what object file each stub is associated with.
    
    Fixed by relocating the erratum stub only after the corresponding
    errata spot is fixed. That is to have fix_errata() call
    Stub_table::relocate_erratum_stub() for each stub.
    
    gold/ChangeLog
    2017-07-06  Han Shen  <shenhan@google.com>
    
    	PR gold/21491
    
    	* aarch64.cc (Erratum_stub::invalidate_erratum_stub): New method.
    	(Erratum_stub::is_invalidated_erratum_stub): New method.
    	(Stub_table::relocate_reloc_stub): Renamed from "relocate_stub".
    	(Stub_table::relocate_reloc_stubs): Renamed from "relocate_stubs".
    	(Stub_table::relocate_erratum_stub): New method.
    	(AArch64_relobj::fix_errata_and_relocate_erratum_stubs): Renamed from
    	"fix_errata".
    	(Target_aarch64::relocate_reloc_stub): Renamed from "relocate_stub".
Comment 5 Sourceware Commits 2017-07-12 14:34:31 UTC
The binutils-2_28-branch branch has been updated by Cary Coutant <ccoutant@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=e70c0b5f07daf63474e805f8d0a45152fcd60468

commit e70c0b5f07daf63474e805f8d0a45152fcd60468
Author: Han Shen <shenhan@google.com>
Date:   Mon Jul 10 15:23:05 2017 -0700

    Fixing for PR gold/21491 - Errata workaround can produce broken images.
    
    The problem is caused by the fact that gold is relocating the stubs
    for an entire output section when it processes the relocations for a
    particular input section that happened to be designated as the stub
    table "owner". The Relocate_task for that input section may or may not
    run before the Relocate_task for another input section that contains
    the code that needs the erratum fix, but doesn't "own" the stub
    table. If it runs before (or might even race with) that other task, it
    ends up with a copy of the unrelocated original instruction.
    
    In other words - when calling fix_errata() from
    do_relocate_sections(), gold is going through the list of errata stubs
    that are associated only with that object. This routine updates the
    stored original instruction and replaces it in the output view with a
    branch to the stub. Later, as gold is going through the object file's
    input sections, it then checks for stub tables "owned" by each input
    section, and writes out all the stubs from that stub table, regardless
    of what object file each stub is associated with.
    
    Fixed by relocating the erratum stub only after the corresponding
    errata spot is fixed. That is to have fix_errata() call
    Stub_table::relocate_erratum_stub() for each stub.
    
    gold/ChangeLog
    2017-07-06  Han Shen  <shenhan@google.com>
    
    	PR gold/21491
    
    	* aarch64.cc (Erratum_stub::invalidate_erratum_stub): New method.
    	(Erratum_stub::is_invalidated_erratum_stub): New method.
    	(Stub_table::relocate_reloc_stub): Renamed from "relocate_stub".
    	(Stub_table::relocate_reloc_stubs): Renamed from "relocate_stubs".
    	(Stub_table::relocate_erratum_stub): New method.
    	(AArch64_relobj::fix_errata_and_relocate_erratum_stubs): Renamed from
    	"fix_errata".
    	(Target_aarch64::relocate_reloc_stub): Renamed from "relocate_stub".