This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
[COMMITTED] Fix alpha-elf relaxation
- From: Richard Henderson <rth at redhat dot com>
- To: Binutils <binutils at sourceware dot org>
- Date: Mon, 21 Apr 2014 08:57:04 -0700
- Subject: [COMMITTED] Fix alpha-elf relaxation
Relaxation has been broken for quite some time, primarily affecting large
programs. The symptom is that relaxation creates a bunch of GPREL16
relocations, which turn out to be out-of-range, and the linker errors out.
Most folks simply turn off relaxation at this point and move on. Indeed, I
believe the only remaining distribution supporting alpha (Gentoo) turns off
relaxation by default.
Alpha has a multi-got scheme where each input file is given a .got subsection
which can contain 64k of symbols. We merge .got subsections between object
files until they reach 64k, both to minimize redundancy in relocations and
to optimize calls between input files. If a caller and callee are close
enough, and share a .got subsection, then we can optimize to a direct branch,
which eliminates a load and indirect branch on the caller side and computation
of the gp on the callee side. It may also eliminate the slot in the .got
subsection, which in turn reduces the size of the got and may allow for more
subsection merging and thus more call optimization. Similarly, we also try
to replace loading an address from the .got with a direct displacement from the
gp register, which can also eliminate a slot and enable merging.
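To make the merging step concrete, here is a minimal sketch in Python. It is
not the linker's code: the function name, the sizes, and the treatment of the
64k limit as a plain byte count are all assumptions for illustration, and it
ignores any sharing of entries between files.

```python
GOT_LIMIT = 64 * 1024  # assumed capacity of one .got subsection, in bytes


def merge_got_subsections(sizes, limit=GOT_LIMIT):
    """Greedily merge adjacent per-file .got sizes while the sum fits."""
    groups = []
    for size in sizes:
        if groups and groups[-1] + size <= limit:
            groups[-1] += size      # fold this file into the current subsection
        else:
            groups.append(size)     # start a new subsection
    return groups


# Hypothetical per-file .got sizes before and after some slots are eliminated.
before = merge_got_subsections([40_000, 20_000, 10_000, 50_000])
after = merge_got_subsections([40_000, 20_000, 5_000, 50_000])
print(before, after)
```

In this toy input, shrinking the third file's .got lets it join the first
subsection, which is the "elimination enables more merging" effect described
above.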
This is all well and good, except when we try to do too many things at once.
Suppose we have two .got subsections at the end of the .got:
----+----------------------+--------------------+--------------------
... | large subsection m | small subsection n | data ... variable x
----+----------------------+--------------------+--------------------
Suppose subsection N has a reference to X. X is within 32k of the start of
subsection N, so we optimize the reference to a GPREL16 reloc. Suppose just
enough elimination is then done so that subsections M and N can be merged.
However, the combination of M+N is larger than N alone, so the displacement of
X from the start of M can exceed 32k, and the newly created relocation is then
out of range.
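A small worked example of the failure, as a sketch only. The sizes are made
up, and the model follows the simplified description above by measuring the
displacement from the start of the subsection; the only fixed fact is that
GPREL16 holds a signed 16-bit displacement.

```python
GPREL16_MIN, GPREL16_MAX = -0x8000, 0x7FFF  # signed 16-bit displacement


def fits_gprel16(target, base):
    """True if target is reachable from base with a 16-bit displacement."""
    return GPREL16_MIN <= target - base <= GPREL16_MAX


# Hypothetical layout: subsection M, then subsection N, then variable X.
size_m, size_n = 30_000, 4_000
start_m = 0
start_n = start_m + size_m
x = start_n + size_n + 1_000  # X sits 1000 bytes past the end of the .got

ok_before_merge = fits_gprel16(x, start_n)  # displacement 5000
ok_after_merge = fits_gprel16(x, start_m)   # displacement 35000
print(ok_before_merge, ok_after_merge)
```

The reloc is created while the base is the start of N, then the base moves to
the start of M when the two subsections merge, and the same reference no
longer fits.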
The solution is to split the relaxation into two passes. In the first pass,
eliminate everything we can that does not involve the creation of GPREL relocs.
This is primarily TLS and call relaxation. Since most functions are simply
called and not addressed directly (weak functions excepted), this works
well to eliminate those slots. In the second pass, we do all of the things that
would create GPREL relocs, but we disable .got subsection merging. That way, X
can only move closer to the start of N as we eliminate slots, eliminating the
case that caused displacement growth.
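The second-pass invariant can be illustrated with another toy model. Again
this is a sketch under assumptions, not the committed change: the helper
name, the sizes, and the 8-byte slot size are hypothetical.

```python
def displacement_from(index, sizes, gap):
    """Distance from the start of subsection `index` to a variable that
    sits `gap` bytes past the end of the last subsection."""
    return sum(sizes[index:]) + gap


sizes = [30_000, 4_000]  # hypothetical subsections M and N after pass 1
gap = 1_000              # X sits this far past the end of the .got
n = 1                    # index of subsection N

history = []
for _ in range(3):
    history.append(displacement_from(n, sizes, gap))
    sizes[n] -= 8        # pass 2 frees one slot in N; merging is disabled

print(history)
```

With merging disabled the base stays at the start of N, so each eliminated
slot can only shrink the displacement, and a GPREL16 reloc that fit when it
was created keeps fitting.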
Tested with a gcc bootstrap, which triggered the problem quite easily in
cc1plus and f951, and committed.
r~
PS: Relaxation of cc1plus:
Enabled:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 8] .rela.dyn         RELA             00000001200bb928  000bb928
       0000000000000750  0000000000000018   A       4     0     8
  [ 9] .rela.plt         RELA             00000001200bc078  000bc078
       0000000000001008  0000000000000018  AI       4    11     8
  [25] .got              PROGBITS         0000000121031a90  01021a90
       000000000000b0b0  0000000000000000  WA       0     0     8
Disabled:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 8] .rela.dyn         RELA             00000001200b8828  000b8828
       0000000000000990  0000000000000018   A       4     0     8
  [ 9] .rela.plt         RELA             00000001200b91b8  000b91b8
       0000000000001b18  0000000000000018  AI       4    11     8
  [23] .got              PROGBITS         0000000121018a48  01008a48
       0000000000022f80  0000000000000000  WA       0     0     8
a savings of 98k in the complete got section.
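For reference, the arithmetic behind that figure can be checked from the
section sizes listed above. The variable names are mine; the numbers are
copied from the two listings.

```python
# .got sizes with relaxation enabled and disabled.
got_enabled = 0x0B0B0
got_disabled = 0x22F80
got_saved = got_disabled - got_enabled  # bytes saved in .got

# Both relocation sections use 0x18-byte entries.
RELA_ENTSIZE = 0x18
rela_dyn_saved = (0x990 - 0x750) // RELA_ENTSIZE    # fewer .rela.dyn entries
rela_plt_saved = (0x1B18 - 0x1008) // RELA_ENTSIZE  # fewer .rela.plt entries

print(got_saved, rela_dyn_saved, rela_plt_saved)
```

So the 98k is 98000 bytes, and relaxation also drops 24 entries from
.rela.dyn and 118 from .rela.plt.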