This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [gold][aarch64] Patch for erratum-843419 (1 of 2 - report erratum occurrences)
- From: Han Shen <shenhan at google dot com>
- To: Cary Coutant <ccoutant at gmail dot com>
- Cc: binutils <binutils at sourceware dot org>, Jing Yu <jingyu at google dot com>, Doug Kwan <dougkwan at google dot com>, Luis Lozano <llozano at google dot com>, Bhaskar <bjanakiraman at google dot com>, Andrew Hsieh <andrewhsieh at google dot com>, Ramana Radhakrishnan <ramana dot gcc at googlemail dot com>
- Date: Thu, 16 Apr 2015 13:46:14 -0700
- Subject: Re: [gold][aarch64] Patch for erratum-843419 (1 of 2 - report erratum occurrences)
- Authentication-results: sourceware.org; auth=none
- References: <CACkGtrhuUi7DtTQ89khOARvV3SMgyHdSea8EAMqioWT5Kh4R9g at mail dot gmail dot com> <CAJimCsE5vpge_UqaYCqGgTrr5OOvs_a4eQUS9nrrkpnPeQR+2w at mail dot gmail dot com>
Hi Cary, thanks a lot for the review. Some comments below.
I re-ran the tests and they all passed.
For ease of review, I attached two diffs: one against trunk, the
other against my previous patch.
Thanks,
Han
On Wed, Apr 15, 2015 at 3:41 PM, Cary Coutant <ccoutant@gmail.com> wrote:
> + // "<" comparator used in ordered_map container.
> + bool
> + operator < (const Mapping_symbol_position& p) const
>
> For consistency, please write this as "operator<(", without spaces
> around the '<'.
Done.
>
> + // Only erratum-fixing work needs mapping symbols, so skip this
> time consuming
> + // processing if not fixing erratum.
> + if (!parameters->options().fix_cortex_a53())
> + return ;
>
> Extra space.
Done.
>
> + // Read the symbol table section header.
> + const unsigned int symtab_shndx = this->symtab_shndx();
> + elfcpp::Shdr<size, big_endian>
> + symtabshdr(this, this->elf_file()->section_header(symtab_shndx));
> + gold_assert(symtabshdr.get_sh_type() == elfcpp::SHT_SYMTAB);
>
> It seems wasteful to set up these views of the symbol table, when
> do_count_local_symbols in the parent class has already done all that.
> Both arm and mips targets also do something like this, so a refactor
> of the code could benefit all three targets. I won't ask you to do this,
> but I'll make a note to look at this myself later.
Thanks!
>
> + // Read the local symbols.
> + const int sym_size =elfcpp::Elf_sizes<size>::sym_size;
>
> Space after '='.
>
> + while (p != this->mapping_symbol_info_.end() &&
> + p->first.shndx_ == shndx)
> + {
> + typename Mapping_symbol_info::const_iterator next =
> + this->mapping_symbol_info_.upper_bound(p->first);
>
> Isn't this equivalent to "next = p; ++next"? With map, you'll only
> have a single entry per position, and you know that p->first is
> already in the map. It seems to me that you're doing an unnecessary
> binary search for every element of the map.
Yes, exactly, this is an unnecessary binary search.
>
> I'd suggest doing something like:
>
> typename Mapping_symbol_info::const_iterator prev = p;
> ++p;
>
> At the top of the loop, remove "p = next;" at the bottom, then replace
> "p" with "prev" and "next" with "p" in the body of the loop.
Done, yes, that's much clearer.
>
> + // The 3rd insn is a load or store instruction from the "Load/store
> + // register (unsigned immediate)" encoding class, using Rn as the
> + // base address register.
> + if (Insn_utilities::aarch64_ldst_uimm(insn3) &&
> + Insn_utilities::aarch64_rn(insn3)
> + == Insn_utilities::aarch64_rd(insn1))
>
> Add parentheses around the '==' operation, and indent accordingly.
Done.
>
> + // Erratum sequence is at least 3 insns.
> + if (span_end - span_start < 3 * Insn_utilities::BYTES_PER_INSN)
> + return ;
>
> Extra space. This test is redundant -- you'll drop out of the for loop
> below immediately if the span isn't long enough. It's also wrong -- it
> will fail if there are exactly 3 instructions before the end.
Removed this part.
>
> + Insntype* ip = reinterpret_cast<Insntype*>(input_view + span_start);
> + unsigned int insn_index = 0;
> + for (section_size_type i = span_start; i < span_end;
> + i += Insn_utilities::BYTES_PER_INSN, ++insn_index)
> + {
> + if (span_end - i < 3 * Insn_utilities::BYTES_PER_INSN)
> + break;
>
> Should be "<=".
Done.
>
> Given the constraints on the address, it's inefficient to loop over
> every address. I'd suggest instead something like this:
>
> if (output_address & 0x03 != 0)
'output_address' is the output section's address, so I use
'output_address + span_start' here.
> return;
> section_size_type offset = 0;
> section_size_type span_length = span_end - span_start;
> // The first instruction must be at page offset 0xFF8 or 0xFFC.
> unsigned int page_offset = output_address & 0xfff;
Same as above: 'output_address' is the output section's address, so
I use 'output_address + span_start' here.
Is there a consensus/guideline on whether to use '0xfff' or '0xFFF'?
> if (page_offset < 0xff8)
> offset = 0xff8 - page_offset;
> while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
> {
> Insntype* ip = reinterpret_cast<Insntype*>(input_view + offset);
> Insntype insn1 = ip[0];
> if (!Insn_utilities::is_adrp(insn1))
> continue;
I need to update 'offset' before the 'continue', so I reversed the
condition and wrapped everything below inside the 'if'.
> Insntype insn2 = ip[1];
> Insntype insn3 = ip[2];
>
> // ...
>
> // Advance to next candidate instruction.
> // We only consider instruction sequences starting at
> // a page offset of 0xff8 or 0xffc.
> page_offset = (output_address + offset) & 0xfff;
I use 'output_address + span_start + offset' here.
> if (page_offset == 0xff8)
> offset += 4;
> else // (page_offset == 0xffc)
> offset += 0xffc;
> }
>
> (I used '4' instead of Insn_utilities::BYTES_PER_INSN here, because we
> explicitly want to advance from 0xff8 to 0xffc.)
>
> + if (is_erratum_843419_sequence(insn1, insn2, insn3))
> + do_report = true;
> + else if (span_end - i > 16)
>
> Should be '>='. Do you want "4 * Insn_utilities::BYTES_PER_INSN"
> instead of 16? (I'm not sold on the usefulness of the symbolic constant;
> it seems we're pretty hard-wired to a value of 4.)
>
> + return ;
Done.
>
> Extra space.
>
> + DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
> + N_("(AArch64 only) Scan and fix binaries for Cortex-A53 erratums."),
> + N_("(AArch64 only) Do not scan for Cortex-A53 erratums."));
>
> The plural of "erratum" is "errata".
Done.
>
> -cary
--
Han Shen
diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index dea64c0..f0b7ca8 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -21,10 +21,11 @@
// MA 02110-1301, USA.
#include "gold.h"
#include <cstring>
+#include <map>
#include "elfcpp.h"
#include "dwarf.h"
#include "parameters.h"
#include "reloc.h"
@@ -63,10 +64,277 @@ template<int size, bool big_endian>
class Target_aarch64;
template<int size, bool big_endian>
class AArch64_relocate_functions;
+// Utility class dealing with insns. This is ported from macros in
+// bfd/elfnn-aarch64.cc, but wrapped inside a class as static members. This
+// class is used in erratum sequence scanning.
+
+template<bool big_endian>
+class AArch64_insn_utilities
+{
+public:
+ typedef typename elfcpp::Swap<32, big_endian>::Valtype Insntype;
+
+ static const int BYTES_PER_INSN = 4;
+
+ static unsigned int
+ aarch64_bit(Insntype insn, int pos)
+ { return ((1 << pos) & insn) >> pos; }
+
+ static unsigned int
+ aarch64_bits(Insntype insn, int pos, int l)
+ { return (insn >> pos) & ((1 << l) - 1); }
+
+ static bool
+ is_adrp(const Insntype insn)
+ { return (insn & 0x9F000000) == 0x90000000; }
+
+ static unsigned int
+ aarch64_rm(const Insntype insn)
+ { return aarch64_bits(insn, 16, 5); }
+
+ static unsigned int
+ aarch64_rn(const Insntype insn)
+ { return aarch64_bits(insn, 5, 5); }
+
+ static unsigned int
+ aarch64_rd(const Insntype insn)
+ { return aarch64_bits(insn, 0, 5); }
+
+ static unsigned int
+ aarch64_rt(const Insntype insn)
+ { return aarch64_bits(insn, 0, 5); }
+
+ static unsigned int
+ aarch64_rt2(const Insntype insn)
+ { return aarch64_bits(insn, 10, 5); }
+
+ static bool
+ aarch64_b(const Insntype insn)
+ { return (insn & 0xFC000000) == 0x14000000; }
+
+ static bool
+ aarch64_bl(const Insntype insn)
+ { return (insn & 0xFC000000) == 0x94000000; }
+
+ static bool
+ aarch64_blr(const Insntype insn)
+ { return (insn & 0xFFFFFC1F) == 0xD63F0000; }
+
+ static bool
+ aarch64_br(const Insntype insn)
+ { return (insn & 0xFFFFFC1F) == 0xD61F0000; }
+
+ // All ld/st ops. See C4-182 of the ARM ARM. The encoding space for
+ // LD_PCREL, LDST_RO, LDST_UI and LDST_UIMM cover prefetch ops.
+ static bool
+ aarch64_ld(Insntype insn) { return aarch64_bit(insn, 22) == 1; }
+
+ static bool
+ aarch64_ldst(Insntype insn)
+ { return (insn & 0x0a000000) == 0x08000000; }
+
+ static bool
+ aarch64_ldst_ex(Insntype insn)
+ { return (insn & 0x3f000000) == 0x08000000; }
+
+ static bool
+ aarch64_ldst_pcrel(Insntype insn)
+ { return (insn & 0x3b000000) == 0x18000000; }
+
+ static bool
+ aarch64_ldst_nap(Insntype insn)
+ { return (insn & 0x3b800000) == 0x28000000; }
+
+ static bool
+ aarch64_ldstp_pi(Insntype insn)
+ { return (insn & 0x3b800000) == 0x28800000; }
+
+ static bool
+ aarch64_ldstp_o(Insntype insn)
+ { return (insn & 0x3b800000) == 0x29000000; }
+
+ static bool
+ aarch64_ldstp_pre(Insntype insn)
+ { return (insn & 0x3b800000) == 0x29800000; }
+
+ static bool
+ aarch64_ldst_ui(Insntype insn)
+ { return (insn & 0x3b200c00) == 0x38000000; }
+
+ static bool
+ aarch64_ldst_piimm(Insntype insn)
+ { return (insn & 0x3b200c00) == 0x38000400; }
+
+ static bool
+ aarch64_ldst_u(Insntype insn)
+ { return (insn & 0x3b200c00) == 0x38000800; }
+
+ static bool
+ aarch64_ldst_preimm(Insntype insn)
+ { return (insn & 0x3b200c00) == 0x38000c00; }
+
+ static bool
+ aarch64_ldst_ro(Insntype insn)
+ { return (insn & 0x3b200c00) == 0x38200800; }
+
+ static bool
+ aarch64_ldst_uimm(Insntype insn)
+ { return (insn & 0x3b000000) == 0x39000000; }
+
+ static bool
+ aarch64_ldst_simd_m(Insntype insn)
+ { return (insn & 0xbfbf0000) == 0x0c000000; }
+
+ static bool
+ aarch64_ldst_simd_m_pi(Insntype insn)
+ { return (insn & 0xbfa00000) == 0x0c800000; }
+
+ static bool
+ aarch64_ldst_simd_s(Insntype insn)
+ { return (insn & 0xbf9f0000) == 0x0d000000; }
+
+ static bool
+ aarch64_ldst_simd_s_pi(Insntype insn)
+ { return (insn & 0xbf800000) == 0x0d800000; }
+
+ // Classify an INSN if it is indeed a load/store. Return true if INSN is a
+ // LD/ST instruction otherwise return false. For scalar LD/ST instructions
+ // PAIR is FALSE, RT is returned and RT2 is set equal to RT. For LD/ST pair
+ // instructions PAIR is TRUE, RT and RT2 are returned.
+ static bool
+ aarch64_mem_op_p(Insntype insn, unsigned int *rt, unsigned int *rt2,
+ bool *pair, bool *load)
+ {
+ uint32_t opcode;
+ unsigned int r;
+ uint32_t opc = 0;
+ uint32_t v = 0;
+ uint32_t opc_v = 0;
+
+ /* Bail out quickly if INSN doesn't fall into the load-store
+ encoding space. */
+ if (!aarch64_ldst (insn))
+ return false;
+
+ *pair = false;
+ *load = false;
+ if (aarch64_ldst_ex (insn))
+ {
+ *rt = aarch64_rt (insn);
+ *rt2 = *rt;
+ if (aarch64_bit (insn, 21) == 1)
+ {
+ *pair = true;
+ *rt2 = aarch64_rt2 (insn);
+ }
+ *load = aarch64_ld (insn);
+ return true;
+ }
+ else if (aarch64_ldst_nap (insn)
+ || aarch64_ldstp_pi (insn)
+ || aarch64_ldstp_o (insn)
+ || aarch64_ldstp_pre (insn))
+ {
+ *pair = true;
+ *rt = aarch64_rt (insn);
+ *rt2 = aarch64_rt2 (insn);
+ *load = aarch64_ld (insn);
+ return true;
+ }
+ else if (aarch64_ldst_pcrel (insn)
+ || aarch64_ldst_ui (insn)
+ || aarch64_ldst_piimm (insn)
+ || aarch64_ldst_u (insn)
+ || aarch64_ldst_preimm (insn)
+ || aarch64_ldst_ro (insn)
+ || aarch64_ldst_uimm (insn))
+ {
+ *rt = aarch64_rt (insn);
+ *rt2 = *rt;
+ if (aarch64_ldst_pcrel (insn))
+ *load = true;
+ opc = aarch64_bits (insn, 22, 2);
+ v = aarch64_bit (insn, 26);
+ opc_v = opc | (v << 2);
+ *load = (opc_v == 1 || opc_v == 2 || opc_v == 3
+ || opc_v == 5 || opc_v == 7);
+ return true;
+ }
+ else if (aarch64_ldst_simd_m (insn)
+ || aarch64_ldst_simd_m_pi (insn))
+ {
+ *rt = aarch64_rt (insn);
+ *load = aarch64_bit (insn, 22);
+ opcode = (insn >> 12) & 0xf;
+ switch (opcode)
+ {
+ case 0:
+ case 2:
+ *rt2 = *rt + 3;
+ break;
+
+ case 4:
+ case 6:
+ *rt2 = *rt + 2;
+ break;
+
+ case 7:
+ *rt2 = *rt;
+ break;
+
+ case 8:
+ case 10:
+ *rt2 = *rt + 1;
+ break;
+
+ default:
+ return false;
+ }
+ return true;
+ }
+ else if (aarch64_ldst_simd_s (insn)
+ || aarch64_ldst_simd_s_pi (insn))
+ {
+ *rt = aarch64_rt (insn);
+ r = (insn >> 21) & 1;
+ *load = aarch64_bit (insn, 22);
+ opcode = (insn >> 13) & 0x7;
+ switch (opcode)
+ {
+ case 0:
+ case 2:
+ case 4:
+ *rt2 = *rt + r;
+ break;
+
+ case 1:
+ case 3:
+ case 5:
+ *rt2 = *rt + (r == 0 ? 2 : 3);
+ break;
+
+ case 6:
+ *rt2 = *rt + r;
+ break;
+
+ case 7:
+ *rt2 = *rt + (r == 0 ? 2 : 3);
+ break;
+
+ default:
+ return false;
+ }
+ return true;
+ }
+ return false;
+ }
+};
+
+
// Output_data_got_aarch64 class.
template<int size, bool big_endian>
class Output_data_got_aarch64 : public Output_data_got<size, big_endian>
{
@@ -914,11 +1182,18 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
{
gold_assert(shndx < this->stub_tables_.size());
this->stub_tables_[shndx] = stub_table;
}
- // Scan all relocation sections for stub generation.
+ // Entrance to erratum_843419 scanning.
+ void
+ scan_erratum_843419(unsigned int shndx,
+ const elfcpp::Shdr<size, big_endian>&,
+ Output_section*, const Symbol_table*,
+ The_target_aarch64*);
+
+ // Scan all relocation sections for stub generation.
void
scan_sections_for_stubs(The_target_aarch64*, const Symbol_table*,
const Layout*);
// Whether a section is a scannable text section.
@@ -933,10 +1208,34 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
// The stubs have relocations and we need to process them after writing
// out the stubs. So relocation now must follow section write.
this->set_relocs_must_follow_section_writes();
}
+ // Structure for mapping symbol position.
+ struct Mapping_symbol_position
+ {
+ Mapping_symbol_position(unsigned int shndx, AArch64_address offset):
+ shndx_(shndx), offset_(offset)
+ {}
+
+ // "<" comparator used in ordered_map container.
+ bool
+ operator<(const Mapping_symbol_position& p) const
+ {
+ return (this->shndx_ < p.shndx_
+ || (this->shndx_ == p.shndx_ && this->offset_ < p.offset_));
+ }
+
+ // Section index.
+ unsigned int shndx_;
+
+ // Section offset.
+ AArch64_address offset_;
+ };
+
+ typedef std::map<Mapping_symbol_position, char> Mapping_symbol_info;
+
protected:
// Post constructor setup.
void
do_setup()
{
@@ -951,22 +1250,111 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
do_relocate_sections(
const Symbol_table* symtab, const Layout* layout,
const unsigned char* pshdrs, Output_file* of,
typename Sized_relobj_file<size, big_endian>::Views* pviews);
+ // Count local symbols and (optionally) record mapping info.
+ virtual void
+ do_count_local_symbols(Stringpool_template<char>*,
+ Stringpool_template<char>*);
+
private:
// Whether a section needs to be scanned for relocation stubs.
bool
section_needs_reloc_stub_scanning(const elfcpp::Shdr<size, big_endian>&,
const Relobj::Output_sections&,
const Symbol_table*, const unsigned char*);
// List of stub tables.
Stub_table_list stub_tables_;
+
+ // Mapping symbol information sorted by (section index, section_offset).
+ Mapping_symbol_info mapping_symbol_info_;
}; // End of AArch64_relobj
+// Override to record mapping symbol information.
+template<int size, bool big_endian>
+void
+AArch64_relobj<size, big_endian>::do_count_local_symbols(
+ Stringpool_template<char>* pool, Stringpool_template<char>* dynpool)
+{
+ Sized_relobj_file<size, big_endian>::do_count_local_symbols(pool, dynpool);
+
+ // Only erratum-fixing work needs mapping symbols, so skip this time consuming
+ // processing if not fixing erratum.
+ if (!parameters->options().fix_cortex_a53())
+ return;
+
+ const unsigned int loccount = this->local_symbol_count();
+ if (loccount == 0)
+ return;
+
+ // Read the symbol table section header.
+ const unsigned int symtab_shndx = this->symtab_shndx();
+ elfcpp::Shdr<size, big_endian>
+ symtabshdr(this, this->elf_file()->section_header(symtab_shndx));
+ gold_assert(symtabshdr.get_sh_type() == elfcpp::SHT_SYMTAB);
+
+ // Read the local symbols.
+ const int sym_size = elfcpp::Elf_sizes<size>::sym_size;
+ gold_assert(loccount == symtabshdr.get_sh_info());
+ off_t locsize = loccount * sym_size;
+ const unsigned char* psyms = this->get_view(symtabshdr.get_sh_offset(),
+ locsize, true, true);
+
+ // For mapping symbol processing, we need to read the symbol names.
+ unsigned int strtab_shndx = this->adjust_shndx(symtabshdr.get_sh_link());
+ if (strtab_shndx >= this->shnum())
+ {
+ this->error(_("invalid symbol table name index: %u"), strtab_shndx);
+ return;
+ }
+
+ elfcpp::Shdr<size, big_endian>
+ strtabshdr(this, this->elf_file()->section_header(strtab_shndx));
+ if (strtabshdr.get_sh_type() != elfcpp::SHT_STRTAB)
+ {
+ this->error(_("symbol table name section has wrong type: %u"),
+ static_cast<unsigned int>(strtabshdr.get_sh_type()));
+ return;
+ }
+
+ const char* pnames =
+ reinterpret_cast<const char*>(this->get_view(strtabshdr.get_sh_offset(),
+ strtabshdr.get_sh_size(),
+ false, false));
+
+ // Skip the first dummy symbol.
+ psyms += sym_size;
+ typename Sized_relobj_file<size, big_endian>::Local_values*
+ plocal_values = this->local_values();
+ for (unsigned int i = 1; i < loccount; ++i, psyms += sym_size)
+ {
+ elfcpp::Sym<size, big_endian> sym(psyms);
+ Symbol_value<size>& lv((*plocal_values)[i]);
+ AArch64_address input_value = lv.input_value();
+
+ // Check to see if this is a mapping symbol.
+ const char* sym_name = pnames + sym.get_st_name();
+ if (sym_name[0] == '$' && (sym_name[1] == 'x' || sym_name[1] == 'd')
+ && sym_name[2] == '\0')
+ {
+ bool is_ordinary;
+ unsigned int input_shndx =
+ this->adjust_sym_shndx(i, sym.get_st_shndx(), &is_ordinary);
+ gold_assert(is_ordinary);
+
+ Mapping_symbol_position msp(input_shndx, input_value);
+ // Insert mapping_symbol_info into map whose ordering is defined by
+ // (shndx, offset_within_section).
+ this->mapping_symbol_info_[msp] = sym_name[1];
+ }
+ }
+}
+
+
// Relocate sections.
template<int size, bool big_endian>
void
AArch64_relobj<size, big_endian>::do_relocate_sections(
@@ -1103,10 +1491,74 @@ AArch64_relobj<size, big_endian>::section_needs_reloc_stub_scanning(
return this->text_section_is_scannable(text_shdr, text_shndx,
out_sections[text_shndx], symtab);
}
+// Scan section SHNDX for erratum 843419.
+
+template<int size, bool big_endian>
+void
+AArch64_relobj<size, big_endian>::scan_erratum_843419(
+ unsigned int shndx, const elfcpp::Shdr<size, big_endian>& shdr,
+ Output_section* os, const Symbol_table* symtab,
+ The_target_aarch64* target)
+{
+ if (shdr.get_sh_size() == 0
+ || (shdr.get_sh_flags() &
+ (elfcpp::SHF_ALLOC | elfcpp::SHF_EXECINSTR)) == 0
+ || shdr.get_sh_type() != elfcpp::SHT_PROGBITS)
+ return;
+
+ if (!os || symtab->is_section_folded(this, shndx)) return;
+
+ AArch64_address output_offset = this->get_output_section_offset(shndx);
+ AArch64_address output_address;
+ if (output_offset != invalid_address)
+ output_address = os->address() + output_offset;
+ else
+ {
+ const Output_relaxed_input_section* poris =
+ os->find_relaxed_input_section(this, shndx);
+ if (!poris) return;
+ output_address = poris->address();
+ }
+
+ section_size_type input_view_size = 0;
+ const unsigned char* input_view =
+ this->section_contents(shndx, &input_view_size, false);
+
+ Mapping_symbol_position section_start(shndx, 0);
+ // Find the first mapping symbol record within section shndx.
+ typename Mapping_symbol_info::const_iterator p =
+ this->mapping_symbol_info_.lower_bound(section_start);
+ if (p == this->mapping_symbol_info_.end() || p->first.shndx_ != shndx)
+ gold_warning(_("cannot scan executable section %u of %s for Cortex-A53 "
+ "erratum because it has no mapping symbols."),
+ shndx, this->name().c_str());
+ while (p != this->mapping_symbol_info_.end() &&
+ p->first.shndx_ == shndx)
+ {
+ typename Mapping_symbol_info::const_iterator prev = p;
+ ++p;
+ if (prev->second == 'x')
+ {
+ section_size_type span_start =
+ convert_to_section_size_type(prev->first.offset_);
+ section_size_type span_end;
+ if (p != this->mapping_symbol_info_.end()
+ && p->first.shndx_ == shndx)
+ span_end = convert_to_section_size_type(p->first.offset_);
+ else
+ span_end = convert_to_section_size_type(shdr.get_sh_size());
+ target->scan_erratum_843419_span(
+ this, shndx, span_start, span_end,
+ const_cast<unsigned char*>(input_view), output_address);
+ }
+ }
+}
+
+
// Scan relocations for stub generation.
template<int size, bool big_endian>
void
AArch64_relobj<size, big_endian>::scan_sections_for_stubs(
@@ -1136,10 +1588,12 @@ AArch64_relobj<size, big_endian>::scan_sections_for_stubs(
// Do relocation stubs scanning.
const unsigned char* p = pshdrs + shdr_size;
for (unsigned int i = 1; i < shnum; ++i, p += shdr_size)
{
const elfcpp::Shdr<size, big_endian> shdr(p);
+ if (parameters->options().fix_cortex_a53())
+ scan_erratum_843419(i, shdr, out_sections[i], symtab, target);
if (this->section_needs_reloc_stub_scanning(shdr, out_sections, symtab,
pshdrs))
{
unsigned int index = this->adjust_shndx(shdr.get_sh_info());
AArch64_address output_offset =
@@ -1642,10 +2096,11 @@ class Target_aarch64 : public Sized_target<size, big_endian>
typedef AArch64_input_section<size, big_endian> The_aarch64_input_section;
typedef AArch64_output_section<size, big_endian> The_aarch64_output_section;
typedef Unordered_map<Section_id,
AArch64_input_section<size, big_endian>*,
Section_id_hash> AArch64_input_section_map;
+ typedef AArch64_insn_utilities<big_endian> Insn_utilities;
const static int TCB_SIZE = size / 8 * 2;
Target_aarch64(const Target::Target_info* info = &aarch64_info)
: Sized_target<size, big_endian>(info),
got_(NULL), plt_(NULL), got_plt_(NULL), got_irelative_(NULL),
@@ -1833,10 +2288,21 @@ class Target_aarch64 : public Sized_target<size, big_endian>
&& parameters->target().get_size() == size
&& parameters->target().is_big_endian() == big_endian);
return static_cast<This*>(parameters->sized_target<size, big_endian>());
}
+
+ // Scan erratum for a part of a section.
+ void
+ scan_erratum_843419_span(
+ AArch64_relobj<size, big_endian>*,
+ unsigned int shndx,
+ const section_size_type,
+ const section_size_type,
+ unsigned char*,
+ Address);
+
protected:
void
do_select_as_default_target()
{
gold_assert(aarch64_reloc_property_table == NULL);
@@ -2126,10 +2592,16 @@ class Target_aarch64 : public Sized_target<size, big_endian>
{
gold_assert(this->plt_ != NULL);
return this->plt_;
}
+ // Return whether this is a 3-insn erratum sequence.
+ bool is_erratum_843419_sequence(
+ typename elfcpp::Swap<32,big_endian>::Valtype insn1,
+ typename elfcpp::Swap<32,big_endian>::Valtype insn2,
+ typename elfcpp::Swap<32,big_endian>::Valtype insn3);
+
// Get the dynamic reloc section, creating it if necessary.
Reloc_section*
rela_dyn_section(Layout*);
// Get the section to use for TLSDESC relocations.
@@ -2557,11 +3029,11 @@ Target_aarch64<size, big_endian>::scan_reloc_for_stub(
}
typename The_reloc_stub::Stub_type stub_type = The_reloc_stub::
stub_type_for_reloc(r_type, address, destination);
if (stub_type == The_reloc_stub::ST_NONE)
- return ;
+ return;
The_stub_table* stub_table = aarch64_relobj->stub_table(relinfo->data_shndx);
gold_assert(stub_table != NULL);
The_reloc_stub_key key(stub_type, gsym, aarch64_relobj, r_sym, addend);
@@ -6717,10 +7189,116 @@ Target_aarch64<size, big_endian>::relocate_relocs(
reloc_view,
reloc_view_size);
}
+// Return whether this is a 3-insn erratum sequence.
+
+template<int size, bool big_endian>
+bool
+Target_aarch64<size, big_endian>::is_erratum_843419_sequence(
+ typename elfcpp::Swap<32,big_endian>::Valtype insn1,
+ typename elfcpp::Swap<32,big_endian>::Valtype insn2,
+ typename elfcpp::Swap<32,big_endian>::Valtype insn3)
+{
+ unsigned rt1, rt2;
+ bool load, pair;
+
+ // The 2nd insn is a single register load or store; or register pair
+ // store.
+ if (Insn_utilities::aarch64_mem_op_p(insn2, &rt1, &rt2, &pair, &load)
+ && (!pair || (pair && !load)))
+ {
+ // The 3rd insn is a load or store instruction from the "Load/store
+ // register (unsigned immediate)" encoding class, using Rn as the
+ // base address register.
+ if (Insn_utilities::aarch64_ldst_uimm(insn3)
+ && (Insn_utilities::aarch64_rn(insn3)
+ == Insn_utilities::aarch64_rd(insn1)))
+ return true;
+ }
+ return false;
+}
+
+
+// Scan erratum for section SHNDX range
+// [output_address + span_start, output_address + span_end).
+
+template<int size, bool big_endian>
+void
+Target_aarch64<size, big_endian>::scan_erratum_843419_span(
+ AArch64_relobj<size, big_endian>* relobj,
+ unsigned int shndx,
+ const section_size_type span_start,
+ const section_size_type span_end,
+ unsigned char* input_view,
+ Address output_address)
+{
+ typedef typename Insn_utilities::Insntype Insntype;
+
+ if (((output_address + span_start) & 0x03) != 0)
+ return;
+
+ section_size_type offset = 0;
+ section_size_type span_length = span_end - span_start;
+ // The first instruction must be at page offset 0xFF8 or 0xFFC.
+ unsigned int page_offset = (output_address + span_start) & 0xFFF;
+ // Make sure starting position, that is "output_address+span_start+offset",
+ // starts at page position 0xFF8 or 0xFFC.
+ if (page_offset < 0xFF8)
+ offset = 0xFF8 - page_offset;
+ while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
+ {
+ Insntype* ip = reinterpret_cast<Insntype*>(input_view +
+ span_start + offset);
+ Insntype insn1 = ip[0];
+ if (Insn_utilities::is_adrp(insn1))
+ {
+ Insntype insn2 = ip[1];
+ Insntype insn3 = ip[2];
+ bool do_report = false;
+ if (is_erratum_843419_sequence(insn1, insn2, insn3))
+ do_report = true;
+ else if (offset + 4 * Insn_utilities::BYTES_PER_INSN <= span_length)
+ {
+ // Optionally we can have an insn between insn2 and insn3.
+ Insntype insn_opt = ip[2];
+ // And insn_opt must not be a branch.
+ if (!Insn_utilities::aarch64_b(insn_opt)
+ && !Insn_utilities::aarch64_bl(insn_opt)
+ && !Insn_utilities::aarch64_blr(insn_opt)
+ && !Insn_utilities::aarch64_br(insn_opt))
+ {
+ // And insn_opt must not write to dest reg in insn1. However
+ // we do a conservative scan, which means we may fix/report
+ // more than necessary, but it doesn't hurt.
+
+ Insntype insn4 = ip[3];
+ if (is_erratum_843419_sequence(insn1, insn2, insn4))
+ do_report = true;
+ }
+ }
+ if (do_report)
+ {
+ gold_error(_("Erratum 843419 found at \"%s\", section %d, "
+ "offset 0x%08x."),
+ relobj->name().c_str(), shndx,
+ (unsigned int)(span_start + offset));
+ }
+ }
+
+ // Advance to next candidate instruction. We only consider instruction
+ // sequences starting at a page offset of 0xff8 or 0xffc.
+ page_offset = (output_address + span_start + offset) & 0xFFF;
+ if (page_offset == 0xFF8)
+ offset += 4;
+ else // (page_offset == 0xFFC), we move to the next page's 0xFF8.
+ offset += 0xFFC;
+ }
+}
+
+
// The selector for aarch64 object files.
template<int size, bool big_endian>
class Target_selector_aarch64 : public Target_selector
{
diff --git a/gold/options.h b/gold/options.h
index 8c9c934..c1a5743 100644
--- a/gold/options.h
+++ b/gold/options.h
@@ -800,10 +800,14 @@ class General_options
DEFINE_bool(fix_cortex_a8, options::TWO_DASHES, '\0', false,
N_("(ARM only) Fix binaries for Cortex-A8 erratum."),
N_("(ARM only) Do not fix binaries for Cortex-A8 erratum."));
+ DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
+ N_("(AArch64 only) Scan and fix binaries for Cortex-A53 errata."),
+ N_("(AArch64 only) Do not scan for Cortex-A53 errata."));
+
DEFINE_bool(fix_arm1176, options::TWO_DASHES, '\0', true,
N_("(ARM only) Fix binaries for ARM1176 erratum."),
N_("(ARM only) Do not fix binaries for ARM1176 erratum."));
DEFINE_bool(merge_exidx_entries, options::TWO_DASHES, '\0', true,
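The rewritten scan loop in the patch above no longer tests every instruction in a span; it jumps directly between the only two page offsets (0xFF8 and 0xFFC) at which an erratum-843419 sequence can start. A minimal standalone sketch of that stepping logic follows; `candidate_offsets` and the 12-byte minimum-sequence length are illustrative, not gold's actual API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Enumerate the only offsets within [0, span_length) at which an
// erratum-843419 sequence can start: addresses whose page offset is
// 0xFF8 or 0xFFC.  "base" is the output address of the span start.
std::vector<uint64_t>
candidate_offsets(uint64_t base, uint64_t span_length)
{
  std::vector<uint64_t> offsets;
  if ((base & 0x3) != 0)
    return offsets;  // Not instruction-aligned; nothing to scan.
  uint64_t offset = 0;
  uint64_t page_offset = base & 0xFFF;
  if (page_offset < 0xFF8)
    offset = 0xFF8 - page_offset;
  // A sequence is at least 3 instructions (12 bytes) long.
  while (offset + 12 <= span_length)
    {
      offsets.push_back(offset);
      page_offset = (base + offset) & 0xFFF;
      if (page_offset == 0xFF8)
        offset += 4;       // Also try the 0xFFC slot of this page.
      else
        offset += 0xFFC;   // From 0xFFC to the next page's 0xFF8.
    }
  return offsets;
}
```

For a page-aligned span, the candidates fall at offsets 0xFF8, 0xFFC, 0x1FF8, 0x1FFC, and so on, which is why the new loop is far cheaper than the old per-instruction scan.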
diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index 2313496..f0b7ca8 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -1217,11 +1217,11 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
shndx_(shndx), offset_(offset)
{}
// "<" comparator used in ordered_map container.
bool
- operator < (const Mapping_symbol_position& p) const
+ operator<(const Mapping_symbol_position& p) const
{
return (this->shndx_ < p.shndx_
|| (this->shndx_ == p.shndx_ && this->offset_ < p.offset_));
}
@@ -1281,11 +1281,11 @@ AArch64_relobj<size, big_endian>::do_count_local_symbols(
Sized_relobj_file<size, big_endian>::do_count_local_symbols(pool, dynpool);
// Only erratum-fixing work needs mapping symbols, so skip this time consuming
// processing if not fixing erratum.
if (!parameters->options().fix_cortex_a53())
- return ;
+ return;
const unsigned int loccount = this->local_symbol_count();
if (loccount == 0)
return;
@@ -1535,27 +1535,26 @@ AArch64_relobj<size, big_endian>::scan_erratum_843419(
"erratum because it has no mapping symbols."),
shndx, this->name().c_str());
while (p != this->mapping_symbol_info_.end() &&
p->first.shndx_ == shndx)
{
- typename Mapping_symbol_info::const_iterator next =
- this->mapping_symbol_info_.upper_bound(p->first);
- if (p->second == 'x')
+ typename Mapping_symbol_info::const_iterator prev = p;
+ ++p;
+ if (prev->second == 'x')
{
section_size_type span_start =
- convert_to_section_size_type(p->first.offset_);
+ convert_to_section_size_type(prev->first.offset_);
section_size_type span_end;
- if (next != this->mapping_symbol_info_.end()
- && next->first.shndx_ == shndx)
- span_end = convert_to_section_size_type(next->first.offset_);
+ if (p != this->mapping_symbol_info_.end()
+ && p->first.shndx_ == shndx)
+ span_end = convert_to_section_size_type(p->first.offset_);
else
span_end = convert_to_section_size_type(shdr.get_sh_size());
target->scan_erratum_843419_span(
this, shndx, span_start, span_end,
const_cast<unsigned char*>(input_view), output_address);
}
- p = next;
}
}
// Scan relocations for stub generation.
@@ -3030,11 +3029,11 @@ Target_aarch64<size, big_endian>::scan_reloc_for_stub(
}
typename The_reloc_stub::Stub_type stub_type = The_reloc_stub::
stub_type_for_reloc(r_type, address, destination);
if (stub_type == The_reloc_stub::ST_NONE)
- return ;
+ return;
The_stub_table* stub_table = aarch64_relobj->stub_table(relinfo->data_shndx);
gold_assert(stub_table != NULL);
The_reloc_stub_key key(stub_type, gsym, aarch64_relobj, r_sym, addend);
@@ -7210,13 +7209,13 @@ Target_aarch64<size, big_endian>::is_erratum_843419_sequence(
&& (!pair || (pair && !load)))
{
// The 3rd insn is a load or store instruction from the "Load/store
// register (unsigned immediate)" encoding class, using Rn as the
// base address register.
- if (Insn_utilities::aarch64_ldst_uimm(insn3) &&
- Insn_utilities::aarch64_rn(insn3)
- == Insn_utilities::aarch64_rd(insn1))
+ if (Insn_utilities::aarch64_ldst_uimm(insn3)
+ && (Insn_utilities::aarch64_rn(insn3)
+ == Insn_utilities::aarch64_rd(insn1)))
return true;
}
return false;
}
@@ -7234,63 +7233,69 @@ Target_aarch64<size, big_endian>::scan_erratum_843419_span(
unsigned char* input_view,
Address output_address)
{
typedef typename Insn_utilities::Insntype Insntype;
- // Erratum sequence is at least 3 insns.
- if (span_end - span_start < 3 * Insn_utilities::BYTES_PER_INSN)
- return ;
+ if (((output_address + span_start) & 0x03) != 0)
+ return;
- Insntype* ip = reinterpret_cast<Insntype*>(input_view + span_start);
- unsigned int insn_index = 0;
- for (section_size_type i = span_start; i < span_end;
- i += Insn_utilities::BYTES_PER_INSN, ++insn_index)
+ section_size_type offset = 0;
+ section_size_type span_length = span_end - span_start;
+ // The first instruction must start at page offset 0xFF8 or 0xFFC.
+ unsigned int page_offset = (output_address + span_start) & 0xFFF;
+ // Make sure starting position, that is "output_address+span_start+offset",
+ // starts at page position 0xFF8 or 0xFFC.
+ if (page_offset < 0xFF8)
+ offset = 0xFF8 - page_offset;
+ while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
{
- if (span_end - i < 3 * Insn_utilities::BYTES_PER_INSN)
- break;
- Address address = output_address + i;
- // The first instruction must be ending at 0xFF8 or 0xFFC and it must
- // be an ADRP.
- Address a3 = (address & 0xFFF);
- if (a3 != 0xFF8 && a3 != 0xFFC)
- continue;
- Insntype insn1 = ip[insn_index];
- if (!Insn_utilities::is_adrp(insn1))
- continue;
-
- Insntype insn2 = ip[insn_index + 1];
- Insntype insn3 = ip[insn_index + 2];
- bool do_report = false;
- if (is_erratum_843419_sequence(insn1, insn2, insn3))
- do_report = true;
- else if (span_end - i > 16)
+ Insntype* ip = reinterpret_cast<Insntype*>(input_view +
+ span_start + offset);
+ Insntype insn1 = ip[0];
+ if (Insn_utilities::is_adrp(insn1))
{
- // Optionally we can have an insn between ins2 and ins3
- Insntype insn_opt = ip[insn_index + 2];
- // And insn_opt must not be a branch.
- if (Insn_utilities::aarch64_b(insn_opt)
- || Insn_utilities::aarch64_bl(insn_opt)
- || Insn_utilities::aarch64_blr(insn_opt)
- || Insn_utilities::aarch64_br(insn_opt))
- continue;
-
- // And insn_opt must not write to dest reg in insn1. However we do a
- // conservative scan, which means we may fix/report more than
- // necessary, but it doesn't hurt.
-
- Insntype insn4 = ip[insn_index + 3];
- if (is_erratum_843419_sequence(insn1, insn2, insn4))
+ Insntype insn2 = ip[1];
+ Insntype insn3 = ip[2];
+ bool do_report = false;
+ if (is_erratum_843419_sequence(insn1, insn2, insn3))
do_report = true;
+ else if (offset + 4 * Insn_utilities::BYTES_PER_INSN <= span_length)
+ {
+ // Optionally, an insn can appear between insn2 and insn3.
+ Insntype insn_opt = ip[2];
+ // And insn_opt must not be a branch.
+ if (!Insn_utilities::aarch64_b(insn_opt)
+ && !Insn_utilities::aarch64_bl(insn_opt)
+ && !Insn_utilities::aarch64_blr(insn_opt)
+ && !Insn_utilities::aarch64_br(insn_opt))
+ {
+ // And insn_opt must not write to dest reg in insn1. However
+ // we do a conservative scan, which means we may fix/report
+ // more than necessary, but it doesn't hurt.
+
+ Insntype insn4 = ip[3];
+ if (is_erratum_843419_sequence(insn1, insn2, insn4))
+ do_report = true;
+ }
+ }
+ if (do_report)
+ {
+ gold_error(_("Erratum 843419 found at \"%s\", section %d, "
+ "offset 0x%08x."),
+ relobj->name().c_str(), shndx,
+ (unsigned int)(span_start + offset));
+ }
}
- if (do_report)
- {
- gold_error(_("Erratum 943419 found at \"%s\", section %d, "
- "offset 0x%08x."),
- relobj->name().c_str(), shndx, (unsigned int)i);
- }
+
+ // Advance to next candidate instruction. We only consider instruction
+ // sequences starting at a page offset of 0xff8 or 0xffc.
+ page_offset = (output_address + span_start + offset) & 0xFFF;
+ if (page_offset == 0xFF8)
+ offset += 4;
+ else // page_offset == 0xFFC; move to the next page's 0xFF8.
+ offset += 0xFFC;
}
- return ;
}
// The selector for aarch64 object files.
diff --git a/gold/options.h b/gold/options.h
index 6bf989c..c1a5743 100644
--- a/gold/options.h
+++ b/gold/options.h
@@ -801,12 +801,12 @@ class General_options
DEFINE_bool(fix_cortex_a8, options::TWO_DASHES, '\0', false,
N_("(ARM only) Fix binaries for Cortex-A8 erratum."),
N_("(ARM only) Do not fix binaries for Cortex-A8 erratum."));
DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
- N_("(AArch64 only) Scan and fix binaries for Cortex-A53 erratums."),
- N_("(AArch64 only) Do not scan for Cortex-A53 erratums."));
+ N_("(AArch64 only) Scan and fix binaries for Cortex-A53 errata."),
+ N_("(AArch64 only) Do not scan for Cortex-A53 errata."));
DEFINE_bool(fix_arm1176, options::TWO_DASHES, '\0', true,
N_("(ARM only) Fix binaries for ARM1176 erratum."),
N_("(ARM only) Do not fix binaries for ARM1176 erratum."));
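For reference, the four branch forms that disqualify the optional instruction between insn2 and the final load/store can be recognized with the standard AArch64 encoding masks. This is a self-contained sketch; the helper names are illustrative stand-ins for gold's Insn_utilities predicates:

```cpp
#include <cassert>
#include <cstdint>

// AArch64 encoding masks for the unconditional branches B, BL, BR, BLR.
static bool is_b(uint32_t insn)   { return (insn & 0xFC000000u) == 0x14000000u; }
static bool is_bl(uint32_t insn)  { return (insn & 0xFC000000u) == 0x94000000u; }
static bool is_br(uint32_t insn)  { return (insn & 0xFFFFFC1Fu) == 0xD61F0000u; }
static bool is_blr(uint32_t insn) { return (insn & 0xFFFFFC1Fu) == 0xD63F0000u; }

// The optional insn between insn2 and the final load/store must not be
// any of these branches; if it is, the sequence cannot trigger the
// erratum and the scan moves on.
static bool disqualifying_branch(uint32_t insn)
{
  return is_b(insn) || is_bl(insn) || is_br(insn) || is_blr(insn);
}
```

Note the scan deliberately ignores the "must not write to insn1's dest reg" condition, as the patch comment says: scanning conservatively may report a few extra candidates, but never misses one.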