[gold][aarch64] Patch for Relaxation

Cary Coutant ccoutant@google.com
Wed Oct 15 16:59:00 GMT 2014


+      return name_hash_value
+       | (stub_type_hash_value << 20)
+       | (r_sym_hash_value << 22)
+       | (addend_hash_value << 27);

Please add parentheses around the whole expression, then line up the
"|" operators just inside the open paren (standard GNU style for
continued expressions).
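
That is, something like this (using the hash value names from the
patch):

    return (name_hash_value
            | (stub_type_hash_value << 20)
            | (r_sym_hash_value << 22)
            | (addend_hash_value << 27));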

My concern with the previous hash function wasn't about XOR'ing the
additional bits into the string hash, but with the fact that, e.g.,
r_sym, stub_type, and addend were all effectively XOR'ed together
without any multiplicative factor to keep them from canceling one
another out. Your new function avoids that, but drops all but the
lower 5 bits of addend and r_sym. I'm not sure what distribution
you're expecting, so it's not immediately clear to me how good this
hash is. If you go back to XOR, there's no need to mask or shift away
any of the bits of the string hash -- just do something to keep the
three other fields from canceling. One way is to mask and shift them
into non-overlapping positions, as you've done here, but you can keep
more of the bits; something like this:

    size_t name_hash_value = gold::string_hash<char>(...);
    return (name_hash_value
            ^ stub_type_hash_value
            ^ ((r_sym & 0x3fff) << 2)
            ^ ((addend & 0xffff) << 16));

A more robust approach would be to use a hash combine function.
Unfortunately, I don't think STL provides one, but Boost has
hash_combine(), and libiberty has an equivalent, iterative_hash()
(#include "hashtab.h"). You could use iterative_hash something like
this:

    hashval_t h = static_cast<hashval_t>(gold::string_hash<char>(...));
    h = iterative_hash_object(this->stub_type, h);
    h = iterative_hash_object(r_sym, h);
    h = iterative_hash_object(addend, h);
    return static_cast<size_t>(h);

It's only a 32-bit hash function, but that's probably sufficient. If
you don't want to throw away the full string_hash value, you could do
this instead:

    hashval_t h = 0;
    h = iterative_hash_object(this->stub_type, h);
    h = iterative_hash_object(r_sym, h);
    h = iterative_hash_object(addend, h);
    return (gold::string_hash<char>(...)
            ^ static_cast<size_t>(h));

Using iterative_hash might be noticeably slower than the simpler XOR
strategy, so I'm OK with either.

-cary


On Mon, Oct 13, 2014 at 1:31 PM, Hán Shěn (沈涵) <shenhan@google.com> wrote:
> Hi Cary, thanks. All done in below.
>
> Attached modified patch - relaxation.patch2. Also attached the diff
> from patch2 to patch1 for ease of review.
>
> Thanks,
> Han
>
> On Fri, Oct 10, 2014 at 4:28 PM, Cary Coutant <ccoutant@google.com> wrote:
>>
>> > Here we have the patch for the gold aarch64 backend to support relaxation.
>> >
>> > In short, relaxation is the linker's generation of stubs that fix the
>> > out-of-range jumps/branches in the original object file.
>> >
>> > With this implementation, we are able to link a 456MB aarch64 application
>> > (the correctness of the resulting file, though, hasn't been verified).
>>
>> Thanks. Here are my comments...
>>
>> -cary
>>
>>
>> +      struct
>> +      {
>> +        // For a global symbol, the symbol itself.
>> +        Symbol* symbol;
>> +      } global;
>> +      struct
>> +      {
>> +        // For a local symbol, the object defining object.
>>
>> I know this is actually unchanged code from before, but this
>> comment looks wrong. Should probably be "the defining object"
>> or "the object defining the symbol".
>
>
> Done
>>
>>
>> +  // Do not change the value of the enums, they are used to index into
>> +  // stub_insns array.
>>
>> In that case, you should probably assign values explicitly to each
>> enum constant. Enum constants like this should generally be all
>> upper case (like macros and static constants).
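>>
>> For example (the constant names here are only placeholders):
>>
>>     enum Stub_type
>>     {
>>       ST_NONE = 0,
>>       ST_ADRP_BRANCH = 1,
>>       ST_LONG_BRANCH_ABS = 2,
>>       ST_LONG_BRANCH_PCREL = 3
>>     };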
>
>
> Done - Changed to all upper case and explicitly assigned values to them.
>>
>>
>> +  // Determine the stub type for a certain relocation or st_none, if no stub is
>> +  // needed.
>> +  static Stub_type
>> +  stub_type_for_reloc(unsigned int r_type, AArch64_address address,
>> +      AArch64_address target);
>> +
>> + public:
>>
>> Unnecessary: everything above this was also public.
>
>
> Done
>>
>>
>> +  // The key class used to index the stub instance in the stub
>> +  // table's stub map.
>> +  class Key
>> +  {
>> +  public:
>>
>> "public" should be indented one space.
>
>
> Done
>>
>>
>> +    // Return a hash value.
>> +    size_t
>> +    hash_value() const
>> +    {
>> +      return (this->stub_type_
>> +              ^ this->r_sym_
>> +              ^ gold::string_hash<char>(
>> +                  (this->r_sym_ != Reloc_stub::invalid_index)
>> +                  ? this->u_.relobj->name().c_str()
>> +                  : this->u_.symbol->name())
>> +              ^ this->addend_);
>>
>> I know this is just copied from arm.cc, but it looks to me like
>> this hash function might be a bit weak. In particular,
>> stub_type_, r_sym_, and addend_ might easily cancel one another
>> out, causing some preventable collisions. r_sym_ is also
>> unnecessary (but harmless, I guess), when it's equal to
>> invalid_index. I'd suggest at least shifting stub_type_, r_sym_,
>> and addend_ by different amounts, or applying a different
>> multiplier to each to distribute the bits around more, and
>> applying r_sym_ only when it's a local symbol index.
>
>
> Done by placing the name hash at bits 0-19, stub_type at 20-21, r_sym at
> 22-26 (even if it's an invalid_index), and addend at 27-31.
>>
>>
>> +    struct equal_to
>> +    {
>> +      bool
>> +      operator()(const Key& k1, const Key& k2) const
>> +      { return k1.eq(k2); }
>> +    };
>> +
>> +  private:
>>
>> "private" should be indented one space.
>
>
> Done.
>>
>>
>> +    // Initialize look-up tables.
>> +    Stub_table_list empty_stub_table_list(this->shnum(), NULL);
>> +    this->stub_tables_.swap(empty_stub_table_list);
>>
>> It's simpler to just use resize(this->shnum()). The "swap" idiom
>> is useful when you need to release memory, but not necessary for
>> initialization.
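>>
>> For example:
>>
>>     this->stub_tables_.resize(this->shnum());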
>
> Done
>>
>>
>> +  // Call parent to relocate sections.
>> +  Sized_relobj_file<size, big_endian>::do_relocate_sections(symtab, layout,
>> +   pshdrs, of, pviews);
>>
>> Indentation is off by one.
>
>
> Done.
>>
>>
>> +  unsigned int reloc_size;
>> +  if (sh_type == elfcpp::SHT_RELA)
>> +    reloc_size = elfcpp::Elf_sizes<size>::rela_size;
>>
>> For REL sections, reloc_size will be left uninitialized here.
>> If you don't expect REL sections, you should assert that.
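>>
>> Something like:
>>
>>     gold_assert(sh_type == elfcpp::SHT_RELA);
>>     const unsigned int reloc_size = elfcpp::Elf_sizes<size>::rela_size;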
>
>
> Done.
>>
>>
>> +  const unsigned int shdr_size = elfcpp::Elf_sizes<size>::shdr_size;
>> +  const elfcpp::Shdr<size, big_endian> text_shdr(pshdrs + index * shdr_size);
>> +  return this->section_is_scannable(text_shdr, index,
>> +    out_sections[index], symtab);
>>
>> The naming doesn't really make it clear that
>> section_needs_reloc_stub_scanning is looking at the REL/RELA
>> section, while section_is_scannable is looking at the PROGBITS
>> section that is being relocated. You've used text_shdr for the
>> section header, so I'd suggest using text_shndx instead of index,
>> and text_section_is_scannable for the function name.
>
>
> Done.
>>
>>
>> +      // Currently this only happens for a relaxed section.
>> +      const Output_relaxed_input_section* poris =
>> +      out_sections[index]->find_relaxed_input_section(this, index);
>>
>> This continuation line needs indentation (4 spaces).
>
>
> Done.
>>
>>
>> +  // Get the section contents. This does work for the case in which
>> +  // we modify the contents of an input section.  We need to pass the
>> +  // output view under such circumstances.
>>
>> Should the comment read "This does *not* work ..."?
>>
>> I don't see you handling the case where you modify the contents,
>> so maybe that last sentence is not necessary?
>
>
> Done by removing the confusing last 2 sentences.
>>
>>
>> +// Create a stub group for input sections from BEGIN to END.  OWNER
>> +// points to the input section to be the owner a new stub table.
>>
>> "OWNER points to the input section that will be the owner of the
>> stub table."
>
>
> Done.
>>
>>
>> +  // Currently we convert ordinary input sections into relaxed sections only
>> +  // at this point but we may want to support creating relaxed input section
>> +  // very early.  So we check here to see if owner is already a relaxed
>> +  // section.
>> +  gold_assert(!owner->is_relaxed_input_section());
>> +
>> +  The_aarch64_input_section* input_section;
>> +  if (owner->is_relaxed_input_section())
>> +    {
>> +      input_section = static_cast<The_aarch64_input_section*>(
>> +          owner->relaxed_input_section());
>> +    }
>>
>> Given the assert above, this code can't be reached. If you want to
>> leave it in as a stub for the future, I'd suggest removing the above
>> assert, and replacing the then clause with gold_unreachable().
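>>
>> That is:
>>
>>     if (owner->is_relaxed_input_section())
>>       gold_unreachable();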
>
>
> Done by replacing gold_assert with gold_unreachable inside the if clause;
> also updated the comments.
>>
>>
>> +  do
>> +    {
>> +      if (p->is_input_section() || p->is_relaxed_input_section())
>> +        {
>> +          // The stub table information for input sections live
>> +          // in their objects.
>> +          The_aarch64_relobj* aarch64_relobj =
>> +              static_cast<The_aarch64_relobj*>(p->relobj());
>> +          aarch64_relobj->set_stub_table(p->shndx(), stub_table);
>> +        }
>> +      prev_p = p++;
>> +    }
>> +  while (prev_p != end);
>>
>> Suggest renaming begin/end to maybe first/last? It took me a minute
>> to understand that "end" doesn't refer to the same "end" that an
>> iterator would test against, and that you intended this to be an
>> inclusive range.
>
>
> Done renaming.
>>
>>
>> Why not just "do { ... } while (p++ != last);"?
>
> Done.
>>
>>
>> +// Group input sections for stub generation.  GROUP_SIZE is roughly the limit
>> +// of stub groups.  We grow a stub group by adding input section until the
>> +// size is just below GROUP_SIZE.  The last input section will be converted
>> +// into a stub table.  If STUB_ALWAYS_AFTER_BRANCH is false, we also add
>> +// input section after the stub table, effectively double the group size.
>>
>> Do you mean "the last input section will be converted into a stub
>> table *owner*"?
>>
>> And "we also add input section*s* after the stub table, effectively
>> *doubling* the group size."
>
>
> Done.
>>
>>
>> +  case HAS_STUB_SECTION:
>> +    // Adding this section makes the post stub-section group larger
>> +    // than GROUP_SIZE.
>> +    gold_assert(false);
>>
>> You can just use gold_unreachable() here.
>>
>> +  // ET_EXEC files are valid input for --just-symbols/-R,
>> +  // and we treat them as relocatable objects.
>>
>> For --just-symbols, shouldn't you use the base implementation
>> of do_make_elf_object?
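>>
>> Something like this (following what arm.cc does for ET_EXEC input):
>>
>>     else if (et == elfcpp::ET_EXEC)
>>       return Sized_target<size, big_endian>::do_make_elf_object(
>>           name, input_file, offset, ehdr);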
>
>
> Done by invoking parent implementation.
>>
>>
>> +// This function scans a relocation sections for stub generation.
>>
>> s/sections/section/
>
>
> Done.
>>
>>
>> + // An error should never aroused in the above step. If so, please
>> + // An error should never arouse, it is an "_NC" relocation.
>>
>> "arise".
>>
>> +    gold_error(_("Stub is too far away, try use a smaller value "
>> +                 "for '--stub-group-size'. For example, 0x2000000."));
>>
>> Either "try using" or "try to use" or just "try a smaller value".
>
>
> Done.
>
>
>
> --
> Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330
