[gold][aarch64] Patch for Relaxation
Hán Shěn (沈涵)
shenhan@google.com
Wed Oct 15 20:51:00 GMT 2014
Hi Cary, thanks. I've attached the updated patch.
The only diff from patch2 is the "hash_value" function, which I've
pasted directly below for ease of review.
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -496,26 +496,20 @@ class Reloc_stub
// Return a hash value.
size_t
hash_value() const
{
- // This will occupy the lowest 20 bits in the final hash value.
- size_t name_hash_value = 0XFFFFFF & gold::string_hash<char>(
+ size_t name_hash_value = gold::string_hash<char>(
(this->r_sym_ != Reloc_stub::invalid_index)
? this->u_.relobj->name().c_str()
: this->u_.symbol->name());
- // We only have 4 stub types, so we allocate 2 bits for it in the final
- // hash value, and they will occupy bit 21 - 22.
+ // We only have 4 stub types.
size_t stub_type_hash_value = 0x03 & this->stub_type_;
- // 5 bits for r_sym_, even if it's a invalid_index.
- size_t r_sym_hash_value = this->r_sym_ & 0x1F;
- // 5 bits for addend_.
- size_t addend_hash_value = this->addend_ & 0x1F;
- return name_hash_value
- | (stub_type_hash_value << 20)
- | (r_sym_hash_value << 22)
- | (addend_hash_value << 27);
+ return (name_hash_value
+ ^ stub_type_hash_value
+ ^ ((this->r_sym_ & 0x3fff) << 2)
+ ^ ((this->addend_ & 0xffff) << 16));
}
// Functors for STL associative containers.
struct hash
{
On Wed, Oct 15, 2014 at 9:59 AM, Cary Coutant <ccoutant@google.com> wrote:
> + return name_hash_value
> + | (stub_type_hash_value << 20)
> + | (r_sym_hash_value << 22)
> + | (addend_hash_value << 27);
>
> Please add parentheses around the whole expression, then line up the
> "|" operators just inside the open paren (standard Gnu style for
> continued expressions).
>
> My concern with the previous hash function wasn't about XOR'ing the
> additional bits into the string hash, but with the fact that, e.g.,
> r_sym, stub_type, and addend were all effectively XOR'ed together
> without any multiplicative factor to keep them from canceling one
> another out. Your new function avoids that, but drops all but the
> lower 5 bits of addend and r_sym. I'm not sure what distribution
> you're expecting, so it's not immediately clear to me how good this
> hash is. If you go back to XOR, there's no need to mask or shift away
> any of the bits of the string hash -- just do something to keep the
> three other fields from canceling. One way is to mask and shift them
> into non-overlapping positions, as you've done here, but you can keep
> more of the bits; something like this:
>
> size_t name_hash_value = gold::string_hash<char>(...);
> return (name_hash_value
> ^ stub_type_hash_value
> ^ ((r_sym & 0x3fff) << 2)
> ^ ((addend & 0xffff) << 16));
I chose to use this straightforward and faster implementation.
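For readers outside gold, the adopted scheme can be sketched as a standalone function. This is illustrative only: key_hash and its parameters are hypothetical stand-ins, and std::hash replaces gold::string_hash.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Sketch of the XOR-combining hash adopted in the patch. The fields mirror
// Reloc_stub::Key: the stub type, the local symbol index, and the addend.
std::size_t
key_hash(const std::string& name, unsigned stub_type, unsigned r_sym,
         unsigned addend)
{
  std::size_t name_hash_value = std::hash<std::string>()(name);
  std::size_t stub_type_hash_value = 0x03 & stub_type;
  // Shift r_sym and addend into (mostly) non-overlapping bit ranges so the
  // three small integer fields cannot cancel one another out under XOR,
  // while keeping all bits of the string hash.
  return (name_hash_value
          ^ stub_type_hash_value
          ^ ((r_sym & 0x3fff) << 2)
          ^ ((addend & 0xffff) << 16));
}
```

Note that swapping r_sym and addend now changes the hash, which is exactly the cancellation problem the shifts are meant to avoid.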
>
> A more robust approach would be to use a hash combine function.
> Unfortunately, I don't think STL provides one, but Boost has
> hash_combine(), and libiberty has an equivalent, iterative_hash()
> (#include "hashtab.h"). You could use iterative_hash something like
> this:
>
> hashval_t h = static_cast<hashval_t>(gold::string_hash<char>(...));
> h = iterative_hash_object(this->stub_type, h);
> h = iterative_hash_object(r_sym, h);
> h = iterative_hash_object(addend, h);
> return static_cast<size_t>(h);
>
> It's only a 32-bit hash function, but that's probably sufficient. If
> you don't want to throw away the full string_hash value, you could do
> this instead:
>
> hashval_t h = 0;
> h = iterative_hash_object(this->stub_type, h);
> h = iterative_hash_object(r_sym, h);
> h = iterative_hash_object(addend, h);
> return (gold::string_hash<char>(...)
> ^ static_cast<size_t>(h));
>
> Using iterative_hash might be noticeably slower than the simpler XOR
> strategy, so I'm OK with either.
>
> -cary
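As a concrete sketch of the hash-combine idea Cary mentions: the combiner below is the well-known Boost-style hash_combine, not libiberty's iterative_hash (whose mixing function is different and stronger), so treat it as illustrative only. Names are stand-ins, with std::hash in place of gold::string_hash.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Boost-style hash_combine: folds VALUE into SEED. The magic constant and
// the two shifts spread each new value's bits across the seed so that equal
// fields in different positions do not simply cancel, as they would with a
// plain XOR.
inline void
hash_combine(std::size_t& seed, std::size_t value)
{
  seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

std::size_t
key_hash(const std::string& name, unsigned stub_type, unsigned r_sym,
         unsigned addend)
{
  std::size_t h = std::hash<std::string>()(name);
  hash_combine(h, stub_type);
  hash_combine(h, r_sym);
  hash_combine(h, addend);
  return h;
}
```

Unlike the shift-and-mask scheme, this keeps all bits of every field, at the cost of a few extra operations per field.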
>
>
> On Mon, Oct 13, 2014 at 1:31 PM, Hán Shěn (沈涵) <shenhan@google.com> wrote:
>> Hi Cary, thanks. All done in below.
>>
>> Attached modified patch - relaxation.patch2. Also attached the diff
>> from patch2 to patch1 for ease of review.
>>
>> Thanks,
>> Han
>>
>> On Fri, Oct 10, 2014 at 4:28 PM, Cary Coutant <ccoutant@google.com> wrote:
>>>
>>> > Here we have the patch for gold aarch64 backend to support relaxation.
>>> >
>>> > In short relaxation is the linker's generation of stubs that fixes the
>>> > out-of-range jumps/branches in the original object file.
>>> >
>>> > With this implementation, we are able to link a 456MB aarch64 application
>>> > (correctness of the result file, though, hasn't been verified.)
>>>
>>> Thanks. Here are my comments...
>>>
>>> -cary
>>>
>>>
>>> + struct
>>> + {
>>> + // For a global symbol, the symbol itself.
>>> + Symbol* symbol;
>>> + } global;
>>> + struct
>>> + {
>>> + // For a local symbol, the object defining object.
>>>
>>> I know this is actually unchanged code from before, but this
>>> comment looks wrong. Should probably be "the defining object"
>>> or "the object defining the symbol".
>>
>>
>> Done
>>>
>>>
>>> + // Do not change the value of the enums, they are used to index into
>>> + // stub_insns array.
>>>
>>> In that case, you should probably assign values explicitly to each
>>> enum constant. Enum constants like this should generally be all
>>> upper case (like macros and static constants).
>>
>>
>> Done - Changed to all upper case and explicitly assigned values to them.
>>>
>>>
>>> + // Determine the stub type for a certain relocation or st_none, if no stub is
>>> + // needed.
>>> + static Stub_type
>>> + stub_type_for_reloc(unsigned int r_type, AArch64_address address,
>>> + AArch64_address target);
>>> +
>>> + public:
>>>
>>> Unnecessary: everything above this was also public.
>>
>>
>> Done
>>>
>>>
>>> + // The key class used to index the stub instance in the stub
>>> table's stub map.
>>> + class Key
>>> + {
>>> + public:
>>>
>>> "public" should be indented one space.
>>
>>
>> Done
>>>
>>>
>>> + // Return a hash value.
>>> + size_t
>>> + hash_value() const
>>> + {
>>> + return (this->stub_type_
>>> + ^ this->r_sym_
>>> + ^ gold::string_hash<char>(
>>> + (this->r_sym_ != Reloc_stub::invalid_index)
>>> + ? this->u_.relobj->name().c_str()
>>> + : this->u_.symbol->name())
>>> + ^ this->addend_);
>>>
>>> I know this is just copied from arm.cc, but it looks to me like
>>> this hash function might be a bit weak. In particular,
>>> stub_type_, r_sym_, and addend_ might easily cancel one another
>>> out, causing some preventable collisions. r_sym_ is also
>>> unnecessary (but harmless, I guess), when it's equal to
>>> invalid_index. I'd suggest at least shifting stub_type_, r_sym_,
>>> and addend_ by different amounts, or applying a different
>>> multiplier to each to distribute the bits around more, and
>>> applying r_sym_ only when it's a local symbol index.
>>
>>
>> Done by placing the name hash at bits 0-19, stub_type at 20-21, r_sym at
>> 22-26 (even if it's an invalid_index), and addend at 27-31.
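The cancellation Cary describes is easy to reproduce in isolation. weak_hash below is a hypothetical stand-in for the original arm.cc-style function, with std::hash in place of gold::string_hash; permuting the small raw fields yields identical hashes.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// The original arm.cc-style hash: a plain XOR of the raw fields. Because
// XOR is commutative and the fields are small, distinct keys whose field
// values are permutations of one another collide.
std::size_t
weak_hash(const std::string& name, unsigned stub_type, unsigned r_sym,
          unsigned addend)
{
  return stub_type ^ r_sym ^ std::hash<std::string>()(name) ^ addend;
}
```

For example, keys with fields (1, 2, 3) and (2, 1, 3) hash identically, which is the preventable-collision case the review points out.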
>>>
>>>
>>> + struct equal_to
>>> + {
>>> + bool
>>> + operator()(const Key& k1, const Key& k2) const
>>> + { return k1.eq(k2); }
>>> + };
>>> +
>>> + private:
>>>
>>> "private" should be indented one space.
>>
>>
>> Done.
>>>
>>>
>>> + // Initialize look-up tables.
>>> + Stub_table_list empty_stub_table_list(this->shnum(), NULL);
>>> + this->stub_tables_.swap(empty_stub_table_list);
>>>
>>> It's simpler to just use resize(this->shnum()). The "swap" idiom
>>> is useful when you need to release memory, but not necessary for
>>> initialization.
>>
>> Done
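A minimal illustration of Cary's point about resize() versus the swap idiom; Stub_table and make_initialized_list are hypothetical names used only for this sketch.

```cpp
#include <cstddef>
#include <vector>

// A stand-in for gold's stub table type.
struct Stub_table {};
typedef std::vector<Stub_table*> Stub_table_list;

// For initialization, resize() alone is enough: it creates SHNUM slots,
// each set to NULL. The swap-with-a-temporary idiom
// (Stub_table_list(v).swap(v)) is only needed when you want to release a
// vector's excess capacity, which does not apply here.
Stub_table_list
make_initialized_list(std::size_t shnum)
{
  Stub_table_list stub_tables;
  stub_tables.resize(shnum, NULL);
  return stub_tables;
}
```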
>>>
>>>
>>> + // Call parent to relocate sections.
>>> + Sized_relobj_file<size, big_endian>::do_relocate_sections(symtab, layout,
>>> + pshdrs, of, pviews);
>>>
>>> Indentation is off by one.
>>
>>
>> Done.
>>>
>>>
>>> + unsigned int reloc_size;
>>> + if (sh_type == elfcpp::SHT_RELA)
>>> + reloc_size = elfcpp::Elf_sizes<size>::rela_size;
>>>
>>> For REL sections, reloc_size will be left uninitialized here.
>>> If you don't expect REL sections, you should assert that.
>>
>>
>> Done.
>>>
>>>
>>> + const unsigned int shdr_size = elfcpp::Elf_sizes<size>::shdr_size;
>>> + const elfcpp::Shdr<size, big_endian> text_shdr(pshdrs + index * shdr_size);
>>> + return this->section_is_scannable(text_shdr, index,
>>> + out_sections[index], symtab);
>>>
>>> The naming doesn't really make it clear that
>>> section_needs_reloc_stub_scanning is looking at the REL/RELA
>>> section, while section_is_scannable is looking at the PROGBITS
>>> section that is being relocated. You've used text_shdr for the
>>> section header, so I'd suggest using text_shndx instead of index,
>>> and text_section_is_scannable for the function name.
>>
>>
>> Done.
>>>
>>>
>>> + // Currently this only happens for a relaxed section.
>>> + const Output_relaxed_input_section* poris =
>>> + out_sections[index]->find_relaxed_input_section(this, index);
>>>
>>> This continuation line needs indentation (4 spaces).
>>
>>
>> Done.
>>>
>>>
>>> + // Get the section contents. This does work for the case in which
>>> + // we modify the contents of an input section. We need to pass the
>>> + // output view under such circumstances.
>>>
>>> Should the comment read "This does *not* work ..."?
>>>
>>> I don't see you handling the case where you modify the contents,
>>> so maybe that last sentence is not necessary?
>>
>>
>> Done by removing the confusing last 2 sentences.
>>>
>>>
>>> +// Create a stub group for input sections from BEGIN to END. OWNER
>>> +// points to the input section to be the owner a new stub table.
>>>
>>> "OWNER points to the input section that will be the owner of the
>>> stub table."
>>
>>
>> Done.
>>>
>>>
>>> + // Currently we convert ordinary input sections into relaxed sections only
>>> + // at this point but we may want to support creating relaxed input section
>>> + // very early. So we check here to see if owner is already a relaxed
>>> + // section.
>>> + gold_assert(!owner->is_relaxed_input_section());
>>> +
>>> + The_aarch64_input_section* input_section;
>>> + if (owner->is_relaxed_input_section())
>>> + {
>>> + input_section = static_cast<The_aarch64_input_section*>(
>>> + owner->relaxed_input_section());
>>> + }
>>>
>>> Given the assert above, this code can't be reached. If you want to
>>> leave it in as a stub for the future, I'd suggest removing the above
>>> assert, and replacing the then clause with gold_unreachable().
>>
>>
>> Done by replacing gold_assert with gold_unreachable inside the if clause;
>> also updated the comments.
>>>
>>>
>>> + do
>>> + {
>>> + if (p->is_input_section() || p->is_relaxed_input_section())
>>> + {
>>> + // The stub table information for input sections live
>>> + // in their objects.
>>> + The_aarch64_relobj* aarch64_relobj =
>>> + static_cast<The_aarch64_relobj*>(p->relobj());
>>> + aarch64_relobj->set_stub_table(p->shndx(), stub_table);
>>> + }
>>> + prev_p = p++;
>>> + }
>>> + while (prev_p != end);
>>>
>>> Suggest renaming begin/end to maybe first/last? It took me a minute
>>> to understand that "end" doesn't refer to the same "end" that an
>>> iterator would test against, and that you intended this to be an
>>> inclusive range.
>>
>>
>> Done renaming.
>>>
>>>
>>> Why not just "do { ... } while (p++ != last);"?
>>
>> Done.
>>>
>>>
>>> +// Group input sections for stub generation. GROUP_SIZE is roughly the limit
>>> +// of stub groups. We grow a stub group by adding input section until the
>>> +// size is just below GROUP_SIZE. The last input section will be converted
>>> +// into a stub table. If STUB_ALWAYS_AFTER_BRANCH is false, we also add
>>> +// input section after the stub table, effectively double the group size.
>>>
>>> Do you mean "the last input section will be converted into a stub
>>> table *owner*"?
>>>
>>> And "we also add input section*s* after the stub table, effectively
>>> *doubling* the group size."
>>
>>
>> Done.
>>>
>>>
>>> + case HAS_STUB_SECTION:
>>> + // Adding this section makes the post stub-section group larger
>>> + // than GROUP_SIZE.
>>> + gold_assert(false);
>>>
>>> You can just use gold_unreachable() here.
>>>
>>> + // ET_EXEC files are valid input for --just-symbols/-R,
>>> + // and we treat them as relocatable objects.
>>>
>>> For --just-symbols, shouldn't you use the base implementation
>>> of do_make_elf_object?
>>
>>
>> Done by invoking parent implementation.
>>>
>>>
>>> +// This function scans a relocation sections for stub generation.
>>>
>>> s/sections/section/
>>
>>
>> Done.
>>>
>>>
>>> + // An error should never aroused in the above step. If so, please
>>> + // An error should never arouse, it is an "_NC" relocation.
>>>
>>> "arise".
>>>
>>> + gold_error(_("Stub is too far away, try use a smaller value "
>>> + "for '--stub-group-size'. For example, 0x2000000."));
>>>
>>> Either "try using" or "try to use" or just "try a smaller value".
>>
>>
>> Done.
>>
>>
>>
>> --
>> Han Shen | Software Engineer | shenhan@google.com | +1-650-440-3330
--
Han Shen | Software Engineer | shenhan@google.com | +1-650-440-3330
-------------- next part --------------
A non-text attachment was scrubbed...
Name: relaxation.patch3
Type: application/octet-stream
Size: 99972 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20141015/cdaa04f1/attachment.obj>